Topic: Suggestion for an alternative pointer/handle/memory concept


Author: pabloh@hpwala.wal.hp.com (Pablo Halpern )
Date: 29 Sep 92 17:50:41 GMT
In article <1992Sep24.191342.15458@cadsun.corp.mot.com>, shang@corp.mot.com (David (Lujun) Shang) writes:
|> My question is this: is it possible to write a class for a special
|> memory reference protocol that can be used in exactly the same way
|> as a standard pointer is used in C++?
|>
|> With language-supported pointers, we can write:
|>
|>    BaseClass * bp;
|>    DerivedClass * dp;
|>    ...
|>    bp = dp = new DerivedClass(...);
|>    bp->SomeVirtualFunction(...);
|>    ...
|>
|> "BaseClass*" can be viewed as a type of pointer<BaseClass>, and
|> "DerivedClass*" as pointer<DerivedClass>, where pointer is a
|> standard memory reference.
|>
|> Now consider the MS Windows memory protocol. Is it possible
|> to write code like the following:
|>
|>   HANDLE<BaseClass> bh;
|>   HANDLE<DerivedClass> dh;
|>   ...
|>   bh = dh = new DerivedClass(...);
|>   bh->SomeVirtualFunctions(...);
|>
|> Further, is it possible to write code that is independent of any
|> particular memory reference protocol? Like:
|>
|>   REF<BaseClass> bref;
|>   REF<DerivedClass> dref;
|>   ...
|>   bref = dref = new DerivedClass(...);
|>   bref->SomeVirtualFunctions(...);

I do not have a complete solution, but here are some ideas:

Implement a base template class with a pointer-like interface:

template <class T>
class INDIRECT
{
public:
  INDIRECT();
  INDIRECT(const INDIRECT &);
  virtual ~INDIRECT();
  virtual T& operator *();
  virtual T* operator ->() { return & this->operator*(); }
  virtual INDIRECT& New(params);
  virtual void Delete();

  // Assignment and equality operators, etc. ...
};

Then derive a class for each special kind of indirection.

template <class T>
class SMART_POINTER : public INDIRECT<T>
{
  ...
};

template <class T>
class HANDLE : public INDIRECT<T>
{
  ...
};


Instead of using operator new(), you call the New() member function on the
indirect object:

  i.New(args);     instead of    i = new T(args);

One of the biggest disadvantages of this approach is that the New()
function must take the same parameters regardless of the type of T.  This
can be worked around as follows:

template <class T>
class INDIRECT
{
  ...
  virtual INDIRECT& New(const T&);
  ...
};

HANDLE<String> hs;
hs.New(String("abc"));

This is, of course, very kludgy and could result in undesirable side
effects, since a String constructor runs twice: once for the temporary
and once for the copy made inside New().  I told you it
wasn't a complete solution.  Just some food for thought.
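
A minimal sketch of the New(const T&) workaround, to make the double
construction visible. The names Indirect, Counted and
constructions_for_one_new are hypothetical stand-ins, and the class is
cut down to the few members needed here; it is not a full INDIRECT.

```cpp
#include <cassert>

// Hypothetical payload type that counts how often it is constructed.
struct Counted {
    static int constructions;
    int value;
    explicit Counted(int v) : value(v) { ++constructions; }
    Counted(const Counted& other) : value(other.value) { ++constructions; }
};
int Counted::constructions = 0;

// Simplified stand-in for the INDIRECT base class sketched above.
template <class T>
class Indirect {
public:
    Indirect() : p_(0) {}
    virtual ~Indirect() { delete p_; }
    virtual T& operator*() { return *p_; }
    virtual T* operator->() { return &this->operator*(); }
    // New() takes a prototype object and copy-constructs the managed one.
    virtual Indirect& New(const T& prototype) {
        Delete();
        p_ = new T(prototype);
        return *this;
    }
    virtual void Delete() { delete p_; p_ = 0; }
private:
    Indirect(const Indirect&);             // copying omitted in this sketch
    Indirect& operator=(const Indirect&);
    T* p_;
};

// Returns the number of Counted constructions caused by one New() call.
int constructions_for_one_new() {
    Counted::constructions = 0;
    Indirect<Counted> h;
    h.New(Counted(42));      // one temporary, plus one copy inside New()
    assert(h->value == 42);
    return Counted::constructions;
}
```

Under these assumptions, a single New() call constructs two Counted
objects, which is the side effect the post warns about.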



- Pablo

------------------------------------------------------------------------
Pablo Halpern             (617) 290-3542
HP Waltham                pabloh@hpwarq.wal.hp.com

I am self-employed, so my opinions *do* reflect those of my employer.
However, they may not reflect the opinions of my client.
------------------------------------------------------------------------




Author: shang@corp.mot.com (David (Lujun) Shang)
Date: Thu, 24 Sep 92 19:13:42 GMT
In article <1992Sep18.012044.28877@ucc.su.OZ.AU> maxtal@extro.ucc.su.OZ.AU
(John MAX Skaller) writes:
> In article <RUJO.92Sep16210421@ulrik.uio.no> rujo@ulrik.uio.no (Rune
> J|rgensen) writes:
> >
> >SUGGESTED SOLUTION:
> >
> >Implement the handle to pointer mapping at processor level. (As well
> >as handle allocation/free/management in general).
> >
>
>  haven't you heard of the 80386? It does exactly what you want.
> You can resize and move objects without changing the pointers.
> They can be paged to disk by virtual memory. All transparent.
> Only problem is the limit on the number of objects (max 8K segments)
> <grin>
>
> Windows 3.0 does all this (except resizing) automatically and
> transparently, you can just set operator new to do a global alloc
> and then a global lock to get a pointer. Global memory is still
> compacted on the fly without changing the pointers, and your
> object can be paged in and out of memory automatically too.
>

I'd like to steer this topic toward a language design issue.

My question is this: is it possible to write a class for a special
memory reference protocol that can be used in exactly the same way
as a standard pointer is used in C++?

With language-supported pointers, we can write:

   BaseClass * bp;
   DerivedClass * dp;
   ...
   bp = dp = new DerivedClass(...);
   bp->SomeVirtualFunction(...);
   ...

"BaseClass*" can be viewed as a type of pointer<BaseClass>, and
"DerivedClass*" as pointer<DerivedClass>, where pointer is a
standard memory reference.

Now consider the MS Windows memory protocol. Is it possible
to write code like the following:

  HANDLE<BaseClass> bh;
  HANDLE<DerivedClass> dh;
  ...
  bh = dh = new DerivedClass(...);
  bh->SomeVirtualFunctions(...);

Further, is it possible to write code that is independent of any
particular memory reference protocol? Like:

  REF<BaseClass> bref;
  REF<DerivedClass> dref;
  ...
  bref = dref = new DerivedClass(...);
  bref->SomeVirtualFunctions(...);

If I override virtual functions like storage_read, write, open, and
close, I can implement my own storage protocols such as persistent
storage, recoverable storage, atomic storage, etc. The object
protocol is INDEPENDENT of the storage in which the object is contained.

I have failed to reach this goal in C++. There are three problems:

1. C++ does not support specialization, thus REF<DerivedClass>
   cannot be a specialized class of REF<BaseClass>;
2. A template is not a real class, thus declaring a variable of
   an uninstantiated template type is impossible;
3. Operator new has no link to a generic reference protocol.
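
For what it is worth, problem 1 can be approximated in later versions of
C++ that support member templates, which were not available when this was
posted. The sketch below is only an illustration under that assumption:
Ref, Base, Derived and demo_ref_conversion are hypothetical names, the
wrapper is non-owning, and problems 2 and 3 are not addressed.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() {}
    virtual int id() const { return 1; }
};
struct Derived : Base {
    virtual int id() const { return 2; }
};

// Hypothetical non-owning reference wrapper.
template <class T>
class Ref {
public:
    Ref() : p_(0) {}
    Ref(T* p) : p_(p) {}
    // Member template: accepts Ref<U> whenever U* converts to T*.
    template <class U>
    Ref(const Ref<U>& other) : p_(other.get()) {}
    T* operator->() const { return p_; }
    T& operator*() const { return *p_; }
    T* get() const { return p_; }
private:
    T* p_;
};

// Mirrors "bref = dref = new DerivedClass(...)" with a stack object.
int demo_ref_conversion() {
    Derived d;
    Ref<Base> bref;
    Ref<Derived> dref;
    bref = dref = &d;     // Derived* -> Ref<Derived> -> Ref<Base>
    return bref->id();    // virtual dispatch through the wrapper
}
```

Ref<Derived> is still not a subclass of Ref<Base> here; the conversion
is by value, so a Ref<Derived>& cannot be passed where a Ref<Base>& is
expected.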

David Shang






Author: rujo@ulrik.uio.no (Rune J|rgensen)
Date: Wed, 16 Sep 1992 20:04:21 GMT
If concepts already exist that solve the described
problem well, then I would like to get to know about them.

If no such concepts exist, then some feedback on
my suggestion would be welcome.



PROBLEM:

-  The very core of the problem is that a model of a linear
   memory space with a fixed size of memory between two addresses is
   a poor model to use when you implement an (object) memory
   management system. (Memory objects have an inconvenient
   habit of changing size).

When you want to implement fast memory-resident
object handling / databases, you will need a very efficient
basic (object) memory management system. (Valid for many C
applications and indeed for C++ applications).

One classic problem occurs when memory objects die or change size:
memory fragmentation.

Another problem is reference to other objects: if done directly by
pointers, then you cannot move memory objects, because references
to the memory objects will no longer be valid.

You may solve this by assigning a handle to each object and accessing
each object by the pointer function ptr(handle). A field of a record of
type REC would be accessed in C by:

  ((REC *)ptr(handle))->field = ...;

(Using functions like put(handle, &rec) and get(handle, &rec)
requires a memory move at each operation and is prone to synchronization
problems. E.g.:

  f1()
  {
    get(handle,&rec);
      :
      :
    put(handle,&rec);
  }

  f2()
  {
    get(handle,&rec);
    f1();
      :
      :
    put(handle,&rec);
  }

...the changes in f1() will be ignored.)
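
A runnable sketch of the lost update described above. The store behind
get()/put() is a hypothetical std::map, and the Rec fields a and b are
made up for the illustration.

```cpp
#include <cassert>
#include <map>

struct Rec { int a; int b; };

// Hypothetical copy-in/copy-out store standing in for get()/put().
static std::map<int, Rec> g_store;

void get(int handle, Rec* out) { *out = g_store[handle]; }
void put(int handle, const Rec* in) { g_store[handle] = *in; }

void f1(int handle) {
    Rec rec;
    get(handle, &rec);
    rec.a = 100;           // f1's change
    put(handle, &rec);
}

void f2(int handle) {
    Rec rec;
    get(handle, &rec);     // snapshot taken before f1 runs
    f1(handle);
    rec.b = 200;           // f2's change, made on the stale snapshot
    put(handle, &rec);     // overwrites f1's update to a
}

// Returns the final value of field a, which f1 tried to set to 100.
int demo_lost_update() {
    Rec init = { 1, 2 };
    g_store[7] = init;
    f2(7);
    return g_store[7].a;
}
```

After f2() returns, field a holds its initial value again, because
f2's put() wrote back the copy it made before calling f1().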


The problem with implementing a handle-based system in software,
and using the p = ptr(handle) mechanism, is that when you do
'garbage collection' or 'memory compaction' you move objects, and
the values of pointers like p become invalid. A pointer obtained by
p = ptr(handle) has only a limited life-span.

Alternatively you can spread pairs of p = lock(handle) and
unlock(handle) around your code. If you use too few, you might
miss a pointer; if you use too many, memory compaction may not work
because lock/unlock imposes unmovable areas.

Anyway you have to be careful when using pointers derived from handles.
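
Both effects can be shown with a toy software handle table. This is only
a sketch under simplifying assumptions: HandleHeap and its members are
hypothetical, allocation is a simple bump allocator, and handles are never
reused.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical toy heap: blocks live in one arena and may be slid down
// by compact(); a handle is an index into a table of offsets.
class HandleHeap {
public:
    typedef std::size_t Handle;

    explicit HandleHeap(std::size_t bytes) : arena_(bytes), used_(0) {}

    Handle alloc(std::size_t size) {
        assert(used_ + size <= arena_.size());
        Entry e = { used_, size, 0, true };
        used_ += size;
        table_.push_back(e);
        return table_.size() - 1;
    }
    void release(Handle h) { table_[h].live = false; }

    // ptr(handle): the result is only meaningful until the next compact().
    char* ptr(Handle h) { return &arena_[table_[h].offset]; }

    char* lock(Handle h)   { ++table_[h].locks; return ptr(h); }
    void  unlock(Handle h) { --table_[h].locks; }

    // Slide live, unlocked blocks toward the start of the arena.
    void compact() {
        std::size_t next = 0;
        for (std::size_t i = 0; i < table_.size(); ++i) {
            Entry& e = table_[i];
            if (!e.live) continue;
            if (e.locks > 0) {           // pinned: stays put, gap remains
                next = e.offset + e.size;
                continue;
            }
            if (next < e.offset) {
                std::memmove(&arena_[next], &arena_[e.offset], e.size);
                e.offset = next;
            }
            next += e.size;
        }
        used_ = next;
    }

private:
    struct Entry { std::size_t offset; std::size_t size; int locks; bool live; };
    std::vector<char> arena_;
    std::size_t used_;
    std::vector<Entry> table_;
};

// Returns true if the block behind handle c moved during compaction.
bool moved_after_compact(bool hold_lock) {
    HandleHeap heap(64);
    HandleHeap::Handle a = heap.alloc(16);
    HandleHeap::Handle b = heap.alloc(16);
    HandleHeap::Handle c = heap.alloc(16);
    std::strcpy(heap.ptr(c), "payload");
    char* before = heap.ptr(c);     // limited life-span pointer
    if (hold_lock) heap.lock(b);    // pins b, which also blocks c
    heap.release(a);
    heap.compact();
    assert(std::strcmp(heap.ptr(c), "payload") == 0);  // data follows the handle
    if (hold_lock) heap.unlock(b);
    return heap.ptr(c) != before;
}
```

Without the lock, the pointer obtained before compact() no longer
refers to the block; with the lock held on b, nothing can be compacted
even though the first block is free.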


SUGGESTED SOLUTION:

Implement the handle to pointer mapping at processor level. (As well
as handle allocation/free/management in general).

A similar mapping is already being done when implementing paging;
a virtual pointer vp in your program is mapped to a physical pointer p
which points into a 'buffer' where the correct page is loaded. Your program,
however, never sees anything other than vp, which remains the same
no matter where in 'physical memory' your data is.

I would just like to see this concept taken somewhat further, involving
these two steps (for a basic implementation for C):


1) INTRODUCE A NEW POINTER TYPE CALLED E.G. MOP (MEMORY_OBJECT_POINTER)

The syntax could be as with the usage of __far or __near keywords:

typedef struct tagREC
{
  int i1;
  int i2;
  struct tagREC MOP *mopNext;  /* Reference to another REC. */
   :
   :
}
REC;

char     *pChar;     /* Regular pointer */
char MOP *mopChar;   /* MOP pointers */
REC  MOP *mopRec1;
REC  MOP *mopRec2;

  mopChar = mopAlloc(sizeof(char) * (n + 1));  /* Allocate a string. */
  mopRec1 = mopAlloc(sizeof(REC));             /* Allocate records. */
  mopRec2 = mopAlloc(sizeof(REC));

A MOP would have the same features as regular pointers except:

  o) A MOP would always identify a memory object, and its size would be
     known and could be derived at run-time. The size would be known at a
     'low level' (processor's mapping tables).

  o) Within one memory object, pointer arithmetic would work as
     with regular pointers. E.g.:
     ((char MOP *)&mopRec1->i2 - (char MOP *)&mopRec1->i1) == sizeof(int)

  o) Pointer arithmetic between MOPs would make no sense
     except for == and !=. (E.g. mopRec2 - mopRec1 tells you nothing.)
     mopRec1++ is meaningless.

  o) Arithmetic between regular pointers and MOPs would be
     meaningless. Assignments (pChar = mopChar or mopChar = pChar) would
     NOT be legal. (*pChar = *mopChar and *mopChar = *pChar would
     of course be legal.)

  o) I'm not sure of the interpretation of &mopRec1->i2 when e.g. passed
     as an argument to a function, but it would have to be recognised
     at a low level that this is a reference within the memory object
     identified by mopRec1.

Whenever fragmentation requires reorganizing memory objects,
the MOPs would still be valid, because they are mapped at a lower level
anyhow (EACH TIME A REFERENCE IS MADE).
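
The intended MOP semantics can be approximated in software with a C++
class template. This is a sketch only: Mop, g_mo_table, mopRelocate and
mopFree are hypothetical, the lookup is done by a table in the program
rather than by the processor, and only plain-data types are supported.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

struct MoEntry { void* addr; std::size_t size; };
static std::vector<MoEntry> g_mo_table;   // hypothetical mapping table

template <class T>
class Mop {
public:
    explicit Mop(std::size_t id) : id_(id) {}
    // The address is looked up on every access, never cached.
    T* operator->() const { return static_cast<T*>(g_mo_table[id_].addr); }
    T& operator*() const { return *operator->(); }
    // The object size is known at run time.
    std::size_t size() const { return g_mo_table[id_].size; }
    // Only == and != are provided; there is no ++, -- or subtraction.
    bool operator==(const Mop& o) const { return id_ == o.id_; }
    bool operator!=(const Mop& o) const { return id_ != o.id_; }
    std::size_t id() const { return id_; }
private:
    std::size_t id_;
};

template <class T>
Mop<T> mopAlloc() {
    MoEntry e = { std::calloc(1, sizeof(T)), sizeof(T) };
    assert(e.addr != 0);
    g_mo_table.push_back(e);
    return Mop<T>(g_mo_table.size() - 1);
}

// Move the object to fresh storage, as a compactor might.
void mopRelocate(std::size_t id) {
    MoEntry& e = g_mo_table[id];
    void* fresh = std::malloc(e.size);
    assert(fresh != 0);
    std::memcpy(fresh, e.addr, e.size);
    std::free(e.addr);
    e.addr = fresh;
}

void mopFree(std::size_t id) {
    std::free(g_mo_table[id].addr);
    g_mo_table[id].addr = 0;
}

struct Rec { int i1; int i2; };

int demo_mop() {
    Mop<Rec> r = mopAlloc<Rec>();
    Mop<Rec> other = mopAlloc<Rec>();
    r->i1 = 10;
    r->i2 = 20;
    mopRelocate(r.id());            // the object moves in memory
    assert(r != other);
    assert(r.size() == sizeof(Rec));
    int sum = r->i1 + r->i2;        // the MOP is still valid after the move
    mopFree(r.id());
    mopFree(other.id());
    return sum;
}
```

Because the class defines no arithmetic operators, an expression such
as r++ or r - other is rejected at compile time, which matches the rule
that arithmetic between MOPs is meaningless.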


2) DISTINGUISH BETWEEN PERMANENT AND TEMPORARY MOs

Let's say one bit of the MOP pointer was reserved for telling whether
it was a temporary or permanent memory object. You would have two
functions for allocating:

  void MOP *mopAlloc(size_t size)

Returning MOPs of temporary memory objects.

and:

  void MOP *mopAllocPermanent(void MOP *mopRequested, size_t size)

If mopRequested != NULL then this is a request to allocate an MO using a
known MOP value. If an MO with this MOP as its id is already allocated,
then this function returns NULL (failure), else mopRequested.

If mopRequested == NULL then any MOP value is accepted. The function
returns the MOP of a free permanent memory object.

With this concept at hand, you may now save your workspace of permanent
memory objects to a file, along with a table of the corresponding MOPs.
Running your application later, you can load your MOP table, allocate
space with mopAllocPermanent, and load your data from file into
the allocated space. AND ANY REFERENCE IN ONE PERMANENT MEMORY OBJECT
(USING A MOP VALUE) TO ANY OTHER PERMANENT MEMORY OBJECT
WILL STILL BE VALID!!!
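
A small software simulation of this save and reload cycle. Everything
apart from the mopAllocPermanent contract described above is an
assumption: MOP values are modelled as integer ids with 0 in the role of
NULL, the "file" is an in-memory copy, and Node is a made-up record type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>
#include <vector>

typedef unsigned long MopId;   // hypothetical: 0 plays the role of NULL
typedef std::map<MopId, std::vector<char> > MoStore;
static MoStore g_permanent;
static MopId g_next_id = 1;

// Returns 0 if the requested id is already allocated, the requested id
// if it is free, or any free id when requested == 0.
MopId mopAllocPermanent(MopId requested, std::size_t size) {
    if (requested == 0) {
        while (g_permanent.count(g_next_id)) ++g_next_id;
        requested = g_next_id++;
    } else if (g_permanent.count(requested)) {
        return 0;
    }
    g_permanent[requested].assign(size, '\0');
    return requested;
}

// next stores a MOP id, not a machine address.
struct Node { int value; MopId next; };

void storeNode(MopId id, const Node& n) {
    std::memcpy(&g_permanent[id][0], &n, sizeof n);
}
Node loadNode(MopId id) {
    Node n;
    std::memcpy(&n, &g_permanent[id][0], sizeof n);
    return n;
}

int demo_reload() {
    MopId a = mopAllocPermanent(0, sizeof(Node));
    MopId b = mopAllocPermanent(0, sizeof(Node));
    Node na = { 1, b };
    Node nb = { 2, 0 };
    storeNode(a, na);
    storeNode(b, nb);

    MoStore saved = g_permanent;   // stands in for "save to file"
    g_permanent.clear();           // stands in for a later run
    g_next_id = 1;

    // Re-allocate every object under its old MOP value, then load its data.
    for (MoStore::const_iterator it = saved.begin(); it != saved.end(); ++it) {
        MopId got = mopAllocPermanent(it->first, it->second.size());
        assert(got == it->first);
        g_permanent[got] = it->second;
    }
    assert(mopAllocPermanent(a, sizeof(Node)) == 0);  // id already in use

    return loadNode(loadNode(a).next).value;  // follow the stored reference
}
```

The reference stored in the first node still leads to the second node
after the reload, because the objects were re-created under the same MOP
values.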


CONCLUSION
Wouldn't this concept allow for implementation of faster and more
convenient (object) memory management?


-Rune Jorgensen
 rujo@ulrik.uio.no


P.S. 'Memory object' or 'object memory' is the same thing in this text.

     The term 'MOP' could be 'MO Handle' if that is better.






Author: diamond@jit345.bad.jit.dec.com (Norman Diamond)
Date: Thu, 17 Sep 1992 02:25:00 GMT
In article <RUJO.92Sep16210421@ulrik.uio.no> rujo@ulrik.uio.no (Rune J|rgensen) writes:
>PROBLEM:
>-  The very core of the problem is that a model of a linear
>   memory space with a fixed size of memory between two addresses is
>   a poor model to use when you implement an (object) memory
>   management system.

Thus it is fortunate that C and C++ do not require a linear memory space.
Only the memory space internal to each object must be capable of being
modeled as linear.

>One classic problem occurs when memory objects die or change size:
>memory fragmentation.

ANSI C does not require implementations to do garbage collection, but neither
does it prohibit it.  And in C++, some programmers have been known to create
smart pointers and garbage collectors and even have them co-operate.

>Another problem is reference to other objects: if done directly by
>pointers, then you cannot move memory objects, because references
>to the memory objects will no longer be valid.

If an implementation uses simple raw addresses for pointers and then moves
objects around, then it's a pretty bad (and non-conforming) implementation.
A good implementation is perfectly capable of using handles for pointers
and doing everything else you want.

>1) INTRODUCE A NEW POINTER TYPE CALLED E.G. MOP (MEMORY_OBJECT_POINTER)
>The syntax could be as with the usage of __far or __near keywords:

MOP is in the application's address space.  If you want to add a keyword
called __MOP or _MOP or __mop (not _mop though), no problem.  Or you might
want to make this the default, and add a keyword __raw for efficiency (for
access to things that are presumed not to move, or to let the programmer
worry about what happens if they move).

>     The term 'MOP' could be 'MO Handle' if that is better.

I think that would be better, since the word "handle" is already popular.
--
Norman Diamond       diamond@jit081.enet.dec.com
If this were the company's opinion, I wouldn't be allowed to post it.
"Yeah -- bad wiring.  That was probably it.  Very bad."




Author: maxtal@extro.ucc.su.OZ.AU (John MAX Skaller)
Date: Fri, 18 Sep 1992 01:20:44 GMT
In article <RUJO.92Sep16210421@ulrik.uio.no> rujo@ulrik.uio.no (Rune J|rgensen) writes:
>
>If concepts already exist that solve the described
>problem well, then I would like to get to know about them.
>
>If no such concepts exist, then some feedback on
>my suggestion would be welcome.
>
>
>
>PROBLEM:
>
>-  The very core of the problem is that a model of a linear
>   memory space with a fixed size of memory between two addresses is
>   a poor model to use when you implement an (object) memory
>   management system. (Memory objects have an inconvenient
>   habit of changing size).
>
>When you want to implement fast memory-resident
>object handling / databases, you will need a very efficient
>basic (object) memory management system. (Valid for many C
>applications and indeed for C++ applications).
>
>One classic problem occurs when memory objects die or change size:
>memory fragmentation.
>
>Another problem is reference to other objects: if done directly by
>pointers, then you cannot move memory objects, because references
>to the memory objects will no longer be valid.
>

[Use handles]

>
>SUGGESTED SOLUTION:
>
>Implement the handle to pointer mapping at processor level. (As well
>as handle allocation/free/management in general).
>

 haven't you heard of the 80386? It does exactly what you want.
You can resize and move objects without changing the pointers.
They can be paged to disk by virtual memory. All transparent.
Only problem is the limit on the number of objects (max 8K segments)
<grin>

Windows 3.0 does all this (except resizing) automatically and
transparently, you can just set operator new to do a global alloc
and then a global lock to get a pointer. Global memory is still
compacted on the fly without changing the pointers, and your
object can be paged in and out of memory automatically too.

--
;----------------------------------------------------------------------
        JOHN (MAX) SKALLER,         maxtal@extro.ucc.su.oz.au
 Maxtal Pty Ltd, 6 MacKay St ASHFIELD, NSW 2131, AUSTRALIA
;--------------- SCIENTIFIC AND ENGINEERING SOFTWARE ------------------