Topic: standard libraries?
Author: fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON)
Date: Thu, 13 May 1993 08:31:55 GMT
jimad@microsoft.com (Jim Adcock) writes:
>fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON) writes:
>|What about method 3?
>| Foo *p;
>| ...
>| write(fd, &p, sizeof p);
>| ...
>| read(fd, &p, sizeof p);
>
>What about it? Neither "read" nor "write" are described in the ANSI/ISO
>C language spec, nor in ARM, thus neither are a C++ language standards
>issue. Rather, with respect to the C++ language, these functions are
>simply implementation issues that can be handled in a manner of the
>implementation's choosing -- if the implementation chooses to support them
>at all. Clearly there is an obvious implementation choice to handle
>these implementation issues -- namely the implementation could choose
>not to move objects, and thus "method 3" is neither a C++ language
>issue nor a C++/GC language issue.
OK, I'm sorry, that should have been "fread" and "fwrite".
Also I didn't make it clear that p gets modified in the mean time.
Let me give a better example.
FILE *f = fopen("/tmp/x","w+b");
Foo *p = new Foo;
fwrite(&p, sizeof p, 1, f);
p = 0;
...
// assume that a garbage collection occurs here
...
rewind(f);
fread(&p, sizeof p, 1, f);
p->foo(); // *p had better still be there
Now unless I am mistaken, this is legal C++ according to the current
definition of the language. But requiring this sort of code to work
would make even conservative garbage collection impossible.
Thus to make conservative garbage collection possible in a standard
conforming implementation would require changing the standard to
ensure that the above sort of code is illegal.
I think that making such a change would be a good idea, but I wish that
GC advocates would stop pretending that it doesn't require any changes
to the C++ standard, or that it doesn't require the corresponding changes
to C++ semantics.
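
A self-contained version of the example above may make the round trip easier to follow. This is only a sketch: `hide_and_recover`, `demo_hide_and_recover`, and the body of `Foo` are hypothetical names added for illustration, and `std::tmpfile()` stands in for the "/tmp/x" file. No garbage collector is involved here, so the pointer trivially survives; the point is only that the sole copy of the address can live in a file, where a collector scanning memory would not find it.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical class for illustration; the original post does not define Foo.
struct Foo {
    int value = 42;
    int foo() const { return value; }
};

// Writes the bytes of a pointer to a temporary file, clears the local copy,
// then reads the bytes back. Returns nullptr if any I/O step fails.
Foo *hide_and_recover(Foo *object) {
    std::FILE *f = std::tmpfile();          // opened for update ("wb+")
    if (!f) return nullptr;

    Foo *p = object;
    bool ok = std::fwrite(&p, sizeof p, 1, f) == 1;
    p = nullptr;                            // local copy of the address is gone

    // In the original example a collection is assumed to happen here.
    // (In this sketch the caller still holds the address, so nothing is lost.)

    std::rewind(f);                         // reposition before switching to input
    ok = ok && std::fread(&p, sizeof p, 1, f) == 1;
    std::fclose(f);
    return ok ? p : nullptr;
}

void demo_hide_and_recover() {
    Foo *original = new Foo;
    Foo *recovered = hide_and_recover(original);
    assert(recovered == original);          // same address came back
    assert(recovered->foo() == 42);         // *p is still there
    delete original;
}
```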
--
Fergus Henderson This .signature virus might be
fjh@munta.cs.mu.OZ.AU getting old, but you still can't
consistently believe it unless you
Linux: Choice of a GNU Generation copy it to your own .signature file!
Author: jimad@microsoft.com (Jim Adcock)
Date: 10 May 93 17:33:24 GMT
Author: fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON)
Date: Tue, 11 May 1993 07:28:20 GMT
jimad@microsoft.com (Jim Adcock) writes:
>From article <1993May7.192435.8391@csrd.uiuc.edu>, by harrison@sp1.csrd.uiuc.edu (Luddy Harrison):
>| I'm somewhat puzzled by the absence of mention, in these discussions, of
>| the semantics of garbage collection. A copying garbage collector relocates
>| (that is, it changes the address of) a live object. This is a violation
>| of the semantics of C and C++ as far as I am aware. For example, if I write
>| the address of an object to secondary storage and later retrieve it,
>| I can expect the retrieved address to continue to point to the original object
>
>There are basically two approaches you could take to reading/writing such
>an address. 1) You can convert it to/from an integral type, and write/read
>that -- except that such an integral and back conversion is not guaranteed
>to exist, but rather is implementation dependent. 2) You can convert
>it to a void* and use the %p formatter to write/read it back within
>the same program execution. In which case the results are guaranteed to
>be implementation dependent *except* for the fact that the original void* and
>the read-back void* are guaranteed to compare equal.
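
Approach 2 in the quoted text can be sketched as follows. The helper names `round_trip_through_text` and `demo_round_trip` are hypothetical, and a character buffer stands in for the file; the only property relied on is the one described above, that a `void*` printed with `%p` and read back with `%p` during the same program execution compares equal to the original.

```cpp
#include <cassert>
#include <cstdio>

// Formats a pointer with %p and parses the text back with %p.
// Returns nullptr if either step fails.
void *round_trip_through_text(void *original) {
    char buffer[64];
    int written = std::snprintf(buffer, sizeof buffer, "%p", original);
    if (written < 0 || written >= static_cast<int>(sizeof buffer))
        return nullptr;                     // formatting failed or was truncated

    void *parsed = nullptr;
    if (std::sscanf(buffer, "%p", &parsed) != 1)
        return nullptr;                     // text could not be parsed back
    return parsed;
}

void demo_round_trip() {
    int object = 0;
    void *before = &object;
    void *after = round_trip_through_text(before);
    assert(after == before);                // the read-back pointer compares equal
}
```

The text form of the pointer is implementation dependent, so the sketch never inspects `buffer` itself.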
What about method 3?
Foo *p;
...
write(fd, &p, sizeof p);
...
read(fd, &p, sizeof p);
--
Fergus Henderson This .signature virus might be
fjh@munta.cs.mu.OZ.AU getting old, but you still can't
consistently believe it unless you
Linux: Choice of a GNU Generation copy it to your own .signature file!
Author: db@argon.Eng.Sun.COM (David Brownell)
Date: 11 May 1993 09:10:12 GMT
steve@taumet.com (Steve Clamage) writes:
> >As a simple example of what I'm talking about: imagine two threads
> >execute the following line of code at "the same" time (i.e. on a
> >multiprocessor, or just interleaved due to scheduling):
>
> > stream << "this" << " is " << "a bug";
>
> >No way to guarantee that the output isn't interleaved, ...
>
> I don't believe this can be addressed in the C++ specification.
> For example, you have the identical problem in C:
>
> fprintf(stream, "%s", "this");
> fprintf(stream, "%s", " is ");
> fprintf(stream, "%s", "a bug");
Posix 1003.4a, last draft I looked at, included stdio functions to
address this. If both threads executed concurrently, only one would get
the lock at a time:
flockfile (stream);
fputs ("this", stream);
fputs (" is ", stream);
fputs ("NOT a bug", stream);
funlockfile (stream);
... where I swapped "fprintf" with "fputs" just to emphasize this works
for the entire stdio library. The point being that this problem DOES
have a known solution (all but standard, too! :-).
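
As a compilable sketch of the fragment above, assuming a POSIX system where `flockfile` and `funlockfile` are declared in `<stdio.h>`: the function names `write_message` and `demo_write_message` are hypothetical, and a temporary file stands in for the shared stream. The demo runs in a single thread, so it only checks that the three pieces arrive in order; the lock matters when other threads take the same stream lock.

```cpp
#include <cassert>
#include <cstring>
#include <stdio.h>   // flockfile, funlockfile (POSIX)

// Emits the three pieces as one unit with respect to any other thread
// that also brackets its output with flockfile/funlockfile.
void write_message(FILE *stream) {
    flockfile(stream);
    fputs("this", stream);
    fputs(" is ", stream);
    fputs("NOT a bug", stream);
    funlockfile(stream);
}

void demo_write_message() {
    FILE *stream = tmpfile();
    assert(stream != nullptr);
    write_message(stream);

    rewind(stream);                         // switch from writing to reading
    char line[64] = {0};
    char *got = fgets(line, sizeof line, stream);
    assert(got != nullptr);
    assert(std::strcmp(line, "this is NOT a bug") == 0);
    fclose(stream);
}
```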
It might be desirable to see such MT issues addressed as an addendum to a
C++ library specification in the way the MT issues were for C. However,
I think the issues are more complex for C++ due to the complexity of
the interfaces exposed; if I'm subclassing a library class, what's the
locking strategy I must use in order to be correct and not deadlock?
Which virtual functions are allowed (expected) to hold which locks?
I'm told about C++ libraries that have callback structures so twisted
that answering those questions isn't practical.
> The larger problem is what happens when you flush the buffer used
> by the C++ stream or the C FILE. How do you control interleaving
> data from different tasks when a buffer flush might come at an
> arbitrary point in doing output? That is, the buffer is flushed
> while you are in the middle of writing one item.
Hmmm ... NOT a multithreading issue, but a general issue with having
multiple processes share the same underlying OS file descriptor. It
was once so common with terminals that the convention arose that all
terminal I/O should be line-buffered!
> Next, what happens if one thread seeks on a stream and another thread
> uses the same stream? How does it find out that the stream moved
> between two of its statements?
My model of "stream" doesn't include seeking, though my model for file
descriptors does. Certainly shared resources such as file offsets
either need to be managed as such, or made nonissues in an MT world the
way pwrite() and pread() take explicitly the offsets that read() and
write() took implicitly (and then modified).
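
A small sketch of the pread()/pwrite() point, assuming a POSIX system: each call names its own offset, and the file offset shared through the descriptor is left alone. `demo_positioned_io` is a hypothetical name, and a temporary file stands in for whatever descriptor the threads would share.

```cpp
#include <cassert>
#include <cstring>
#include <stdio.h>    // tmpfile, fileno (POSIX)
#include <unistd.h>   // pread, pwrite, lseek (POSIX)

void demo_positioned_io() {
    FILE *f = tmpfile();
    assert(f != nullptr);
    int fd = fileno(f);

    // Each write carries its own offset, so the order of the calls
    // (or which thread makes them) does not decide where the data lands.
    ssize_t wrote_second = pwrite(fd, "BBBB", 4, 4);
    ssize_t wrote_first = pwrite(fd, "AAAA", 4, 0);
    assert(wrote_second == 4 && wrote_first == 4);

    char buf[5] = {0};
    ssize_t got = pread(fd, buf, 4, 4);     // read bytes 4..7 explicitly
    assert(got == 4);
    assert(std::strcmp(buf, "BBBB") == 0);

    // The descriptor's own offset was never moved by pread/pwrite.
    off_t shared_offset = lseek(fd, 0, SEEK_CUR);
    assert(shared_offset == 0);
    fclose(f);
}
```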
> C and C++ were not designed as multi-processing languages.
I'll disagree on the former. C's been used for UNIX kernel programming
almost as long as it's existed; not all such kernels have had multiple
CPUs (multiprocessing), but they've all been multithreaded. I'd really
not expect to see that key feature of C get lost by C++ !!
Then again, this is maybe crossing the line between "language" and
"environment" ... neither language makes any restrictions about MT,
but there are environments that do.
> The C and
> C++ libraries do not provide multi-processing primitives.
If you look at the POSIX threading specification you'll notice that
the C library's been extended to make it usable for MT programs; I
assume that the entire ANSI-C library was scrutinized. The rest of
the libraries are getting upgraded incrementally; for example, X11R6
is due to have MT-safe intrinsics and Xlib, and there are other MT
safe C library routines in various threaded OS platforms.
I think I have two points about the C++ issue: (a) let's not make
the same mistakes that the C libraries did, standardizing interfaces
that don't work well in MT programs; and (b) let's see those MT
aware C++ standards and libraries ASAP, so they can be sanity checks
on delivery of a MT-compatible standard for (a)!!
Through direct email you've said you're trying to address (a), which
is good. I can understand the desire not to require (b) from compiler
vendors that don't claim to support multithreaded C++ development;
I hope you can appreciate my desire to get (b) compatibly though!!
>
> Steve Clamage, TauMetric Corp, steve@taumet.com
---
David Brownell db@Eng.Sun.COM.
Distributed Object Management
Author: jimad@microsoft.com (Jim Adcock)
Date: 12 May 93 16:27:48 GMT
In article <9313117.8570@mulga.cs.mu.OZ.AU> fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON) writes:
|What about method 3?
| Foo *p;
| ...
| write(fd, &p, sizeof p);
| ...
| read(fd, &p, sizeof p);
What about it? Neither "read" nor "write" are described in the ANSI/ISO
C language spec, nor in ARM, thus neither are a C++ language standards
issue. Rather, with respect to the C++ language, these functions are
simply implementation issues that can be handled in a manner of the
implementation's choosing -- if the implementation chooses to support them
at all. Clearly there is an obvious implementation choice to handle
these implementation issues -- namely the implementation could choose
not to move objects, and thus "method 3" is neither a C++ language
issue nor a C++/GC language issue.
Author: mharriso@digi.lonestar.org (Mark Harrison)
Date: 4 May 93 18:56:05 GMT
Is any committee addressing the need for a set of standard library
components? I'm talking about strings, lists, associative arrays,
and the like.
Personally, I'd be very pleased if something like the USL standard
components were chosen.
--
Mark Harrison, mharriso@dsccc.com, (214)519-6517
Author: db@argon.Eng.Sun.COM (David Brownell)
Date: 5 May 1993 17:18:25 GMT
mharriso@digi.lonestar.org (Mark Harrison) writes:
> Is any committee addressing the need for a set of standard library
> components? I'm talking about strings, lists, associative arrays,
> and the like.
>
> Personally, I'd be very pleased if something like the USL standard
> components were chosen.
"Something like" is right -- I'd hate to see anything with an
API that's nonviable for multithreaded programs get standardized.
As a simple example of what I'm talking about: imagine two threads
execute the following line of code at "the same" time (i.e. on a
multiprocessor, or just interleaved due to scheduling):
stream << "this" << " is " << "a bug";
No way to guarantee that the output isn't interleaved, etc., as
there are no locks on the stream that'd allow threads to synchronize
their use. I'm told there are many less obvious problems too; they
may all be fixable, but I think that at least some APIs would need
to change to address threading.
If any standards committee DOES start looking at C++ class libraries,
please let's make sure that multithreading is addressed properly!!!
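
One way to picture "locks on the stream" is a wrapper that pairs a stream with a mutex, so that a whole chain of `<<` operations runs as one unit. This is only a sketch in present-day C++ (`std::mutex` and lambdas were not available when this was posted), and `LockedStream`, `locked`, and `demo_locked_stream` are hypothetical names, not part of any library discussed here.

```cpp
#include <cassert>
#include <mutex>
#include <ostream>
#include <sstream>

// Hypothetical wrapper: every use of the stream goes through locked(),
// which holds the mutex for the duration of the caller's output chain.
class LockedStream {
public:
    explicit LockedStream(std::ostream &out) : out_(out) {}

    template <typename Writer>
    void locked(Writer writer) {
        std::lock_guard<std::mutex> guard(mutex_);
        writer(out_);                       // the whole chain runs under the lock
    }

private:
    std::ostream &out_;
    std::mutex mutex_;
};

void demo_locked_stream() {
    std::ostringstream sink;                // stands in for the shared stream
    LockedStream stream(sink);
    stream.locked([](std::ostream &s) { s << "this" << " is " << "not a bug"; });
    assert(sink.str() == "this is not a bug");
}
```

The lock only helps if every thread goes through the wrapper; code that writes to the underlying stream directly is not synchronized.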
--
David Brownell db@Eng.Sun.COM.
Distributed Object Management
Author: steve@taumet.com (Steve Clamage)
Date: Thu, 6 May 1993 16:34:11 GMT
mharriso@digi.lonestar.org (Mark Harrison) writes:
>Is any committee addressing the need for a set of standard library
>components? I'm talking about strings, lists, associative arrays,
>and the like.
Yes, the ANSI/ISO C++ Committee is addressing the issue.
Iostreams will be standardized. There will almost certainly
be a standard string class. There will probably be a bitset class.
There might be an array class.
It is unlikely that there will be anything else put into the
forthcoming Standard. It is not clear that we will be able to
complete work on all of the classes mentioned above.
--
Steve Clamage, TauMetric Corp, steve@taumet.com
Author: harrison@sp1.csrd.uiuc.edu (Luddy Harrison)
Date: Fri, 7 May 93 19:24:35 GMT
Hans-J. Boehm writes:
>> Hence especially a copying garbage collector does almost
>> no work tracing through accessible objects, and allocation can be
>> very cheap (incrementing a pointer). ...
>> The performance of simple copying collectors is similarly predictable.
Matthew Austern writes:
>> I think we ought to take more seriously the idea that garbage
>> collection should be a central part of the C++ language; being forced
I'm somewhat puzzled by the absence of mention, in these discussions, of
the semantics of garbage collection. A copying garbage collector relocates
(that is, it changes the address of) a live object. This is a violation
of the semantics of C and C++ as far as I am aware. For example, if I write
the address of an object to secondary storage and later retrieve it,
I can expect the retrieved address to continue to point to the original object:
int *p = (int *) malloc(sizeof(int));
...
write(myfile,&p,sizeof(int*));
...
int *q;
read(myfile,&q,sizeof(int*)); // from the same position in myfile
// to which p was written
assert (p == q);
A similar example concerns hashing:
myclass *p;
...
int x = hash(p);
...
int y = hash(p);
assert(x == y);
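
The hashing example can be made concrete with a hash computed from the address itself. In this sketch `hash_address` and `demo_address_hash` are hypothetical names, and the "relocation" is simulated by copying the value into a second allocation, which is roughly what a copying collector would do. The multiplier is odd, so the multiplication is invertible modulo the word size and distinct addresses always give distinct hashes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical address-based hash: depends only on where the object lives.
std::uintptr_t hash_address(const void *p) {
    return reinterpret_cast<std::uintptr_t>(p) * 0x9E3779B1u;
}

void demo_address_hash() {
    int *p = new int(7);
    std::uintptr_t x = hash_address(p);
    std::uintptr_t y = hash_address(p);
    assert(x == y);                         // holds while the object stays put

    // Simulated relocation: same value, different address.
    int *moved = new int(*p);
    assert(*moved == *p);
    assert(hash_address(moved) != x);       // a table keyed on x no longer finds it

    delete moved;
    delete p;
}
```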
Now, the first example brings up a related but different consideration:
the references that cause a data item to be alive (that is, that cause it
not to be freed automatically by the garbage collector) must be visible to the
garbage collector. There are many means of "hiding" pointers, including but not
limited to writing them to secondary storage. I have several examples of large
C and C++ applications that use both of the techniques above (hashing pointers and
storing them in files).
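
Secondary storage is not needed to hide a pointer. As one small illustration (the names `hide`, `reveal`, `kMask`, and `demo_hidden_pointer` are hypothetical, and the mask value is arbitrary), an address can be kept only in disguised integer form, where no scan of memory would recognise it as a reference to the object:

```cpp
#include <cassert>
#include <cstdint>

// Arbitrary mask chosen for illustration only.
const std::uintptr_t kMask = 0x5A5A5A5Au;

// Stores the address as a disguised integer.
std::uintptr_t hide(int *p) {
    return reinterpret_cast<std::uintptr_t>(p) ^ kMask;
}

// Undoes the disguise and converts back to the original pointer.
int *reveal(std::uintptr_t hidden) {
    return reinterpret_cast<int *>(hidden ^ kMask);
}

void demo_hidden_pointer() {
    int *p = new int(5);
    std::uintptr_t hidden = hide(p);
    p = nullptr;                            // no variable holds the plain address now

    int *q = reveal(hidden);                // the object must still be alive here
    assert(*q == 5);
    delete q;
}
```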
Any standardization of garbage collection must look the semantics in the face and
prescribe exactly what will be collected, what will be moved, etc. This is an
entirely different matter than writing a garbage collector for Lisp or Smalltalk
or any other language which was designed with garbage collection in mind, from
its inception. In particular, in languages with garbage collection, one can
ordinarily not make use of addresses as values (that is, as integers). This is
a critical part of the semantics of C and C++, by contrast. I for one am skeptical
that there is a straightforward means of adding garbage collection to an already
monstrously complicated language standard like that of C++.
-Luddy Harrison
Center for Supercomputing Research and Development
University of Illinois at Urbana-Champaign
harrison@csrd.uiuc.edu
Author: grumpy@cbnewse.cb.att.com (Paul J Lucas)
Date: Sat, 8 May 1993 01:55:29 GMT
Author: steve@taumet.com (Steve Clamage)
Date: Fri, 7 May 1993 17:14:22 GMT
db@argon.Eng.Sun.COM (David Brownell) writes:
>"Something like" is right -- I'd hate to see anything with an
>API that's nonviable for multithreaded programs get standardized.
>As a simple example of what I'm talking about: imagine two threads
>execute the following line of code at "the same" time (i.e. on a
>multiprocessor, or just interleaved due to scheduling):
> stream << "this" << " is " << "a bug";
>No way to guarantee that the output isn't interleaved, ...
I don't believe this can be addressed in the C++ specification.
For example, you have the identical problem in C:
fprintf(stream, "%s", "this");
fprintf(stream, "%s", " is ");
fprintf(stream, "%s", "a bug");
(Yes, you could write this as one function call, but that doesn't solve
the larger problem.)
The larger problem is what happens when you flush the buffer used
by the C++ stream or the C FILE. How do you control interleaving
data from different tasks when a buffer flush might come at an
arbitrary point in doing output? That is, the buffer is flushed
while you are in the middle of writing one item. You don't (I think)
want to lock out other threads for the entire duration of a complete
output operation -- it could take many seconds.
Next, what happens if one thread seeks on a stream and another thread
uses the same stream? How does it find out that the stream moved
between two of its statements?
C and C++ were not designed as multi-processing languages. The C and
C++ libraries do not provide multi-processing primitives.
The answer to all the above questions is that the synchronization has
to occur somewhere else. Part of it will be at the application design
level, with semaphores around critical regions; this is above the level
of the C++ specification. Part of it will be in the implementation of
the I/O primitives; this is below the level of the C++ specification.
The C++ library specifications are, I think, thread-safe in the sense
of being implementable in a re-entrant way. With a minor exception
in iostreams (xalloc and bitalloc) there is no modifiable static data.
The one problem can be solved with semaphores around the (rare)
allocation of new data for the xalloc arrays. This is an implementation
detail, and need not be mentioned in the C++ specification.
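
The guarded allocation mentioned above might look something like the following. This is a sketch under assumptions, not the iostreams implementation: `allocate_index` is a hypothetical stand-in for the counter behind xalloc, and a present-day `std::mutex` plays the role of the semaphore.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the shared counter behind xalloc: the only
// modifiable static data, touched only while the lock is held.
int allocate_index() {
    static std::mutex guard;
    static int next_index = 0;
    std::lock_guard<std::mutex> lock(guard);
    return next_index++;                    // each caller gets a distinct index
}

void demo_allocate_index() {
    int first = allocate_index();
    int second = allocate_index();
    assert(second == first + 1);            // indices are handed out in sequence
}
```

Because the allocation is rare, holding the lock for its whole duration costs little.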
--
Steve Clamage, TauMetric Corp, steve@taumet.com