Topic: re-entrancy and C++


Author: db@argon.Eng.Sun.COM (David Brownell)
Date: 25 Sep 1993 16:56:23 GMT
pete@borland.com (Pete Becker) writes:

> Having all operator<<'s atomic isn't enough:

... though as was brought out earlier, a thread-safe iostream library
needs some work in this area to guarantee that different threads can
use operator<< and operator>> concurrently and not crash the program.

> And having each statement somehow become atomic isn't enough:

But from the iostream library point of view, these are the same point.

>  cout << "Hello ";
>  cout << "world!";
>  cout << endl;
>
> ...
>
>  My claim is that the synchronization issue is a program-level issue,
> not a class-level issue. The user of the iostream classes must handle
> synchronization of all operations on the same stream ...

(I think previous postings have made clear that "all" operations is an
overstatement.  "Some" operations ... forcing threads to synchronize to
establish an order when none is needed is an undesirable API style.)

Now go one step further and you will see where the API to iostreams
needs to change:  suppose there are two independently developed library
modules in the same address space, each with threads that use cout.

Since they are independently developed, they will NOT have randomly
chosen to use the same mechanism to synchronize sequences of operations
on the same stream:

    subsystem 1:
 mutex_lock (&my_mutex);
 cout << "Hello " << "World " << endl;
 mutex_unlock (&my_mutex);

    subsystem 2:
 mutex_lock (&this_subsystem_mutex);
 while (cout_inuse == TRUE)
     cond_wait (&cout_signal, &this_subsystem_mutex);
 cout_inuse = TRUE;
 mutex_unlock (&this_subsystem_mutex);

 ... considerably more complex "cout" use than above;
 ... the locking strategy here is desirable when locks
 ... are held for a long time

 mutex_lock (&this_subsystem_mutex);
 cout_inuse = FALSE;
 cond_signal (&cout_signal);
 mutex_unlock (&this_subsystem_mutex);

Each of those subsystems IN ISOLATION has synchronized use of cout
correctly.  But put them in the same address space and they don't work
together!!  To have the iostreams library behave reasonably in the face
of independently developed software, its API must be extended to include
some kind of per-stream synchronization primitive.
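The long-held locking strategy of subsystem 2 can be sketched in later C++ roughly as follows. This is a minimal illustration, assuming the C++11 thread library (which postdates this thread); `StreamReservation` and `reserve_and_write` are hypothetical names, not part of any iostream API. A short-held mutex protects an "in use" flag, and the flag, not the mutex, is what stays set during the long stretch of output.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical reservation object: the mutex is held only briefly to
// test or change the flag; the flag itself marks the stream as taken.
class StreamReservation {
public:
    void acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        // Wait while another thread has the stream reserved.
        available_.wait(lock, [this] { return !in_use_; });
        in_use_ = true;
    }
    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            in_use_ = false;  // clear the flag so a waiter can proceed
        }
        available_.notify_one();
    }
private:
    std::mutex mutex_;
    std::condition_variable available_;
    bool in_use_ = false;
};

// Two threads share one stream and one reservation object; each message
// is written with several insertions while the reservation is held.
std::string reserve_and_write() {
    StreamReservation reservation;
    std::ostringstream shared;
    auto worker = [&](const char* first, const char* second) {
        for (int i = 0; i < 50; ++i) {
            reservation.acquire();
            shared << first;
            shared << second << '\n';
            reservation.release();
        }
    };
    std::thread a(worker, "Hello ", "World");
    std::thread b(worker, "long ", "report");
    a.join();
    b.join();
    return shared.str();
}
```

This only works because both threads use the same reservation object; two subsystems that each bring their own would be back to the problem described above.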

>     The user of the iostream classes must handle
> synchronization of all operations on the same stream, and that in itself
> takes care of virtually all of the synchronization issues within the iostream
> classes themselves.

Pete, you seem to be making a point mostly that an iostream library,
like much -- but not all -- MT-unsafe code, can be used in an MT world
by applying application level locks.  Modulo issues like correct use of
globals instead of thread-specific data (e.g. errno in POSIX) this may
well be true in some environments.

However, the point I've been trying to make is that such an approach is
strongly deficient from the point of view of applications programmers, and
that a standard is needed for how to do MT programming with iostreams.

Regardless, I think the original question about issues programmers need
to be aware of in order to do MT programming in C++ has an answer!

- Dave

--
David Brownell                        db@Eng.Sun.COM.
Distributed Object Management





Author: pete@borland.com (Pete Becker)
Date: Sat, 25 Sep 1993 17:42:22 GMT
In article <ma8u1nINNgtd@exodus.eng.sun.com>,
David Brownell <db@argon.Eng.Sun.COM> wrote:
>
>Pete, you seem to be making a point mostly that an iostream library,
>like much -- but not all -- MT-unsafe code, can be used in an MT world
>by applying application level locks.  Modulo issues like correct use of
>globals instead of thread-specific data (e.g. errno in POSIX) this may
>well be true in some environments.
>
>However, the point I've been trying to make is that such an approach is
>strongly deficient from the point of view of applications programmers, and
>that a standard is needed for how to do MT programming with iostreams.
>
>Regardless, I think the original question about issues programmers need
>to be aware of in order to do MT programming in C++ has an answer!
>

 Admittedly I've been arguing a somewhat extreme view, mostly to provoke
some discussion. I suspect that many people panic when they hear the term
multi-threaded, and tend to want to do much more than is necessary because
they don't understand the issues.
 Just as fread() doesn't have to work correctly when passed an invalid
pointer for its FILE*, I don't think that a stream inserter has to work
correctly when called simultaneously from two different threads. A program that
does either of those things is erroneous, and it's not strictly necessary
for the library to protect itself from such misuse. It may be handy for the
library to protect programmers from their mistakes in such a case, but that
may also conceal their errors, resulting in a more difficult maintenance task
when modifications to that code lead to a condition that the library can't
handle.
 The point about independently developed libraries is a good one. This
is a problem that cannot be avoided by the programmers themselves, one that
probably needs to be addressed at some point. But it looks to me like the
solution is quite simple: something along the lines of associating a mutex with
each stream so that users of iostreams can rely on a consistent name for the
mutex that they have to use. Is more than that needed?
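A minimal sketch of what "a mutex associated with each stream" might look like, assuming the C++11 thread library (which postdates this thread). `stream_mutex`, `subsystem_one`, `subsystem_two` and `run_both` are hypothetical names; no iostream library is known to provide this exact interface. The point is only that independently written code agrees on one well-known way to find the lock for a given stream.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical helper (not part of any iostream library): hands out the
// one mutex associated with a stream, keyed by the stream's address.
std::mutex& stream_mutex(const std::ios_base& stream) {
    static std::mutex registry_lock;
    static std::map<const std::ios_base*, std::mutex> registry;
    std::lock_guard<std::mutex> guard(registry_lock);
    return registry[&stream];  // map entries keep their address after later inserts
}

// Two independently written "subsystems" that agree only on stream_mutex().
void subsystem_one(std::ostream& out) {
    std::lock_guard<std::mutex> hold(stream_mutex(out));
    out << "Hello " << "World " << '\n';
}

void subsystem_two(std::ostream& out) {
    std::lock_guard<std::mutex> hold(stream_mutex(out));
    out << "considerably ";
    out << "more complex use" << '\n';
}

// Runs both subsystems concurrently against one shared stream.
std::string run_both() {
    std::ostringstream shared;
    std::thread a([&] { for (int i = 0; i < 100; ++i) subsystem_one(shared); });
    std::thread b([&] { for (int i = 0; i < 100; ++i) subsystem_two(shared); });
    a.join();
    b.join();
    return shared.str();
}
```

A registry keyed by address is just one possible design; storing the lock inside the stream object itself would avoid the lookup but would require changing the stream classes.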
 -- Pete





Author: db@argon.Eng.Sun.COM (David Brownell)
Date: 17 Sep 1993 16:23:06 GMT
In article <1993Sep14.223418.22669@aquila.sni-usa.com> you write:
> Has anybody used C++ to develop threaded (single-processor) applications?,

Yes; the same code runs on multiprocessors too, of course.  The only thing
changing in threaded code going from uniprocessors to multiprocessors
is how quickly the bugs show up (faster on MPs).

> Are there any issues, problems, concerning the I/O stream libraries?,

Yes:  they don't work unless they've been either heavily modified
or reimplemented.  SunPro submitted a document to the ANSI C++
committee about some of the issues there; interfaces need to change,
and applications sometimes need to do locking.

> If there is a re-entrancy problem with this, is there a work-around?

Use stdio instead of iostreams.

Re other potential problems, if you use CFRONT's compiler runtime
(operator new/delete etc) you may be unable to use vector new/delete,
as most (all?) versions are MT-unsafe.  I can't speak for G++ or other
compilers, though I know folk who used a version to do MT development.

If you're interested in POSIX threads, you might want to make sure your
system vendor offers a solution whereby thread cancellation uses the
stack cleanup mechanism defined in C++ (unwinding and exceptions, so
destructors get called etc) instead of the rather ugly (even in C)
solution POSIX specifies.  I had occasion to note that Microsoft has
added extensions to C so that cancellation in NT looks like C++
exception cleanup.
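A minimal sketch of why unwinding-style cleanup is attractive, under the assumption that cancellation is delivered as a C++ exception (how cancellation is actually delivered is platform-specific and not shown here). `ScopedCleanup`, `cancelled_work` and `cleanups_after_cancel` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical RAII type: its destructor stands in for releasing a lock,
// closing a file, or any other per-thread cleanup.
struct ScopedCleanup {
    int& cleanups_run;
    explicit ScopedCleanup(int& counter) : cleanups_run(counter) {}
    ~ScopedCleanup() { ++cleanups_run; }
};

// Stand-in for thread work that is cancelled partway through. The throw
// models cancellation delivered as an exception (an assumption).
void cancelled_work(int& cleanups_run) {
    ScopedCleanup outer(cleanups_run);
    ScopedCleanup inner(cleanups_run);
    throw std::runtime_error("cancelled");
}

// Returns how many destructors ran while the stack was unwound.
int cleanups_after_cancel() {
    int cleanups_run = 0;
    try {
        cancelled_work(cleanups_run);
    } catch (const std::runtime_error&) {
        // Both guards have already been destroyed by the time we get here.
    }
    return cleanups_run;
}
```

With unwinding, no cleanup handler has to be registered by hand: whatever objects are live on the stack at the point of cancellation are destroyed in reverse order of construction.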

- Dave

--
David Brownell                        db@Eng.Sun.COM.
Distributed Object Management





Author: pete@borland.com (Pete Becker)
Date: Fri, 17 Sep 1993 17:11:06 GMT
In article <m9jp3aINNmp@exodus.eng.sun.com>,
David Brownell <db@argon.Eng.Sun.COM> wrote:
>
>> Are there any issues, problems, concerning the I/O stream libraries?,
>
>Yes:  they don't work unless they've been either heavily modified
>or reimplemented.  SunPro submitted a document to the ANSI C++
>committee about some of the issues there; interfaces need to change,
>and applications sometimes need to do locking.
>
>> If there is a re-entrancy problem with this, is there a work-around?
>
>Use stdio instead of iostreams.
>

 From a minimalist's perspective, re-entrancy problems in iostreams
are irrelevant, since any sensible multi-threaded program that shares streams
among its threads must do its own synchronization anyway to avoid interleaving
its data streams. The benefit from making iostreams re-entrant is that
they will be better able to survive misuse. That might make debugging easier,
but it is by no means essential to writing programs that work correctly.
 -- Pete





Author: db@argon.Eng.Sun.COM (David Brownell)
Date: 19 Sep 1993 18:07:43 GMT
>  From a minimalist's perspective, re-entrancy problems in iostreams
> are irrelevant, since any sensible multi-threaded program that shares streams
> among its threads must do its own synchronization anyway to avoid
> interleaving its data streams.

That's partially true for "cout << a << b << c" style usage, although
the iostream API needs extension to include synchronization primitives
so that all code can agree how to synchronize any given data stream.

Also, there can be other problems, such as (in POSIX) access to "errno"
from code that's not been compiled to be reentrant (-D_REENTRANT on
Solaris and some other MT UNIXes).

Two counterexamples, though:

  Case 1:  Interleaving of output is OK (e.g. log messages)

    Thread 1:   cout << "complete message\n"
    Thread 2:   cout << "other complete message\n"

 From the minimalist "user" point of view, no synchronization
 is needed and the library should support this directly.  I
 could understand a minimalist "provider" not wanting to do
 this work, but would not buy MT support from such providers.

  Case 2:  Mixed input and output

    Thread 1: cout << somevar
    Thread 2: cin >> othervar

 Without making iostreams MT-safe, legal implementations
 could crash when such code is interleaved by the thread
 scheduler (e.g. by timeslicing or multi-CPU scheduling).
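Case 1 can be sketched as follows: each thread formats its message in a stream private to that thread, then hands the finished text to the shared stream with a single insertion. `make_log_line` and `log_from_threads` are hypothetical names, and the sketch assumes the C++11 thread library, which postdates this thread. Whether that single insertion is safe without a lock is exactly the library guarantee under discussion; C++11 and later promise no data race on the standard stream objects while they stay synchronized with stdio, though characters from different threads may still interleave.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical helper: builds one complete log line in a stream private
// to the calling thread, so no other thread can touch it while it is
// being formatted.
std::string make_log_line(int thread_id, const std::string& text) {
    std::ostringstream line;
    line << "[thread " << thread_id << "] " << text << '\n';
    return line.str();
}

// Each thread performs exactly one insertion per complete message.
void log_from_threads() {
    std::thread t1([] { std::cout << make_log_line(1, "complete message"); });
    std::thread t2([] { std::cout << make_log_line(2, "other complete message"); });
    t1.join();
    t2.join();
}
```

The order of the two lines is not determined, which is acceptable for log messages of this kind.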

>  The benefit from making iostreams re-entrant is that
> they will be better able to survive misuse. That might make debugging easier,
> but it is by no means essential to writing programs that work correctly.

Depends what you mean by "misuse", and how tightly you control issues
like how to lock iostreams and when threads are created.  I'd claim most
library software needing tight control of such issues wasn't written to
be general, reusable software (i.e. not library-quality).

Ideally, one shouldn't require source code to all C++ code living in
your address space -- but without standards on how to acquire locks
(and when they're needed) it'd be hard to use locks to keep threads
from concurrently using the compiled iostream code.
--
David Brownell                        db@Eng.Sun.COM.
Distributed Object Management





Author: pete@borland.com (Pete Becker)
Date: Mon, 20 Sep 1993 17:12:14 GMT
In article <m9p7vfINN8d9@exodus.eng.sun.com>,
David Brownell <db@argon.Eng.Sun.COM> wrote:
>
>  Case 2:  Mixed input and output
>
>    Thread 1: cout << somevar
>    Thread 2: cin >> othervar
>
> Without making iostreams MT-safe, legal implementations
> could crash when such code is interleaved by the thread
> scheduler (e.g. by timeslicing or multi-CPU scheduling).
>

 True. But this code is basically useless, since it does not prompt
the user for input. Once you add a prompt (that is, once you start talking
about real programs and not oversimplified examples), the program itself must
take care of synchronizing these two threads so that the output of thread 1
does not appear between the prompt and the input in thread 2. Once you've done
that, the danger of a crash in the iostream code is gone.
 -- Pete





Author: fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON)
Date: Tue, 21 Sep 1993 15:16:22 GMT
pete@borland.com (Pete Becker) writes:

>David Brownell <db@argon.Eng.Sun.COM> wrote:
>>
>>  Case 2:  Mixed input and output
>>
>>    Thread 1: cout << somevar
>>    Thread 2: cin >> othervar
>>
>> Without making iostreams MT-safe, legal implementations
>> could crash when such code is interleaved by the thread
>> scheduler (e.g. by timeslicing or multi-CPU scheduling).
>
> True. But this code is basically useless, since it does not prompt
>the user for input.

What about the following sort of code?

 Thread 1: logfile1 << logmessage1
 Thread 2: logfile2 << logmessage2
 Thread 3: cin >> somevar

If iostreams is not MT-safe, could this crash?

--
Fergus Henderson                     fjh@munta.cs.mu.OZ.AU




Author: pete@borland.com (Pete Becker)
Date: Tue, 21 Sep 1993 16:08:35 GMT
In article <9326501.21631@mulga.cs.mu.oz.au>,
Fergus James HENDERSON <fjh@munta.cs.mu.OZ.AU> wrote:
>
>What about the following sort of code?
>
> Thread 1: logfile1 << logmessage1
> Thread 2: logfile2 << logmessage2
> Thread 3: cin >> somevar
>
>If iostreams is not MT-safe, could this crash?
>

 Probably. But there are two issues presented here. Thread 3 shows the
problem of synchronizing output with input, and is a program-level problem, as
I said before.

 Thread 1: logfile << logmessage1part1;
    logfile << logmessage1part2;
 Thread 2: logfile << logmessage2part1;
    logfile << logmessage2part2;

 If you don't want these messages interleaved, you have to synchronize
the threads yourself regardless of whether iostreams synchronize themselves.
If iostreams protect themselves from misuse you can get away with not
synchronizing them if you promise never to do more than one output statement
for each complete message. Sounds fragile to me...
 -- Pete




Author: fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON)
Date: Wed, 22 Sep 1993 05:53:16 GMT
pete@borland.com (Pete Becker) writes:

>In article <9326501.21631@mulga.cs.mu.oz.au>,
>Fergus James HENDERSON <fjh@munta.cs.mu.OZ.AU> wrote:
>>
>>What about the following sort of code?
>>
>> Thread 1: logfile1 << logmessage1
>> Thread 2: logfile2 << logmessage2
>> Thread 3: cin >> somevar
>>
>>If iostreams is not MT-safe, could this crash?
>
> Probably. But there are two issues presented here. Thread 3 shows the
>problem of synchronizing output with input, and is a program-level problem, as
>I said before.

Yes, you're right there are two issues.  But the point of Thread 3 is
that the input doesn't need to be synchronized with the output, since
the output from Threads 1&2 is going to logfiles, which presumably
don't need to be synchronized with input from the keyboard.

> Thread 1: logfile << logmessage1part1;
>    logfile << logmessage1part2;
> Thread 2: logfile << logmessage2part1;
>    logfile << logmessage2part2;

Hang on, I think you missed the point: my Thread1 and Thread2 were
outputting messages to _different_ log files.

> If you don't want these messages interleaved, you have to synchronize
>the threads yourself regardless of whether iostreams synchronize themselves.

But if the messages are going to different files, then you don't need
to synchronize the threads.

--
Fergus Henderson                     fjh@munta.cs.mu.OZ.AU




Author: pete@borland.com (Pete Becker)
Date: Wed, 22 Sep 1993 17:15:06 GMT
In article <9326515.11004@mulga.cs.mu.oz.au>,
Fergus James HENDERSON <fjh@munta.cs.mu.OZ.AU> wrote:
>pete@borland.com (Pete Becker) writes:
>
>> Thread 1: logfile << logmessage1part1;
>>    logfile << logmessage1part2;
>> Thread 2: logfile << logmessage2part1;
>>    logfile << logmessage2part2;
>
>Hang on, I think you missed the point: my Thread1 and Thread2 were
>outputting messages to _different_ log files.
>
>> If you don't want these messages interleaved, you have to synchronize
>>the threads yourself regardless of whether iostreams synchronize themselves.
>
>But if the messages are going to different files, then you don't need
>to synchronize the threads.
>

 Ok. I missed that. But since each logfile object has its own internal
state, the only possible problem that could come up would involve references to
global variables in the iostream code. I can only think of one part of
iostreams that necessarily involves globals, and that's bitalloc() and
xalloc(). Easy enough to make those functions thread-safe, without resorting
to the overkill of making all insertions do semaphore locking and unlocking.
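A minimal sketch of making a global index counter of the xalloc() kind thread-safe without locking every insertion, assuming an atomic increment is available (`std::atomic` is C++11, which postdates this thread). `next_index` and `distinct_indices` are hypothetical names; this is not the implementation of any particular iostream library.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Hypothetical stand-in for a library's global index allocator: the only
// shared state is one counter, bumped with a single atomic operation.
int next_index() {
    static std::atomic<int> counter{0};
    return counter.fetch_add(1);
}

// Calls next_index() from several threads and reports how many distinct
// values were handed out. The mutex protects only the bookkeeping set
// used for checking, not the counter itself.
std::size_t distinct_indices(int thread_count, int calls_per_thread) {
    std::mutex seen_lock;
    std::set<int> seen;
    std::vector<std::thread> pool;
    for (int t = 0; t < thread_count; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < calls_per_thread; ++i) {
                int index = next_index();
                std::lock_guard<std::mutex> guard(seen_lock);
                seen.insert(index);
            }
        });
    }
    for (std::thread& worker : pool) worker.join();
    return seen.size();
}
```

If no index is ever handed out twice, the number of distinct values equals the total number of calls, which is what the checking function measures.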
 -- Pete




Author: kanze@us-es.sel.de (James Kanze)
Date: 23 Sep 93 16:35:24
In article <1993Sep22.171506.23765@borland.com> pete@borland.com (Pete
Becker) writes:

|> In article <9326515.11004@mulga.cs.mu.oz.au>,
|> Fergus James HENDERSON <fjh@munta.cs.mu.OZ.AU> wrote:
|> >pete@borland.com (Pete Becker) writes:

|> >> Thread 1: logfile << logmessage1part1;
|> >>    logfile << logmessage1part2;
|> >> Thread 2: logfile << logmessage2part1;
|> >>    logfile << logmessage2part2;

|> >Hang on, I think you missed the point: my Thread1 and Thread2 were
|> >outputting messages to _different_ log files.

|> >> If you don't want these messages interleaved, you have to synchronize
|> >>the threads yourself regardless of whether iostreams synchronize themselves.

|> >But if the messages are going to different files, then you don't need
|> >to synchronize the threads.


|>  Ok. I missed that. But since each logfile object has its own internal
|> state, the only possible problem that could come up would involve references to
|> global variables in the iostream code. I can only think of one part of
|> iostreams that necessarily involves globals, and that's bitalloc() and
|> xalloc(). Easy enough to make those functions thread-safe, without resorting
|> to the overkill of making all insertions do semaphore locking and unlocking.

Isn't that what thread safety means?  Thread safety (to me, at least)
doesn't necessarily imply semaphore locking.  It simply implies that
the program will work in the presence of multiple threads, and that
any user constraints be documented.  (For example, are all single
operator<<'s guaranteed to be atomic, or must I lock myself to ensure
that characters in two messages are not interleaved?)

Whether the class attains this goal by careful design, by semaphore
locking, or a combination of both, is an implementation detail.
--
James Kanze                             email: kanze@us-es.sel.de
GABI Software, Sarl., 8 rue du Faisan, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                   -- Beratung in industrieller Datenverarbeitung




Author: fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON)
Date: Thu, 23 Sep 1993 16:37:46 GMT
pete@borland.com (Pete Becker) writes:

>Fergus Henderson <fjh@munta.cs.mu.OZ.AU> wrote:
>
>>But if the messages are going to different files, then you don't need
>>to synchronize the threads.
>
> Ok. I missed that. But since each logfile object has its own internal
>state, the only possible problem that could come up would involve references to
>global variables in the iostream code. I can only think of one part of
>iostreams that necessarily involves globals, and that's bitalloc() and
>xalloc(). Easy enough to make those functions thread-safe, without resorting
>to the overkill of making all insertions do semaphore locking and unlocking.

Yes, I agree that that would be overkill.
Ensuring that iostreams are thread-safe when each thread is writing
to a different stream would still require some explicit wording in
the standard.

--
Fergus Henderson                     fjh@munta.cs.mu.OZ.AU




Author: pete@borland.com (Pete Becker)
Date: Thu, 23 Sep 1993 17:20:49 GMT
In article <KANZE.93Sep23163524@slsvhdt.us-es.sel.de>,
James Kanze <kanze@us-es.sel.de> wrote:
>
>Isn't that what thread safety means?  Thread safety (to me, at least)
>doesn't necessarily imply semaphore locking.  It simply implies that
>the program will work in the presence of multiple threads, and that
>any user constraints be documented.  (For example, are all single
>operator<<'s guaranteed to be atomic, or must I lock myself to ensure
>that characters in two messages are not interleaved?)
>

Having all operator<<'s atomic isn't enough:

 cout << "Hello " << "world!" << endl;

And having each statement somehow become atomic isn't enough:

 cout << "Hello ";
 cout << "world!";
 cout << endl;


>Whether the class attains this goal by careful design, by semaphore
>locking, or a combination of both, is an implementation detail.

 My claim is that the synchronization issue is a program-level issue,
not a class-level issue. The user of the iostream classes must handle
synchronization of all operations on the same stream, and that in itself
takes care of virtually all of the synchronization issues within the iostream
classes themselves.
 Of course, a program that doesn't synchronize io correctly will benefit
from synchronization within the iostream classes, in that it will produce
interleaved output rather than crashing. But that's a debugging issue...
 -- Pete







Author: santi@camus.sni-usa.com (Santiago Paredes)
Date: Tue, 14 Sep 1993 22:34:18 GMT
Has anybody used C++ to develop threaded (single-processor) applications?
Are there any issues or problems concerning the I/O stream libraries?
If there is a re-entrancy problem with this, is there a work-around?

Thank you,

-Santi