Topic: many semaphores
Author: Daniel Miller <daniel.miller@tellabs.com>
Date: Sat, 11 May 2002 03:23:07 GMT
Inclusion of comp.std.c++ is due to my discussion below regarding how the
Boost community has been going (and proposes to continue going) in the wrong
direction with respect to Boost.threads' omission of the following mechanisms:
semaphore, multiproducer-multiconsumer message-queues, and
thread-selection-by-kernel pool-allocation. This whole topic came up on
comp.programming.threads, but applies as well to what the multithread
capabilities of C++ are to be in C++0x.
Joe Seigh wrote:
>
> Vu Pham wrote:
>
>>Joe Seigh <jseigh@genuity.com> wrote in message
>>[...]
>>
>>>>The idea of having a separate semaphore for each device is to prevent
>>>>the case that InThread for data of device A blocks OutThread for data
>>>>of device B.
>>>>
>>>I think you ended up guaranteeing it instead. If OutThread is blocked
>>>on device A then it cannot process device B no matter how ready it is.
>>>
>>Yes, if OutThread is busy with a device A, it will not process other
>>devices. But at least it will when InThread processes a device != A.
>>
>
> Yes, but do you want OutThread to not process any ready devices until
> an ack is received on device A no matter how long that takes and no
> matter how much work is pending for devices that are ready?
>
>>>>Is there any reason not to create so many semaphores ( resource,
>>>>performance, ... ) ?
>>>>
>>>>
>>>In this case, 1 is too many. Semaphores are inappropriate to the solution
>>>of this problem. Try a condition variable in conjunction with some
>>>data structure or something. Otherwise you risk becoming a poster child
>>>for why semaphores are a Bad Thing.
>>>
>>Thanks. I found one case where a semaphore is not good for this problem
>>when there are lost packets, and now I am using a mutex to control the
>>access. But besides the inappropriate logic in this problem, would you
>>please explain to me why using many semaphores is not very good? How
>>about multiple mutexes?
Viewpoints such as the posting quoted below parallel the mistakes made
within the Boost community regarding the intentional removal of semaphores from
their Boost.threads library. Although suggestions to remove the semaphore from the
original poster's design are warranted, the posting quoted below instead rails
hard against all usage of semaphore presumably throughout the entire software
industry. No thank you! For those readers who wish to hear an opposing
positive treatment regarding the usefulness of semaphores, please read my recent
posting to comp.programming.threads:
http://groups.google.com/groups?hl=en&selm=3CD1864A.10401%40tellabs.com&rnum=9
or the whole thread in context
http://groups.google.com/groups?hl=en&threadm=2e53duk57rgfptgh684f01olnm3e5u69v9%404ax.com&rnum=1&prev=/groups%3Fq%3DDaniel%2BMiller%2Bgroup:comp.programming.threads%26hl%3Den%26selm%3D2e53duk57rgfptgh684f01olnm3e5u69v9%25404ax.com%26rnum%3D1
Or read about real-time embedded programming in general, especially from the
multithreaded asynchronous message-queue school of thought embraced by most
RTOSes (most notably: pSOS+ and VxWorks).
And/or read onward:
> The semaphore "solution" doesn't really apply cleanly to anything, not
> even the fabled single producer / single consumer problem.
First of all, for readers who might be entertaining such broadly
anti-semaphore claims, it is not a fable. The producer-consumer mechanism has
been used for decades as a part of embedded software. The canonical approach to
the producer-consumer mechanism in the RTOS community is called a message-queue,
which acts as a conveyor-belt of sorts, such as between threads within the same
address-space or between processes/separate-address-spaces with shared memory.
Fully leveraging this conveyor-belt analogy in an MT-software
design/architecture typically provides the opportunity for clean & complete
hand-off of a unit-of-work from one thread to another or from one thread out of
many to a kernel-selected thread within a loose pool of threads (where "pool of
threads" is defined to be one or more threads pending on a single message-queue).
Secondly, again for readers who might be entertaining such broadly
anti-semaphore claims, "single producer / single consumer" misrepresents the
topic. When one models the producer-consumer mechanism with a semaphore, one
does not get a *single-thread* producer to *single-thread* consumer mechanism,
one gets a *one-or-more-thread* producer to *one-or-more-thread* consumer
mechanism which can be highly useful for interthread hand-off of units-of-work
to load-sharing pools of threads. When a semaphore models an interthread
message-queue mechanism:
1) multiple threads may post to the same message-queue and
2) multiple threads may pend on the same message-queue.
The former is a wonderful way for multiple producer-threads to submit
units-of-work which are off-topic in their mission to some message-queue that
feeds a pool of threads whose mission it is to perform that category of
unit-of-work. The latter is a wonderfully-efficient low-overhead load-sharing
thread-pool mechanism where the scheduling of threads is done (implicitly) via
thread-selection in the kernel-layer instead of via a manually-written
less-efficient God-controller in user-layer. Unfortunately all of this
real-time embedded software-architectural elegance is apparently lost on people
who subscribe to the overly-academic anti-semaphore viewpoints, because they do
not focus on how the producer-consumer/message-queue mechanism fits into a
robust, stable, useful overall *asynchronous* architecture for multithreaded
software, such as real-time embedded systems---e.g., telephone equipment,
datacom equipment, high-availability servers, postal equipment, nuclear power
plants, aircraft, marinecraft, military vehicles, military weapons.
> Unless of
> course you never want to shut it down. If you did want it to shut
> down you could add a flag and check it after the sem_wait(). Kludgey,
> but not too horribly kludgey. But you don't get clean shutdown that
> way, just abort shutdown since you don't know whether you've processed
> all the buffers or not, not without some really kludgey logic.
Once again for readers who might be entertaining such broadly
anti-semaphore claims, "you don't get clean shutdown" misrepresents the topic.
When wanting to shutdown a message-queue (the canonical producer-consumer
mechanism in real-time software) one simply posts a shutdown message to the
message-queue which is enqueued in FIFO order behind all priorly-posted requests
for units-of-work/messages. Shutdown is a valid nonarcane behavior for a
message-queue, and thus can be considered part of the primary mission of a
message-queue and thus can be part of the public interface of such a
message-queue class. Upon dequeuing a shutdown request, the message-queue emits
to all pending threads some form of don't-pend-on-me-anymore-I-am-going-away
indicator (e.g., return code or exception, depending on local design's taste).
Alternatively, in certain designs, the shutdown message might be processed by
the awoken thread which was pending on the queue. In either case: no kludge, just
expected, squarely-within-stated-mission behavior: make sure that all
already-queued units-of-work are permitted to be drawn down.
Likewise, if a semaphore is used to model permission-to-allocate from a pool
of limited resources, shutdown has an equally clean & robust solution. Either
this application of semaphore can likewise have an explicit public shutdown
behavior which suppresses further allocation from the pool (because graceful
shutdown would imply that the use of the pool has reached quiescence: no member
of the pool is in use). Or considering a more garden-variety semaphore, the
thread requesting the shutdown requests n resources (i.e., makes n one-resource
blocking-requests serially) so that the software accomplishing the graceful
shutdown hogs all n resources in an n-sized pool to assure that none are in use.
Again, no kludge: the semaphore works smoothly as designed & applied,
enforcing the quiescence which is part of the graceful shutdown: make sure that
no member of the pool is in use.
> If you deconstruct the XRAT (design rationalization) for semaphores,
> you can surmise that somebody thought that semaphores could be
> implemented more efficiently than condition variables,
If the reader were to obediently comply with such a line of reasoning here, the
reader might think that a producer-consumer RTOS-style message-queue can be
implemented (nearly-)equally efficiently either via a semaphore or via a
condition-variable. Let's see if such apparent claims bear out by using POSIX
semaphores & POSIX condition-variables.
Before proceeding, please read the following sections from the POSIX standard
now:
http://www.opengroup.org/onlinepubs/007904975/functions/pthread_cond_signal.html#tag_03_515_07
(especially the 1st paragraph in the APPLICATIONS section)
http://www.opengroup.org/onlinepubs/007904975/functions/pthread_cond_signal.html#tag_03_515_08
(especially the 1st & 2nd paragraphs after the code in the RATIONALE section)
One might use pthread_cond_broadcast for multiprocessors as suggested above
within the POSIX standard. But this implementation would perform worse than
semaphore on uniprocessors, because all pending threads would awake and evaluate
their logical-predicate in the condition-variable implementation. If m threads
are waiting for permission to acquire a resource and if only one resource is
available for acquisition, then m-1 threads wasted their time with relatively
expensive thread-switches if interthread-intraprocess or process-switches if
interprocess-intraprocessor. Note that in the semaphore implementation the
kernel would awake exactly one thread. In this case semaphore would be more
efficient than condition-variable.
One might use pthread_cond_signal to avoid this inefficiency on
uniprocessors. As alluded to in the 1st paragraph of the APPLICATIONS section,
this implementation would perform worse than semaphore on multiprocessors,
because only one pending thread would run at a time even if more than one
resource was available for acquisition. If m threads are waiting for permission
to acquire a resource and if the multiprocessor has p processors where p >= m
and if r resources come available for acquisition where r >= m, then a serial
awaking of one thread at a time could occur where m threads could have awakened
concurrently. Note that in the semaphore implementation the kernel could awake
up to m threads concurrently. In this case semaphore would be more efficient
than condition-variable.
Furthermore semaphore *must* be used instead of condition-variable if the
posting/incrementing/resource-now-available indicator is to originate from
within an interrupt-style signal handler. For an explanation, read the 2nd
paragraph of the pthread_cond_signal APPLICATIONS section at:
http://www.opengroup.org/onlinepubs/007904975/functions/pthread_cond_signal.html#tag_03_515_07
Compare & contrast with no such restriction for semaphores:
http://www.opengroup.org/onlinepubs/007904975/functions/sem_post.html
> If you deconstruct the XRAT (design rationalization) for semaphores,
> you can surmise that somebody thought that semaphores could be
> implemented more efficiently than condition variables, and so
> it might be useful to have that efficiency in case it could be
> exploited. Semaphores are certainly not necessary. You can get
> along fine without them.
Why can't those people who hold anti-semaphore viewpoints simply avoid
using semaphores in their own software instead of attempting to engage in an
industry-wide suppression of semaphores' correct & beneficial usage? Semaphores
(as well as semaphores' canonical application to producer-consumer
message-queues and thread-selection-by-kernel pool-allocation) are certainly
useful, robust, efficient, and elegant in multithreaded software, especially in
real-time embedded systems. Thus semaphores deserve to be part of C++0x's
multithreaded feature-set. What I can get along fine without is the
anti-semaphore/anti-message-queue/anti-pool-allocator viewpoints which seek to
harshly exclude the long-established practices of real-time embedded software
which is based on semaphores and message-queues. Likewise, because these
viewpoints have dominated the Boost.threads library, I can get along fine
without the current & proposed chronic limitations of the Boost.threads library
which currently lacks semaphore, multiproducer-multiconsumer message-queues, and
thread-selection-by-kernel pool-allocators and is intended to continue lacking
these mechanisms.
If these anti-semaphore viewpoints were merely a few lone voices at the
fringes of the universe, I would simply ignore them. But these viewpoints have
been steering the direction of Boost.threads which thinks that it is on track to
become part of C++0x's standard libraries. (Egads!) Some of us in the
real-time embedded systems world perceive the frontal assault on semaphores (and
the related assault on message-queues which support multiple producers posting
to a message-queue pended on by each member of a pool of consumers) to be a direct
threat to the applicability of C++0x's libraries to the real-time embedded
software world. If the multithread model of C++0x or its libraries is contorted
to the point at which the MT-classes cannot (or do not directly) support
time-honored embedded real-time systems' multithread concepts
(semaphores & multiproducer-multiconsumer message-queues), then C++0x would
shut out a portion of its heritage: software to be embedded within telephone
equipment. Remember that Bjarne Stroustrup worked for a telephone-equipment
manufacturer when he evolved C into C++ [AT&T: at the time both a telephone
service provider and a telephone equipment manufacturer; the equipment arm is
now Lucent] and that a
large motivation for the leanness & efficiency of C++ was to scale down to the
same spaces that C plays well in, including embedded real-time systems,
specifically onboard network elements in the telecommunications industry.
> You should do your design using the more general mechanism, condition
> variables, and then, if it lends itself to a semaphore solution without
> too much contortion, and if the semaphore implementation actually is
> in fact significantly more efficient, and if performance really is that
> much of a critical issue, then by all means, go ahead and use semaphores.
>
> In the case of your problem a solution would be use some sort of data
> structure, set, queue, priority queue, or whatever, to communicate to
> OutThread which devices were ready to communicate with. You may be
> using more than one mutex since there is the data structure itself
> and data associated with each device such as a count.
Once again, such viewpoints misrepresent the finer points of the topic. By
focusing the solution heavily on mutexes (i.e., *synchronization* mechanism),
such viewpoints divert the reader from easily seeing the primary purpose of the
multiproducer-multiconsumer message-queues: Asynchronous interaction between
threads, with emphasis on the "a" of asynchronous. Yes, the
multiproducer-multiconsumer message-queue would need to protect itself extremely
briefly during pushPost and popPend operations, but this would be within the
internal operations of the message-queue itself, typically eliminating
interthread synchronization when accessing the units-of-work conveyed by the
message-queue. The application-domain would not need to protect the data which
the message-queue is conveying if the canonical rule regarding the use of
message-queues were to be obeyed: the posting/pushing/producing thread
relinquishes all remembrance of the data which it pushed into the message-queue
and the pending/popping/consuming thread relinquishes all of its remembrance of
the data which it popped from the message-queue. Just as mutexes & read-write
locks are mechanisms of interthread *synchronization*, message-queues are a
mechanism of *asynchronous* hand-off of units-of-work between threads.
Rather than recognize that semaphores & the multiproducer-multiconsumer
message-queue mechanism can be written once & for all as useful general-purpose
infrastructure within a (standard) library, the
anti-semaphore/condition-variables-only viewpoint causes an erosion of
OO-encapsulation which otherwise would be possible with a semaphore class.
Instead of encapsulating the granting of permission to access a resource (i.e.,
producer-consumer's produced resource; message-queue's posted/pushed message;
pool's allocatable limited resource) inside a class named semaphore (or inside
each of two classes named producer_consumer and pool_dispenser), this granting
of permission to access a resource is inappropriately placed outside of the
infrastructural (standard) library (layer-N) into the application-domain
(layer-N+1) in a rather messy way. This
condition-variable-with-predicate-evaluation-in-the-layer-N+1-application-domain
violates my "Are the guts hanging out?" test of whether a design violates
OO-encapsulation.
Furthermore, the thread-selection-by-kernel allocation-from-pool mechanism
has a corresponding deallocation-back-to-pool behavior. Leveraging the C++
adage "resource allocation on ctor, resource deallocation on dtor" for
exception-safety, the pool_dispenser class would have a guard class whose ctor
acquired permission to access a member of the pool and whose dtor released
permission to access a member of the pool. Because the condition-variable
solution has violated OO-encapsulation, there is no ctor for
permission-acquisition nor any dtor for permission-release. Thus there is less
exception-safety with the condition-variable approach. Thus there is the
potential for quite harmful resource leaks in the condition-variable approach
which is not possible with the semaphore/pool_dispenser approach with its
accompanying guard class.
One likes to get dial-tone fast when picking up the telephone handset,
correct? One likes for the telephone switch in one's community to handle
periods of heavy loads without loss of service, correct? One likes for the
weaponry of one's military to aim & fire rapidly in real-time, correct? One
likes for one's air-traffic control systems (or the airplane's control systems)
to operate with low-latency in real-time, correct? Building *asynchronous*
(with the emphasis on the "a") embedded real-time systems goes a long way toward
accomplishing those goals and has for decades. Focus less on
interthread/interprocess *synchronization* and more on clean producer-consumer
hand-off mechanisms enabling thread/process *asynchronization* and one will tend
to have an MT-software-system which has higher sustained throughput even under
heavy loads.
Please ensure that C++0x supports mechanisms for *asynchronous* hand-off of
units-of-work between threads via multiproducer-multiconsumer message-queues as
much as it supports mechanisms for interthread *synchronization*.
> So OutThread logic is going to look something like
>
> while (outbound data) {
> while (ready devices) {
> ...
> }
> }
>
> which translated into CV logic
>
> while (outbound data) {
> pthread_mutex_lock();
> while (no ready devices) {
> pthread_cond_wait();
> }
> ...
> pthread_mutex_unlock();
> ...
> }
>
> Joe Seigh
<glossary>
RTOS = real-time operating system
MT = multithread(ed)
Message-queue = In this context all references to message-queue are
leveraging a conceptual C++ extrapolation of pSOS+'s interthread message-queues.
Note that these message-queues differ from POSIX or System V message-queues in
that POSIX/System-V message-queues dictate a message-queue-buffer/payload
structure, whereas pSOS-style message-queues are a conveyor for *any* type
(generally, a pointer to any type, versus a single integer in pSOS)
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]