Thread

Topic: Is STL a good standard? (STL library - What is it, exactly?)

Author: tmb@arolla.idiap.ch (Thomas M. Breuel)
Date: 14 Oct 1994 23:16:16 GMT Raw View

In article <9428319.9181@mulga.cs.mu.OZ.AU> fjh@munta.cs.mu.OZ.AU (Fergus Henderson) writes:
|kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425) writes:
|
|>fjh@munta.cs.mu.OZ.AU (Fergus Henderson) writes:
|>
|>|> Note that hashed associative containers would need to have stricter
|>|> requirements in this respect - the specification should say that
|>|> insertion or deletion invalidates all iterators referring to elements
|>|> in the collection.
|>
|>Why?
|
|To give the implementor maximum flexibility to implement them efficiently.
|(If the user wants the additional functionality rather than efficiency,
|they could use the ordered associative containers instead.)

The very idea of giving "the implementor maximum flexibility" in
anything in a standard template library is flawed.

For example, hash tables for serial machines are perfectly well
understood.  There is nothing (except for playing around a little with
word sizes and storage layout/alignment) that will uniformly increase
performance of a hashtable implementation on some standard
architectures but not on others.

A STL should pick an implementation and specify and document it
meticulously and in great detail.  Allowing different vendors to
experiment with different implementations only leads to unpleasant
surprises for users as they are trying to port their code.

    Thomas.

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Thu, 6 Oct 1994 20:38:47 GMT Raw View

In article <nagleCwHnpC.76B@netcom.com> nagle@netcom.com (John Nagle) writes:
>       Overall, I think the big mistake with STL was trying to replicate
>C pointer semantics, instead of providing wrappers for the built-in
>types with better semantics.

 The whole point of the STL design is to replicate C
pointer semantics so you can use C pointers as iterators.

 Your argument is tantamount to saying we should
discard C altogether. Technically, there is no doubt about that
in my mind, but the conclusion is that if you discard C you
might as well use a "real" modern language like Modula or Eiffel
which have been designed without the constraints of backward
compatibility.

 Backward compatibility has commercial advantages.
Thats why I use C++ as a programming language.

 C++ is bound by constraints of C compatibility:
deliberately: and STL carries on that philosophy faithfully.
As it should.
--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,
        81A Glebe Point Rd, GLEBE   Mem: SA IT/9/22,SC22/WG21
        NSW 2037, AUSTRALIA     Phone: 61-2-566-2189

Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Mon, 10 Oct 1994 09:26:57 GMT Raw View

kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425) writes:

>fjh@munta.cs.mu.OZ.AU (Fergus Henderson) writes:
>
>|> Note that hashed associative containers would need to have stricter
>|> requirements in this respect - the specification should say that
>|> insertion or deletion invalidates all iterators referring to elements
>|> in the collection.
>
>Why?

To give the implementor maximum flexibility to implement them efficiently.
(If the user wants the additional functionality rather than efficiency,
they could use the ordered associative containers instead.)
Also, it makes implementation easier.

>I suspect that this depends on the actual implementation of the hash
>table, and in particular, its collision handling.  I've taken the easy
>way out; the hash code simply selects a bucket, which contains a
>linked list of all elements which hash identically.  Although simple,
>this works well if the hash function is good and the table correctly
>dimensioned.

In general, working out the best size for the table can be difficult;
for example, it may depend on the input to the program.  If you choose
a table size that is too small, performance will degenerate if you
insert too many items into the table.  For this reason, the hash table
should automatically resize itself if it becomes too full.  The resize
operation might invalidate all the iterators.

--
Fergus Henderson - fjh@munta.cs.mu.oz.au

Author: kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425)
Date: 10 Oct 1994 18:42:10 GMT Raw View

In article <9428319.9181@mulga.cs.mu.OZ.AU> fjh@munta.cs.mu.OZ.AU
(Fergus Henderson) writes:

|> kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425) writes:

|> >fjh@munta.cs.mu.OZ.AU (Fergus Henderson) writes:
|> >
|> >|> Note that hashed associative containers would need to have stricter
|> >|> requirements in this respect - the specification should say that
|> >|> insertion or deletion invalidates all iterators referring to elements
|> >|> in the collection.

|> To give the implementor maximum flexibility to implement them efficiently.
|> (If the user wants the additional functionality rather than efficiency,
|> they could use the ordered associative containers instead.)
|> Also, it makes implementation easier.

|> >I suspect that this depends on the actual implementation of the hash
|> >table, and in particular, its collision handling.  I've taken the easy
|> >way out; the hash code simply selects a bucket, which contains a
|> >linked list of all elements which hash identically.  Although simple,
|> >this works well if the hash function is good and the table correctly
|> >dimensioned.

|> In general, working out the best size for the table can be difficult;
|> for example, it may depend on the input to the program.  If you choose
|> a table size that is too small, performance will degenerate if you
|> insert too many items into the table.  For this reason, the hash table
|> should automatically resize itself if it becomes too full.  The resize
|> operation might invalidate all the iterators.

This almost suggests the need of two types of hash tables.  In my
applications, I use hash tables principally for mapping; the maximum
number of entries is known when the program is written.  For this
reason, my tables all take a size parameter to the template.  (And if
we were wrong concerning the maximum number of entries, performance
suffers.)  On the other hand, it is convenient to be able to modify
the table while iterating.
--
James Kanze      Tel.: (+33) 88 14 49 00     email: kanze@lts.sel.alcatel.de
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung

Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Sun, 2 Oct 1994 11:48:58 GMT Raw View

mbk@inls1.ucsd.edu (Matt Kennel) writes:

>James Kanze US/ESC 60/3/164 #71425 (kanze@us-es.sel.de) wrote:
>
>: Then require the user to provide a global function `hash( Key const& )'.
>
>And why then a *global* function?

Because you need to be able to hash builtin types such as `const char *',
and that can't be done with a member function.

--
Fergus Henderson - fjh@munta.cs.mu.oz.au

Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Sun, 2 Oct 1994 17:44:38 GMT Raw View

kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425) writes:

>I think that John's point is valid.  STL doesn't just provide
>containers.  As I understand it, it provides a number of template
>functions to manipulate containers (and their elements).

Yes, STL includes both template container classes (with various member
functions such as insert and erase), and also algorithms to manipulate the
container classes (implemented as template functions).

>I am a bit
>disappointed that it doesn't implement hash coding (which I generally
>prefer over tree structures), but I can accept it.

Fair enough.  Personally I think that it is a significant loss.

>I find it much
>more serious that it cannot manipulate such containers with its
>template functions.  This isn't just saying, I don't offer hash
>tables; this is saying, I don't want you to use hash tables.

But it *can* manipulate such containers with its template functions.
The template functions all access the containers only via iterators.
So long as you provide an STL-compliant iterator for your hash table
based collection classes, you can use the STL template functions.

>Wouldn't it be preferable to define two types of associative
>containers, ordered and unordered.  Although the names sound
>exclusive, I think that we can agree that the ordered associative
>containers subsumes the unordered ones.  (Every ordered container is
>also an unordered one.)  The STL is then free to offer only the
>ordered ones, but for the functions where an unordered one makes
>sense, I can use one of my own (or Booch's, for example).

Since STL template functions already access collections via iterators,
your suggestion reduces to just changing the name of the STL
associative containers to "ordered associative containers".
STL already satisfies your other requirements.

--
Fergus Henderson - fjh@munta.cs.mu.oz.au

Author: shoe@objectSpace.com (Brett L. Schuchert)
Date: Sun, 2 Oct 1994 21:03:04 GMT Raw View

Why hasn't the subject line of this thread changed to:
  "Hash funcitons - what your Grandmother has always been
   afraid to ask."

shoe

Author: kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425)
Date: 05 Oct 1994 13:51:32 GMT Raw View

In article <9427501.3099@mulga.cs.mu.OZ.AU> fjh@munta.cs.mu.OZ.AU
(Fergus Henderson) writes:

|> Note that hashed associative containers would need to have stricter
|> requirements in this respect - the specification should say that
|> insertion or deletion invalidates all iterators referring to elements
|> in the collection.

Why?  My SetOf and AssocArrayOf classes are based on hashing
techniques.  Their iterators allow insertion and deletion on the
container.  Deletion of the element under the iterator invalidates the
iterator, but you can still increment an invalide iterator, after
which it becomes valid again.

I suspect that this depends on the actual implementation of the hash
table, and in particular, its collision handling.  I've taken the easy
way out; the hash code simply selects a bucket, which contains a
linked list of all elements which hash identically.  Although simple,
this works well if the hash function is good and the table correctly
dimensioned.
--
James Kanze      Tel.: (+33) 88 14 49 00     email: kanze@lts.sel.alcatel.de
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung

Author: rfg@netcom.com (Ronald F. Guilmette)
Date: Thu, 29 Sep 1994 09:05:31 GMT Raw View

In article <1994Sep28.140438.3967@sco.COM> simon@sco.COM (Simon Tooke) writes:
>In <rfgCwpI5J.37D@netcom.com> rfg@netcom.com (Ronald F. Guilmette) writes:
>>P.S.  For the benefit of those who are not already aware of it, there will
>>be _no_ rationale document for the ANSI/ISO C++ standard.  The first order
>>reason for this is that no one wanted to write one.  Determination of the
>>second order reason(s) (i.e. the reason(s) why no one wanted to write one)
>>is left as an exercize for the reader.
>
>Incorrect.  Many people (including myself) wanted to write one.

I stand corrected.  I should have said that no one wanted to write one
bad enough to actually do it.

>The problem was time constraints.

Isn't it always?  I guess one might say that the reason we haven't yet
solved the problem of world poverty is `time constraints'.  That may be
a perfectly true statement as it stands, but it doesn't carry a great
deal of semantic content.

--

-- Ron Guilmette, Sunnyvale, CA ---------- RG Consulting -------------------
---- domain addr: rfg@netcom.com ----------- Purveyors of Compiler Test ----
---- uucp addr: ...!uunet!netcom!rfg ------- Suites and Bullet-Proof Shoes -

Author: kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425)
Date: 30 Sep 1994 18:35:16 GMT Raw View

In article <DOUG.94Sep30083832@monet.ads.com> doug@monet.ads.com (Doug
Morgan) writes:

|> In article <KANZE.94Sep30131604@slsvhdt.us-es.sel.de> kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425) writes:
|> |rridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:
|> |...
|> |Examples of objects without total ordering: complex, SetOf<T>,
|> |PrintedCircuitBoard.

|> This isn't entirely correct.  The axiom of choice is equivalent to the
|> property that every set can be well ordered (Zermelo, 1904).

I expressed myself poorly.  My point was that there is not an implicit
total ordering function immediately available.

|> Well
|> ordering implies ordering.  Therefore you have to deny the axiom of
|> choice to say that some set cannot be ordered.  Such denial is
|> possible and is regularly done, but usually not lightly.  Also,
|> complex numbers can be lexicographically ordered by, say, real part
|> first then imaginary part.  If T's are ordered (and the hypothesis was
|> that all objects could be ordered), then finite SetOf<T>'s can be
|> ordered by sorting the sets and doing a sequence comparison.  I'm not
|> sure about PrintedCircuitBoards, but I suspect that given a spec for
|> such objects it would be easy to design some ordering (not to say the
|> the ordering would be easy to compute).

I believe that you can use the same basic algorithm I use for hashing
(recursive decomposition until you get to something that can be mapped
to an unsigned) can also be used for ordering.  With regards to the
posting which inspired my comments, however, this is not the identity
function.
--
James Kanze      Tel.: (+33) 88 14 49 00     email: kanze@lts.sel.alcatel.de
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung

Author: matt@physics16.berkeley.edu (Matt Austern)
Date: 30 Sep 1994 19:06:52 GMT Raw View

In article <DOUG.94Sep30083832@monet.ads.com> doug@monet.ads.com (Doug Morgan) writes:

> This isn't entirely correct.  The axiom of choice is equivalent to the
> property that every set can be well ordered (Zermelo, 1904).  Well
> ordering implies ordering.  Therefore you have to deny the axiom of
> choice to say that some set cannot be ordered.

But the axiom of choice is completely useless for our purposes!  You
can prove that every set has a well ordering, but there is no
procedure for constructing that well ordering.  So, for example,
nobody has ever constructed a well ordering for the set of real
numbers.

A nonconstructive existence proof is useful from the standpoint of
pure mathematics, but it has no value from the standpoint of designing
real-world algorithms for computer programs.

(And just to keep our feet on the ground, note that what we want isn't
even just an ordering over the set of all objects of a particular
type, but an ordering that can be computed within space and time
constraints: who wants to write a program where every comparision
between two objects requires a reference to a terrabyte-long lookup
table?  This makes the axiom of choice even less relevant for what
we're talking about.)
--

                               --matt

Author: rridge@calum.csclub.uwaterloo.ca (Ross Ridge)
Date: Fri, 30 Sep 1994 05:20:14 GMT Raw View

James Kanze US/ESC 60/3/164 #71425 <kanze@us-es.sel.de> wrote:
>A hashing function must map to an integral value within the range of
>table indexes.  This means that there are a finite number of elements
>in the target domain.  However, it is possible to hash strings (for
>example), although the set of all strings is infinite.

The set strings are only infinite in theory.

>The identity function may be used as a hashing function for integral
>values.

It's trivial to map a string to an integral value.

       Ross Ridge

--
 l/    Ross Ridge      //
[oo]   The Great HTMU, CSC President   [oo]
-()-         http://csclub.uwaterloo.ca/~rridge/  /()/
 db           +1 519 883 4329                             //

Author: rridge@calum.csclub.uwaterloo.ca (Ross Ridge)
Date: Fri, 30 Sep 1994 05:23:29 GMT Raw View

rridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:
> Sorry?  You can't have any more trivial a function than H(x) = x.

James Kanze US/ESC 60/3/164 #71425 <kanze@us-es.sel.de> wrote:
>And what does that buy us.  The identity function is neither a hash
>function, nor does it define a total ordering on the objects.

How does it not provide a total ordering on the objects?

      Ross Ridge

--
 l/    Ross Ridge      //
[oo]   The Great HTMU, CSC President   [oo]
-()-         http://csclub.uwaterloo.ca/~rridge/  /()/
 db           +1 519 883 4329                             //

Author: kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425)
Date: 30 Sep 1994 12:16:04 GMT Raw View

In article <CwxGB5.4GH@undergrad.math.uwaterloo.ca>
rridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:

|> rridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:
|> > Sorry?  You can't have any more trivial a function than H(x) = x.

|> James Kanze US/ESC 60/3/164 #71425 <kanze@us-es.sel.de> wrote:
|> >And what does that buy us.  The identity function is neither a hash
|> >function, nor does it define a total ordering on the objects.

|> How does it not provide a total ordering on the objects?

If the objects do not have total ordering (and I suspect that most
don't), then the identity function is not going to help.

Examples of objects without total ordering: complex, SetOf<T>,
PrintedCircuitBoard.  Note that with the exception of SetOf<T>, the
comparisons for inequality (<, etc.) are not defined for these.  In
the case of SetOf<T>, the inequalities are defined as subsets, so you
still cannot say that a<b <==> b<a (which is basically what total
ordering implies).  (Note that < means "is a strict subset of", <= is
the actual operator for "is a subset of".)
--
James Kanze      Tel.: (+33) 88 14 49 00     email: kanze@lts.sel.alcatel.de
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung

Author: doug@monet.ads.com (Doug Morgan)
Date: 30 Sep 94 14:24:01 Raw View

In article <MATT.94Sep30120652@physics16.berkeley.edu> matt@physics16.berkeley.edu (Matt Austern) writes:
|In article <DOUG.94Sep30083832@monet.ads.com> doug@monet.ads.com (Doug Morgan) writes:
|
|> This isn't entirely correct.  The axiom of choice is equivalent to the
|> property that every set can be well ordered (Zermelo, 1904).  Well
|> ordering implies ordering.  Therefore you have to deny the axiom of
|> choice to say that some set cannot be ordered.
|
|But the axiom of choice is completely useless for our purposes! ...

Instead of "Also, complex numbers can be lexicographically ordered by,
say, real part first then imaginary part..."  I should have said
"Seriously, though, complex numbers...."  Either that or give smiley
faces a try.

My only real point was that I agree with Kanze.

Doug
--------------------------------------------------------------------
Doug Morgan, doug@ads.com, (415) 960-7444
Advanced Decision Systems (a division of Booz-Allen & Hamilton Inc.)
1500 Plymouth St., Mountain View, CA 94043-1230
--------------------------------------------------------------------

Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Sat, 1 Oct 1994 04:51:38 GMT Raw View

Charles (Froggy) Fiterman <cef@geodesic.com> writes:

>Second if an class is connected to an Allocator can I
>build objects on the stack or statically. I suspect
>the answer is no.

Yes, you can.  The Allocator is used to allocate memory
that would otherwise be allocated with operator new,
but that doesn't prevent you from declaring the object
itself as a local or static variable.  For example, in

 main() {
  vector<allocator> foo;
 }

the memory to store the object `foo' will be allocated on the stack,
but the memory to store the elements of the vector will be allocated
using `allocator'.

--
Fergus Henderson - fjh@munta.cs.mu.oz.au

Author: doug@monet.ads.com (Doug Morgan)
Date: 30 Sep 94 08:38:32 Raw View

In article <KANZE.94Sep30131604@slsvhdt.us-es.sel.de> kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425) writes:
|rridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:
|...
|Examples of objects without total ordering: complex, SetOf<T>,
|PrintedCircuitBoard.

This isn't entirely correct. The axiom of choice is equivalent to the
property that every set can be well ordered (Zermelo, 1904). Well
ordering implies ordering. Therefore you have to deny the axiom of
choice to say that some set cannot be ordered. Such denial is
possible and is regularly done, but usually not lightly. Also,
complex numbers can be lexicographically ordered by, say, real part
first then imaginary part. If T's are ordered (and the hypothesis was
that all objects could be ordered), then finite SetOf<T>'s can be
ordered by sorting the sets and doing a sequence comparison. I'm not
sure about PrintedCircuitBoards, but I suspect that given a spec for
such objects it would be easy to design some ordering (not to say the
the ordering would be easy to compute).

That said, I entirely agree with your stand that hashing is both
useful and often easy to implement efficiently (and when it's not, so
what? --- software doesn't have to be easy). The complexity of hash
plus equality functions is often comparable to the complexity of
comparison functions (again, so what? --- the performance
characteristics of hash tables and trees are completely different,
even if the component functions may be somewhat similar). I fail to
understand any part of Ross' contention that hash tables are somehow
superceded by trees. They are simply different and each can be
unboundedly better than the other for different applications.

Doug
--------------------------------------------------------------------
Doug Morgan, doug@ads.com, (415) 960-7444
Advanced Decision Systems (a division of Booz-Allen & Hamilton Inc.)
1500 Plymouth St., Mountain View, CA 94043-1230
--------------------------------------------------------------------

Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Sat, 1 Oct 1994 15:10:34 GMT Raw View

jdp@polstra.com (John Polstra) writes:

>Pete Becker <pete@genghis.interbase.borland.com> wrote:
>> John Polstra <jdp@polstra.com> wrote:
>> >There are valid reasons why a programmer might choose to use a hash table
>> >rather than a tree-based container.  The language *standard* should not
>> >preclude making this kind of trade-off.
>>
>>  No, it shouldn't.

I think everyone agrees on this point.

>> And if STL is adopted as currently defined, it
>> won't. There is nothing in STL that prevents you from using hash tables.

Pete Becker is correct.  There is nothing in STL that *prevents* you from
using hash tables.  On the other hand, there is nothing in STL that makes
it easy either.  I think the real question is whether hash tables are
a sufficiently important data structure that they should be part of
the standard library.  And I think the answer is yes!

>This whole sub-thread is about associative containers, isn't it?  The
>STL specification prevents you from using hash tables to implement
>associative containers.

The STL specification doesn't prevent you from using hash tables to
implement associative containers, in the ordinary sense of the term.
It does prevent you from using hash tables to implement "STL
associative containers" (note: proper noun), but that is not
really a problem - you can use hash tables to implement say "hashed
associative containers".

The real question is not whether STL makes "hashed associative
containers" impossible, but whether "hashed associative containers"
should be part of the standard library or not.  To me, it seems
that hashed associative containers are an extremely useful data type,
and that they should be prime candidates for the standard library.  I
think the main reason that some people disagree is that they don't
think the benefit gained by providing hashed associative containers is
worth the additional complexity.

>Pardon my sloppy wording.  Let me try again:  STL decrees that all
>associative containers are ordered associative containers.

I agree that the attempted monopolization of the term "associative
containers" is a problem.  Rather than perpetuate that, lets
use the term "STL associative containers" for what STL calls
associative containers, and use "associative containers" for
the ordinary meaning of the term.

>> It provides several associative containers that are ordered.
>
>It goes beyond that.  It places requirements on associative containers
>which force them to be implemented as ordered containers.

Well, it forces "STL associative containers" to be implemented as
ordered containers, but that doesn't prevent you from defining your
own "hashed associative containers".

>You know, I can't believe that this is even controversial.  The
>distinction between unordered collections and ordered collections is
>nothing new.  Some algorithms need an ordered collection, and others
>don't.  By failing to distinguish between the two, and by promising that
>all associative containers will meet certain requirements, STL forces
>ordered associative containers to be used even when the ordering
>property is not needed.  As others have pointed out, this problem would
>not be very difficult to repair in the specification.  Why is there so
>much argument about it?

It's not really so much that STL failed to distinguish between the two,
but that STL only provided one of them (and gave it a misleading name ;-).
The decision is understandable - apart from efficiency concerns,
there's not really much need for unordered associtive collections.
The question is how much would the interface and implmentation be complicated
if the standard library were to provide hashed associative collections
as well as the existing (ordered) associative collections, and is it
worth it?  IMHO, the answers are "not much" and "definitely".
If hash tables are not part of the standard library, then everyone will
have to reinvent the wheel in order to get reasonable efficiency for
their programs.

--
Fergus Henderson - fjh@munta.cs.mu.oz.au

Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Sat, 1 Oct 1994 15:39:36 GMT Raw View

kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425) writes:

>fjh@munta.cs.mu.OZ.AU (Fergus Henderson) writes:
>
>|> Implementing any iterator at all is non-trivial but not that
>|> difficult.
>
>Are you taking into account the possibility of the element under the
>iterator being deleted when you say that an iterator is not that
>difficult?

No, I'm not.  STL states that certain operations can "invalidate"
iterators.  The effect of accessing an invalid iterator is undefined.
For STL associative containers, deletion of an element (using `erase')
invalidates any iterator referring to that element.

Note that hashed associative containers would need to have stricter
requirements in this respect - the specification should say that
insertion or deletion invalidates all iterators referring to elements
in the collection.

>I typically find that adding an iterator to a container
>doubles the amount of code.

I guess that's one reason that STL takes the easy way out by defining
that problem to be the responsibility of the user, not the library.

--
Fergus Henderson - fjh@munta.cs.mu.oz.au

Author: nagle@netcom.com (John Nagle)
Date: Sat, 24 Sep 1994 18:18:06 GMT Raw View

pete@genghis.interbase.borland.com (Pete Becker) writes:
>In article <nagleCwHnpC.76B@netcom.com>, John Nagle <nagle@netcom.com> wrote:
>>
>>      But this is a lesser issue.  The real problems revolve around
>>"off-the-end" values.  There's no way to test an iterator for being
>>"off-the-end" or "singular" once one is in one of those states.
>>
>>      "Singular" iterators seem unnecessary.  Do we really need to
>>deliberately replicate the semantics of uninitialized pointers?
>>That's what default constructors exist to prevent.  See the discussion
>>at the bottom of page 6.
>>
>>      There should be at least a predicate
>>
>> a.dereferenceable(p)
>>
>>where a is a container and p is an iterator pointing to that container.
>>"a.dereferenceable(p)" should return "true" when "*p" is meaningful.
>>(This might be called "valid" instead of "dereferenceable", but the STL
>>spec uses the term "dereferencable".) This is easy to implement (although
>>the implementation is different for different container types) and
>>doesn't imply any ovehead when not used.

> And in general it's O(n) to compute.

     Actually, you can do it in constant time.  For containers with
array-like implementations, it's just a range check.  For containers
with more complex data structures, the "past-the-end" values have to
be handled as special cases anyway, so there should be a way to check
for them.  "Singular" values can be handled by making sure the
constructors initialize iterators to either a valid value or something
comparable to "NULL".  Once we have that, we can assume that an iterator
that isn't identifiable as "off-the-end" or "NULL" is valid.

     (Are there "NULL" iterators?  That needs to be nailed down.  The
"singular" stuff suggests there are, but there's no way to test for them,
which makes them useless.  They probably should be explicitly ruled out.)

     The big problem with checking is that iterators need a pointer to
their collection if checking is to work. That adds little overhead but
makes iterators bigger than pointers.

>Seems like a pretty high price to pay to determine whether you're lost.
     That depends on the cost of being lost.  In this area, being lost
usually leads to total program failure for a difficult-to-diagnose
reason.  Of course, for some programs, total program failure isn't too
big a deal.

>>       Overall, I think the big mistake with STL was trying to replicate
>>C pointer semantics, instead of providing wrappers for the built-in
>>types with better semantics.
> This is really the crux of the issue, isn't it? If you disagree with
>the fundamental design assumption of STL, you'll never like it.

      He has a point.  I regard pointer arithmetic as a holdover from
Kernigan and Plauger's attempt to design a language that would go fast on
a small PDP-11 without having to write a global optimizer.  Most of the
newer class libraries avoid pointer arithmetic; Symantec's TCL,
Microsoft's OLE, and Taligent's published policy document all avoid
pointer arithmetic.  (At Taligent, you have to get written permission
from an "architect" to use it.)

      So until STL appeared, it looked like pointer arithmetic was on the
way out, which many people thought was a good thing.

      But if we have to have STL, it should have checking you can turn on.

     John Nagle

Author: pete@genghis.interbase.borland.com (Pete Becker)
Date: Sat, 24 Sep 1994 19:32:00 GMT Raw View

In article <nagleCwnB5o.zF@netcom.com>, John Nagle <nagle@netcom.com> wrote:
>pete@genghis.interbase.borland.com (Pete Becker) writes:
>>In article <nagleCwHnpC.76B@netcom.com>, John Nagle <nagle@netcom.com> wrote:
>>>pete@genghis.interbase.borland.com (Pete Becker) writes:
>>>>In article <nagleCwFtFK.G3r@netcom.com>, John Nagle <nagle@netcom.com> wrote:
>>>>>     Shouldn't storing into the object of a non-output iterator be an
>>>>>error?
>>>>>>Different containers have different rules about when erase and insert
>>>>>>invalidate interators.
>>>>>      Yes.  And those rules aren't documented.
>>>> On the contrary: they are clearly documented. See the discussion of
>>>>insert iterators.
>>>      Where? It's not in section 11.2.2, "Insert Iterators".  The pre- and
>>>post- condition table for sequences (table 10) does not document any
>>>invalidation conditions for iterators for "insert" operations.
>> In my copy of the STL specification, section 11.2.2 begins with the
>>following paragraph:
>> To make it possible to deal with insertion in the same way as
>> writing into an array, a special kind of iterator adaptors, (sic)
> ...
>
>     Actually, it turns out that Pete Becker is right about insert
>iterators, but the correct reference is section 8.2,

 More accurately, "another correct reference is section 8.2", something
that would be obvious to readers if the text included in my original
quotation had been kept intact in this paraphrase.

 -- Pete

Author: tmb@arolla.idiap.ch (Thomas M. Breuel)
Date: 24 Sep 1994 22:11:42 GMT Raw View

In article <CwGouu.8po@undergrad.math.uwaterloo.ca> r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:
|John Polstra <jdp@polstra.com> wrote:
|>So what?  Doubly-linked lists are more general than singly-linked lists,
|>but that doesn't mean I never want to use singly-linked lists.  There
|>are genuine and important performance considerations that you are
|>ignoring entirely:
|>
|>    * Searching a properly-sized hash table is O(1).  Searching a tree
|>      is O(log n).
|>
|>    * A tree has, minimum, 2 pointers worth of overhead for every node.
|>      In practice, it usually has more.  (For example, a Red-black tree
|>      needs a flag bit to distinguish the red nodes from the black ones.)
|>      In many applications, a hash table will have lower storage overhead
|>      than a tree.
|
|I'm not ingoring them.  The problem, with respect to STL, is that you
|can't guarantee any of this.

A standard library of collection classes _could_ guarantee this: there
is nothing processor or implementation specific that prevents you from
making guarantees about numbers of memory accesses, amounts of storage
needed for each item, number of constructor/destructor calls, nature
and cost of range and bounds checking, and number of memory allocator
calls.  Constants, not just complexities, are crucially important for
real-world applications.

The problem with STL is precisely that it doesn't seem to make such
guarantees.  It seems much more concerned with providing a vastly
general framework for iteration, something that at least to me is of
dubious value.

    Thomas.

Author: bs@alice.att.com (Bjarne Stroustrup <9758-26353> 0112760)
Date: Sat, 24 Sep 1994 21:48:36 GMT Raw View

John Nagle writes:

 >     He has a point.  I regard pointer arithmetic as a holdover from
 > Kernigan and Plauger's attempt to design a language that would go fast on
 > a small PDP-11 without having to write a global optimizer.

Huh?

Neither Kernighan nor Plauger were major contributors to C.
Pointer arithmetic was an integral part of C's ancestor BCPL
(designed by Martin Richards) as it was in almost all of the
languages designed at that time to provide a practical alternative
to assembler.

Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Sun, 25 Sep 1994 06:44:23 GMT Raw View

r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:

>James Kanze US/ESC 60/3/164 #71425 <kanze@us-es.sel.de> wrote:
>>
>>Then require the user to provide a global function `hash( Key const& )'.
>
>That's the problem.  The user has to find a good hash function for
>the particular hash table implementation being used.  This doesn't
>save the user a lot of effort.

I disagree.  Implementing a hash function is easy in comparison
to implementing a hash table (especially one with STL-compliant
iterators, etc.).

>>And maybe even something along the lines of the following:
>>
>> template< class T >
>> inline unsigned
>> hashElem( unsigned h , T const& obj )
>> {
>>     return 2047 * h + (unsigned)obj ;
>> }
>>
>>This makes it easy for the user to provide a good hash function; all
>>he has to do is be able to cut his object up into pieces, with each
>>piece convertable to an unsigned.
>
>I'm a bit skeptical about how well this would work.

A hash function doesn't have to be perfect to make hash tables more
efficient than red-black trees.

--
Fergus Henderson - fjh@munta.cs.mu.oz.au

Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Sun, 25 Sep 1994 06:59:40 GMT Raw View

r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:

>John Polstra <jdp@polstra.com> wrote:
>>
>>    * Searching a properly-sized hash table is O(1).  Searching a tree
>>      is O(log n).
[...]
>>      In many applications, a hash table will have lower storage overhead
>>      than a tree.
>
>I'm not ingoring them.  The problem, with respect to STL, is that you
>can't guarantee any of this.

What's wrong with saying "searching is linear time, assuming a
sufficiently random hash function"?

In any case, I don't think the guarantees are necessary.
The average-case behaviour (averaged over all vendors ;-) is important,
not just the worst-case behaviour.  Maybe we can't guarantee that
hash tables will be better than a tree-based implementation, but I
suspect that regardless of the lack of guarantees, hash tables
will be better in the important majority of real programs.

>>There are valid reasons why a programmer might choose to use a hash table
>>rather than a tree-based container.  The language *standard* should not
>>preclude making this kind of trade-off.
>
>It doesn't.  It leaves the programmer with probably the best way to
>create a hash table, use a custom built hash table designed to work
>with data being stored.

That may well be the way to create the *best hash tables*, but I don't
think it's the *best way* to create hash tables.  I think that in most
cases, it's more important to avoid reinventing the wheel.

>>> I also think hash tables are outside the scope of STL.  They are hard
>>> to iterate over and don't provide any guarantees about performance.
>>
>>They are trivial to iterate over.
>
>Not as trivial as any of the other iterators.

*Using* a hash table iterator *is* trivial.
Implementing the iterator isn't too hard, although your are right
that it is not trivial - but that's exactly why it should be part of
the standard library, rather than requiring users to implement it
themselves.

>You need to do a two
>level iteration, and again with no guarantees of performance.

Not true - for iteration, you can certainly guarantee linear performance.

--
Fergus Henderson - fjh@munta.cs.mu.oz.au

Author: eyala@applicom.co.il (Eyal Allalouf)
Date: Sun, 25 Sep 1994 08:14:44 GMT Raw View

Matt Austern (matt@physics16.berkeley.edu) wrote:
: In article <DAG.94Sep16081742@bellman.control.lth.se> dag@control.lth.se (Dag Bruck) writes:

: Well, it's not just a matter of syntax; there's also an important
: symantic question here.

: The question is this: if you have an iterator running through some
: sort of collection, and you want to access an element that the
: iterator is referring to, do you do this through an iterator member
: function, or through a collector member function that takes the
: iterator as an argument?  That is (using the most common syntactic
: choice for each mechanism), do you write *p, or A   i   ?  It's this
: distinction that I have in mind when I refer to the distinction
: between pointer and index symantics.

: Why does it matter?  Well, the most obvious answer is the distinction
: between const and non-const collections.

I have faced the same question       question two years ago when I invented my own wheel.
(I wrote a class library). The question of constness did'nt even occur to
me. I chose the approach taken here by STL. The idea was that the
iterator should be autonomous. Thus you have to carry around only one
object in order to access the container and not two (the iterator AND the
container). While not realizing the whole impact of the approach at first
it did seem to feel better. As an example for the possibilities consider
filtering an iterator. A filter is a criterion that either pass a member of
the container or rejects it. Now since the user accesses only one object
I could transparently present the filter as a kind of iterator.
The result is that the user manipulates iterators without having to
be concerened whether they are connected directly to some container.

Any suggestions as to this may be done be done using the A   i    approach
will be very interesting.

-----------------------------------------------------------------------------
    Eyal Alaluf                                eyala@applicom.co.il
    Mainsoft Israel Ltd.                       c/o Applicom Systems Ltd.
                                               16 Abba Hillel Silver St.
    Phone : 972 -(3) 575 5550 ext : 1278       Ramat-Gan, 52506
    Fax   : 972 -(3) 751 0906                  Israel
------------------------------------------------------------------------------

Author: eyala@applicom.co.il (Eyal Allalouf)
Date: Sun, 25 Sep 1994 08:46:13 GMT Raw View

: John Polstra <jdp@polstra.com> wrote:
: >There are valid reasons why a programmer might choose to use a hash table
: >rather than a tree-based container.  The language *standard* should not
: >preclude making this kind of trade-off.

Ross Ridge (r-ridge@calum.csclub.uwaterloo.ca) wrote:
: It doesn't.  It leaves the programmer with probably the best way to
: create a hash table, use a custom built hash table designed to work
: with data being stored.

It does not supply the user with the means to manage the hash table itself.
This management is not altogether trivial and its absence from the STL is
s problem.

: >> I also think hash tables are outside the scope of STL.  They are hard
: >> to iterate over and don't provide any guarantees about performance.
: >
: >They are trivial to iterate over.

: Not as trivial as any of the other iterators.  You need to do a two
: level iteration, and again with no guarantees of performance.

I've implemented a hash table with iteration. It is possible to implement
it with a constant time for accessing the next element no matter how sparse
the table is. (It does cost some memory overhead). I would not call that
implementation trivial though.

The problem STL has with hash tables (and with other data structures) is
that the algorithm is inseparable from the data structure. The STL takes the
approach that algorithms and data structures are orthogonal.
For instance, is it possible to supply an output iterator for an open
hash table?

-----------------------------------------------------------------------------
    Eyal Alaluf                                eyala@applicom.co.il
    Mainsoft Israel Ltd.                       c/o Applicom Systems Ltd.
                                               16 Abba Hillel Silver St.
    Phone : 972 -(3) 575 5550 ext : 1278       Ramat-Gan, 52506
    Fax   : 972 -(3) 751 0906                  Israel
------------------------------------------------------------------------------

Author: g2devi@cdf.toronto.edu (Robert N. Deviasse)
Date: Sun, 25 Sep 1994 14:31:01 GMT Raw View

In article <9426816.10230@mulga.cs.mu.oz.au>,
Fergus Henderson <fjh@munta.cs.mu.OZ.AU> wrote:
>r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:
>
>>James Kanze US/ESC 60/3/164 #71425 <kanze@us-es.sel.de> wrote:
>>>
>>>Then require the user to provide a global function `hash( Key const& )'.
>>
>>That's the problem.  The user has to find a good hash function for
>>the particular hash table implementation being used.  This doesn't
>>save the user a lot of effort.
>
>I disagree.  Implementing a hash function is easy in comparison
>to implementing a hash table (especially one with STL-compliant
>iterators, etc.).
>

I think that the whole problem with this discussion is that no-one has
written either a formal spec for a hash table template or created an
implementation. Because of this, it's very easy to claim that a hash
table template container can't be defined and implemented in a way that
would make it worthwhile to add to the STL, because this might very well
be the case!

Personally I don't *think* there would be any problems and I would like to
see it added to the STL, but until I *see* a formal spec and an
implementation, I'll remain skeptical.

>>>And maybe even something along the lines of the following:
>>>
>>> template< class T >
>>> inline unsigned
>>> hashElem( unsigned h , T const& obj )
>>> {
>>>     return 2047 * h + (unsigned)obj ;
>>> }
>>>
>>>This makes it easy for the user to provide a good hash function; all
>>>he has to do is be able to cut his object up into pieces, with each
>>>piece convertable to an unsigned.
>>
>>I'm a bit skeptical about how well this would work.
>
>A hash function doesn't have to be perfect to make hash tables more
>efficient than red-black trees.
>

True. Is there any problems with defining a range for the performance
measure? I mean, one could have a lower bound of O(1) for a perfect hash
and an upper bound of O(n) in the worst case. While O(n) might look bad,
the assumption is that the user of the hash table has taken the time to
analyse his/her problem domain well enough so that the O(1) value is
more realistic. There's no free lunch for hash tables, although a simple
CRC hash seems (in my experience at least) to be good for most string hash
tables.

Yes, the interval arithmetic does make the analysis harder, but if *someone*
really wants to have templated hash tables in the STL they'll do it, and
if a public implementation exists that meets these specs, it'll likely
likely become the defacto implementation. I personally don't seriously
expect compiler writers to try to re-write the publically available STL
implimentations -- they already have enough to worry about.

It might be naive in asking, but does anyone here already have a templated
hash table implementation, that is similar to the STL in style, that has
some analysis (albeit informal) already done on it? Does anyone have a
good (in the context of C++) abstract spec in the STL style for such a
container?

We might as well start discussing something concrete, even if it's only
an abstract spec.

>--
>Fergus Henderson - fjh@munta.cs.mu.oz.au

Take care
    Robert
--
/----------------------------------+------------------------------------------\
| Robert N. Deviasse               |"If we have to re-invent the wheel,       |
| EMAIL: g2devi@cdf.utoronto.ca    |  can we at least make it round this time"|
+----------------------------------+------------------------------------------/

Author: rfg@netcom.com (Ronald F. Guilmette)
Date: Sun, 25 Sep 1994 22:22:31 GMT Raw View

In article <TMB.94Sep23160653@arolla.idiap.ch> tmb@idiap.ch writes:
>|>Is there some fundamental reason (other than performance) for
>|>specifying "exponential doubling?"
>|
>|No, but performance *is* a fundamental reason.  Of course the performance
>|constraints should be more abstract and should not specify the implementation,
>|but in this case I think they effectively do anyway.
>
>Leaving implementations unspecified just for the heck of it, without a
>sound technical rationale, is a really bad idea...

I agree.  In fact I would go further and say that doing _anything_ (as
far as a language standard is concerned) without a sound technical
rationale is a really bad idea.

P.S.  For the benefit of those who are not already aware of it, there will
be _no_ rationale document for the ANSI/ISO C++ standard.  The first order
reason for this is that no one wanted to write one.  Determination of the
second order reason(s) (i.e. the reason(s) why no one wanted to write one)
is left as an exercize for the reader.

--

-- Ron Guilmette, Sunnyvale, CA ---------- RG Consulting -------------------
---- domain addr: rfg@netcom.com ----------- Purveyors of Compiler Test ----
---- uucp addr: ...!uunet!netcom!rfg ------- Suites and Bullet-Proof Shoes -

Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Mon, 26 Sep 1994 00:18:41 GMT Raw View

gnb@bby.com.au (Gregory Bond) writes:

>r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:
>
>   Essentially my point is the despite the fact hash tables don't require
>   an ordering of the objects being hashed they aren't more general than a
>   container class that requires an ordering of the objects.
>
>The STL is built on ordered relations; in fact on two relations only:
>== and <.  Any object for which those two relations are defined can be
>used in an STL container.   Hash containers are much _less_ general.

As I've said previously, the issue is not generality, it's efficiency.

>The problem with hash containers is defining the "hash" of an object.
>The only way to do it is to require an external hash function (which
>usually must be a friend function) for each type that is used in a
>hash container.  Why force this extra and somewhat non-natural
>requirement on the container user?

Who said anything about *forcing* users to use hash-table based
containers?  My suggestion is that the standard library ought to
provide *both* tree-based containers using ordered relations and
containers based on hash tables.  I think both are fundamental
data-types which should be part of the standard library of any decent
language.

>What if the object the programmer
>wants to put into the hashed container is a class over which she has
>no control, and hence cannot declare the hash function as a friend?

In that case, the hash function would have to be constructed using
only the public access functions of the object.  In some cases it might
not be possible to construct a reasonable hash function.  But you can
run into similar problem if a class doesn't define operator `<'.

>Bonus question:  Why is the following not a workable hash function
>(and I don't mean typos!)?
>
> template <class T>
> unsigned hash(const T& t) {
>  unsigned h = 0;
>  for (int i = 0; i < sizeof(T); i++) {
>   h = h * 31 ^ ((unsigned char *)&t)[i];
>  }
>  return h;
> }
>
>Hint: Purify users not eligible.

Because it doesn't necessarily satisfy `a == b implies hash(a) == hash(b)',
since the object might contain uninitialized or `mutable' members whose
contents varies even for objects which represent the "same" logical
entity.

--
Fergus Henderson - fjh@munta.cs.mu.oz.au

Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Mon, 26 Sep 1994 01:05:24 GMT Raw View

eyala@applicom.co.il (Eyal Allalouf) writes:

>I've implemented a hash table with iteration. It is possible to implement
>it with a constant time for accessing the next element no matter how sparse
>the table is. (It does cost some memory overhead). I would not call that
>implementation trivial though.

Implementing any iterator at all is non-trivial but not that
difficult.  However, making the iterator take linear time to iterate
through an entire collection and hence takes amortized constant time
for each step is indeed trivial - the obvious implementation satisfies
that.  It does not require any extra memory overhead.

>For instance, is it possible to supply an output iterator for an open
>hash table?

Sure, I don't see why not.

--
Fergus Henderson - fjh@munta.cs.mu.oz.au

Author: matt@physics16.berkeley.edu (Matt Austern)
Date: 26 Sep 1994 07:35:56 GMT Raw View

In article <CwowBq.H3n@cdf.toronto.edu> g2devi@cdf.toronto.edu (Robert N. Deviasse) writes:

> I think that the whole problem with this discussion is that no-one has
> written either a formal spec for a hash table template or created an
> implementation. Because of this, it's very easy to claim that a hash
> table template container can't be defined and implemented in a way that
> would make it worthwhile to add to the STL, because this might very well
> be the case!
>
> Personally I don't *think* there would be any problems and I would like to
> see it added to the STL, but until I *see* a formal spec and an
> implementation, I'll remain skeptical.

Well,...

It's quite possible that adding hash tables to the STL really would
present serious problems.  I'd like to see hash tables in a library of
collection classes that's part of the C++ standard; that doesn't
necessarily mean that I want to see them as part of the STL.

The whole point of the STL is that different data structures look
nearly identical, that the same sort of iteration is used for all of
them, and that algorithms (for things like searching, filtering, and
sorting) are defined and implemented independently of the underlying
data structure.  It's entirely possible that this sort of design would
make it very difficult to implement certain types of collections
within the STL framework.

The STL idea is interesting, and it's nice to know that you can change
from one data structure to another without changing your code in very
many places, but it does also complicate things.  In many ways, it
might have been simpler to provide a whole bunch of popular collection
classes (vectors, lists, hash tables, etc.) without trying to make
them look similar and without trying to relate them to one another.
--

                               --matt

Author: mat@mole-end.matawan.nj.us
Date: Mon, 26 Sep 1994 06:56:22 GMT Raw View

In article <nagleCwFtKH.GCp@netcom.com>, nagle@netcom.com (John Nagle) writes:
  ...
>      An interesting question is whether knowingly, willfully, and
> perhaps negligently adopting a standard which is unsafe can lead to
> civil or criminal liability if (when?) systems based on the standard
> fail in operation.
>
>      John Nagle

Remember, there is _no_ general programming language (power of a Turing
Machine) for which we can program in complete safety.  Quite apart from
the problems of combinatorial complexity, we have Goedel's undecidability
and the Halting problem.

I suggest you go off and read _To Engineer is Human_.
--
 (This man's opinions are his own.)
 From mole-end    Mark Terribile
 mat@mole-end.matawan.nj.us, Somewhere in Matawan, NJ
 (Training and consulting in C, C++, UNIX, etc.)

Author: Charles (Froggy) Fiterman <cef@geodesic.com>
Date: Mon, 26 Sep 1994 13:48:52 GMT Raw View

A few comments on STL.

First the copy() functions are the opposite direction
from the existing copy functions like strcpy() this
is a minor thing but makes things harder to learn and
teach. It looks like a source of user bugs.

Second if an class is connected to an Allocator can I
build objects on the stack or statically. I suspect
the answer is no.

Third the vector class has operations that reallocate.
"Reallocation invalidates all references, pointers,
and iterators refering to elements of the sequence."

This is unnesesary and wrong. There is an article in
Doctor Dobs showing that you start with x elements and
if you need more add another chunk x*2 long then x*4.
Since you rapidly get up to as many chunks as address space
will hold you need a small vector class to hold the
pointers. Access is in constant time.

Containers should not break pointers to things inside
them. It is a guaranteed source of bugs.

But in general I like STL. Pointers are wrapped in a
good way by Allocators and we can add garbage collection
on top easily. Further we can add garbage collection
in such a way that objects in different allocators can
point to each other.

Author: pete@genghis.interbase.borland.com (Pete Becker)
Date: Mon, 26 Sep 1994 17:33:52 GMT Raw View

In article <MATT.94Sep26003556@physics16.berkeley.edu>,
Matt Austern <matt@physics.berkeley.edu> wrote:
>In article <CwowBq.H3n@cdf.toronto.edu> g2devi@cdf.toronto.edu (Robert N. Deviasse) writes:
>
>> I think that the whole problem with this discussion is that no-one has
>> written either a formal spec for a hash table template or created an
>> implementation. Because of this, it's very easy to claim that a hash
>> table template container can't be defined and implemented in a way that
>> would make it worthwhile to add to the STL, because this might very well
>> be the case!
>>
>> Personally I don't *think* there would be any problems and I would like to
>> see it added to the STL, but until I *see* a formal spec and an
>> implementation, I'll remain skeptical.
>
>Well,...
>
>It's quite possible that adding hash tables to the STL really would
>present serious problems.  I'd like to see hash tables in a library of
>collection classes that's part of the C++ standard; that doesn't
>necessarily mean that I want to see them as part of the STL.
>
>The whole point of the STL is that different data structures look
>nearly identical, that the same sort of iteration is used for all of
>them, and that algorithms (for things like searching, filtering, and
>sorting) are defined and implemented independently of the underlying
>data structure.  It's entirely possible that this sort of design would
>make it very difficult to implement certain types of collections
>within the STL framework.

 I can't imagine why. All it takes to support searching and filtering
is an input iterator. That's straightforward to write for a hash table. As
to sorting, it seems to me that a hash table by its very nature defines its
own ordering, and that the notion of sorting a hash table is meaningless.
 -- Pete

Author: kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425)
Date: 26 Sep 1994 17:32:37 GMT Raw View

In article <9426816.10230@mulga.cs.mu.OZ.AU> fjh@munta.cs.mu.OZ.AU
(Fergus Henderson) writes:

|> r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:

|> >James Kanze US/ESC 60/3/164 #71425 <kanze@us-es.sel.de> wrote:

|> >>Then require the user to provide a global function `hash( Key const& )'.

|> >That's the problem.  The user has to find a good hash function for
|> >the particular hash table implementation being used.  This doesn't
|> >save the user a lot of effort.

|> I disagree.  Implementing a hash function is easy in comparison
|> to implementing a hash table (especially one with STL-compliant
|> iterators, etc.).

Just to back up what Fergus says, my generic hash table consists of
471 lines of code.  It includes a simple iterator (probably not
STL-compliant).  On the other hand, a typical hash function, even for
a complicated object, will rarely be over 10 lines.
--
James Kanze      Tel.: (+33) 88 14 49 00     email: kanze@lts.sel.alcatel.de
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung

Author: kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425)
Date: 26 Sep 1994 18:19:10 GMT Raw View

In article <9426911.14379@mulga.cs.mu.OZ.AU> fjh@munta.cs.mu.OZ.AU
(Fergus Henderson) writes:

|> eyala@applicom.co.il (Eyal Allalouf) writes:

|> >I've implemented a hash table with iteration. It is possible to implement
|> >it with a constant time for accessing the next element no matter how sparse
|> >the table is. (It does cost some memory overhead). I would not call that
|> >implementation trivial though.

|> Implementing any iterator at all is non-trivial but not that
|> difficult.  However, making the iterator take linear time to iterate
|> through an entire collection and hence takes amortized constant time
|> for each step is indeed trivial - the obvious implementation satisfies
|> that.  It does not require any extra memory overhead.

Are you taking into account the possibility of the element under the
iterator being deleted when you say that an iterator is not that
difficult?  I typically find that adding an iterator to a container
doubles the amount of code.
--
James Kanze      Tel.: (+33) 88 14 49 00     email: kanze@lts.sel.alcatel.de
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung

Author: kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425)
Date: 26 Sep 1994 18:39:06 GMT Raw View

In article <9426910.13167@mulga.cs.mu.OZ.AU> fjh@munta.cs.mu.OZ.AU
(Fergus Henderson) writes:

|> gnb@bby.com.au (Gregory Bond) writes:

|> >Bonus question:  Why is the following not a workable hash function
|> >(and I don't mean typos!)?
|> >
|> > template <class T>
|> > unsigned hash(const T& t) {
|> >  unsigned h = 0;
|> >  for (int i = 0; i < sizeof(T); i++) {
|> >   h = h * 31 ^ ((unsigned char *)&t)[i];
|> >  }
|> >  return h;
|> > }
|> >
|> >Hint: Purify users not eligible.

|> Because it doesn't necessarily satisfy `a == b implies hash(a) == hash(b)',
|> since the object might contain uninitialized or `mutable' members whose
|> contents varies even for objects which represent the "same" logical
|> entity.

Even more likely, T might contain pointers, and the data pointed to
might be relevant to the the "value" of T.  The most frequent case is
surely String.  If you use the libg++ implementation (a very simple
one), sizeof( String ) == 4 (on a Sun), since the String is a pointer,
and you are guaranteed that any two strings, regardless of value, have
different pointers.

And the reference to Purify is a red herring.

On the other hand, we're not total idiots, even if we are using C++.
--
James Kanze      Tel.: (+33) 88 14 49 00     email: kanze@lts.sel.alcatel.de
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung

Author: nagle@netcom.com (John Nagle)
Date: Mon, 26 Sep 1994 17:41:19 GMT Raw View

bs@alice.att.com (Bjarne Stroustrup <9758-26353> 0112760) writes:
>John Nagle writes:
> >     He has a point.  I regard pointer arithmetic as a holdover from
> > Kernigan and Plauger's attempt to design a language that would go fast on
> > a small PDP-11 without having to write a global optimizer.
>Huh?

>Neither Kernighan nor Plauger were major contributors to C.
     Sorry about that.
>Pointer arithmetic was an integral part of C's ancestor BCPL
>(designed by Martin Richards) as it was in almost all of the
>languages designed at that time to provide a practical alternative
>to assembler.

     I wish I still had the System 6 C manual, where the mapping of
"p++" to PDP-11 index arithmetic was discussed.  But I tossed it years ago.

     BCPL, sometimes called the British Crummy Programming Language
(no, that's not its real name) was about halfway between an assembler
and a compiler by modern standards.  The basic types were the types
of the machine, and the user was very aware of how wide the machine's
registers were.  I don't think BCPL had structures.  So you were
stuck with pointer arithmetic as a basic primitive.

     The most widely used program in BCPL is probably the Amiga OS
kernel, which is really Cambridge Tripos.  It's been noted for years
for having really obscure out-of-range type bugs.  (Anyone remember
when calling Delay with an argument of 0 overwrote track 42 on the
currently mounted floppy?)

      Pointer arithmetic made sense back then when languages were much
closer to the machine and didn't provide much hiding.  C++ hides too much
for pointer arithmetic to be a sound part of programming.  And, it seems,
pointer arithmetic has been appearing less in commercial C++ code over the
last few years.  Until now, with STL.

      And it's so unnecessary.  Containers don't need pointer arithmetic.
There have been many container implementations for C++.  Most suffer from
being pre-template, which meant some horrible hack involving either
violating type safety or huge macros was necessary to make them work.
But the architecture of some of them could be carried forward into
templates without much trouble.

      Probably Microsoft will update MFC to use template-based MFC-type
collections, Symantec and Borland will follow MFC, and everybody
will use that.

     John Nagle

     John Nagle

     John Nagle

Author: nagle@netcom.com (John Nagle)
Date: Mon, 26 Sep 1994 17:44:33 GMT Raw View

mat@mole-end.matawan.nj.us writes:

>In article <nagleCwFtKH.GCp@netcom.com>, nagle@netcom.com (John Nagle) writes:
>  ...
>>      An interesting question is whether knowingly, willfully, and
>> perhaps negligently adopting a standard which is unsafe can lead to
>> civil or criminal liability if (when?) systems based on the standard
>> fail in operation.
>>
>>      John Nagle

>Remember, there is _no_ general programming language (power of a Turing
>Machine) for which we can program in complete safety.  Quite apart from
>the problems of combinatorial complexity, we have Goedel's undecidability
>and the Halting problem.

      Wrong.  The halting problem doesn't apply to machines with finite
memory.  Also see my paper "Practical Program Verification" in ACM POPL '83.

>I suggest you go off and read _To Engineer is Human_.

      I have.  I'm arguing here for bigger safety margins.

     John Nagle

Author: kohtala@hardy.trs.ntc.nokia.com (Kohtala Marko)
Date: 26 Sep 1994 18:43:09 GMT Raw View

In article <CwLLp9.9HK@borland.com> pete@genghis.interbase.borland.com (Pete Becker) writes:
>  In article <KOHTALA.94Sep23015605@hardy.trs.ntc.nokia.com>,
>  Kohtala Marko <Marko.Kohtala@ntc.nokia.com> wrote:
...
>  >Correct me if I am wrong, but aren't hash tables just fine as an
>  >implementation for an associative container if the container is just
>  >ordered in some way according to the hashing method, whatever that is?
...
>     I don't think it's quite that simple. If I understand this correctly,
>  the argument is that a hash table is a linear container, and we can use that
>  linearity to define an ordering, and treat it as an ordered container. The
>  first problem I see is that even with the same hash function, the order of
>  the elements in the hash table depends on the size of the table.
...
> This doesn't
>  sound to me like something I'd like to refer to as an ordered container.
>   -- Pete

So it is a question of the definition of 'ordered collection'?

Ok, I read the STL again and it STL says that the Comparison object
determines the ordering relation, not any comparison operation on the
keys. This sounds fine, nothing against hash tables, but then it also
says that only the comparison object the container was constructed
from is available later. That is, if the comparison relation changes,
you can not get to know about it.

Also, your point about adding and removing elements leads to problems
elsewhere: STL says `insert does not affect the validity of iterators
and references to the (associative) container, and erase invalidates
only the iterators and references to the erased elements.'  This can
limit the number of usable implementations.

--
---
Marko Kohtala - Marko.Kohtala@ntc.nokia.com, Marko.Kohtala@hut.fi

Author: rridge@calum.csclub.uwaterloo.ca (Ross Ridge)
Date: Tue, 27 Sep 1994 02:03:33 GMT Raw View

In article <TMB.94Sep22095454@arolla.idiap.ch>,
Thomas M. Breuel <tmb@idiap.ch> wrote:
>In article <CwEHC0.1qr@undergrad.math.uwaterloo.ca> r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:
>|>From a practical point of view, I believe that it is possible to
>|>define an arbitrary complete ordering for just about anything.
>|
>|This is what I mean.  Just use the hashing function F, where F(x) = x,
>|and use the result to order the values.
>
>Except that a hashing function isn't required to be a 1-1 mapping!

If a hashing function exists then there's a hashing function, F(x) = x,
that does provide a 1-1 mapping.

>|Essentially my point is the despite the fact hash tables don't require
>|an ordering of the objects being hashed they aren't more general than a
>|container class that requires an ordering of the objects.
>
>Nobody claimed hash tables were more general.

Actually, it was claimed in another thread, I'm sort of belatedly
responding to it now.

>They are simply considerably more efficient.

When implemented properly, yes.  I don't see it working well in
a general purpose library.

      Ross Ridge

--
 l/    Ross Ridge      //
[oo]   The Great HTMU, CSC President   [oo]
-()-         http://csclub.uwaterloo.ca/~rridge/  /()/
 db           +1 519 883 4329                             //

Author: rridge@calum.csclub.uwaterloo.ca (Ross Ridge)
Date: Tue, 27 Sep 1994 02:14:53 GMT Raw View

James Kanze US/ESC 60/3/164 #71425 <kanze@us-es.sel.de> wrote:
>But the solution of using a balanced tree, or some such, requires the
>user to define a total ordering on the objects.  This is no less
>difficult than finding a good hash function (more difficult given the
>below).

Sorry?  You can't have any more trivial a function than H(x) = x.

>It works very well, in fact.

Will it work well with my hash table implementation?

       Ross Ridge
--
 l/    Ross Ridge      //
[oo]   The Great HTMU, CSC President   [oo]
-()-         http://csclub.uwaterloo.ca/~rridge/  /()/
 db           +1 519 883 4329                             //

Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Tue, 27 Sep 1994 16:07:21 GMT Raw View

pete@genghis.interbase.borland.com (Pete Becker) writes:

>Matt Austern <matt@physics.berkeley.edu> wrote:
>>g2devi@cdf.toronto.edu (Robert N. Deviasse) writes:
>>
>>> I think that the whole problem with this discussion is that no-one has
>>> written either a formal spec for a hash table template or created an
>>> implementation.

Yes, I agree.

>>It's quite possible that adding hash tables to the STL really would
>>present serious problems.
[...]
> I can't imagine why. All it takes to support searching and filtering
>is an input iterator. That's straightforward to write for a hash table. As
>to sorting, it seems to me that a hash table by its very nature defines its
>own ordering, and that the notion of sorting a hash table is meaningless.

Yes, or to put it more technicially: sorting requires a Random Access
Iterator, whereas hash tables would only provide an Input Iterator and
an Output Iterator.

--
Fergus Henderson - fjh@munta.cs.mu.oz.au

Author: kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425)
Date: 27 Sep 1994 18:52:46 GMT Raw View

In article <Cwrn1x.BMA@undergrad.math.uwaterloo.ca>
rridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:

|> In article <TMB.94Sep22095454@arolla.idiap.ch>,
|> Thomas M. Breuel <tmb@idiap.ch> wrote:
|> >In article <CwEHC0.1qr@undergrad.math.uwaterloo.ca> r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:
|> >|>From a practical point of view, I believe that it is possible to
|> >|>define an arbitrary complete ordering for just about anything.
|> >|
|> >|This is what I mean.  Just use the hashing function F, where F(x) = x,
|> >|and use the result to order the values.
|> >
|> >Except that a hashing function isn't required to be a 1-1 mapping!

|> If a hashing function exists then there's a hashing function, F(x) = x,
|> that does provide a 1-1 mapping.

The identity function is *not* (usually) a hashing function.  It is in
fact easy to prove that there are cases where a hashing function with
a 1-1 mapping cannot exist.

A hashing function must map to an integral value within the range of
table indexes.  This means that there are a finite number of elements
in the target domain.  However, it is possible to hash strings (for
example), although the set of all strings is infinite.

The identity function may be used as a hashing function for integral
values.  In this case, however, why use a hash table.  You can use
integers as an index into a classical array, which is even faster than
hash coding.
--
James Kanze      Tel.: (+33) 88 14 49 00     email: kanze@lts.sel.alcatel.de
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung

Author: kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425)
Date: 27 Sep 1994 19:01:34 GMT Raw View

In article <CwrnKt.C3s@undergrad.math.uwaterloo.ca>
rridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:

|> James Kanze US/ESC 60/3/164 #71425 <kanze@us-es.sel.de> wrote:
|> >But the solution of using a balanced tree, or some such, requires the
|> >user to define a total ordering on the objects.  This is no less
|> >difficult than finding a good hash function (more difficult given the
|> >below).

|> Sorry?  You can't have any more trivial a function than H(x) = x.

And what does that buy us.  The identity function is neither a hash
function, nor does it define a total ordering on the objects.

|> >It works very well, in fact.

|> Will it work well with my hash table implementation?

Supposing you are refering to my proposed general hash function (since
so much has been cut, it is difficult to know):  it will work well
with all hash table implementations which do not require multiple
hashing.  It would be simple to extend it to support multiple hashing,
but this would complicate the interface, and I haven't needed multiple
hashing yet.  (I generally use an open hash table, and make it big
enough that I don't have to worry about clustering.)
--
James Kanze      Tel.: (+33) 88 14 49 00     email: kanze@lts.sel.alcatel.de
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung

Author: simon@sco.COM (Simon Tooke)
Date: Wed, 28 Sep 1994 14:04:38 GMT Raw View

In <rfgCwpI5J.37D@netcom.com> rfg@netcom.com (Ronald F. Guilmette) writes:

>In article <TMB.94Sep23160653@arolla.idiap.ch> tmb@idiap.ch writes:
>>|>Is there some fundamental reason (other than performance) for
>>|>specifying "exponential doubling?"
>>|
>>|No, but performance *is* a fundamental reason.  Of course the performance
>>|constraints should be more abstract and should not specify the implementation,
>>|but in this case I think they effectively do anyway.
>>
>>Leaving implementations unspecified just for the heck of it, without a
>>sound technical rationale, is a really bad idea...

>I agree.  In fact I would go further and say that doing _anything_ (as
>far as a language standard is concerned) without a sound technical
>rationale is a really bad idea.

>P.S.  For the benefit of those who are not already aware of it, there will
>be _no_ rationale document for the ANSI/ISO C++ standard.  The first order
>reason for this is that no one wanted to write one.  Determination of the
>second order reason(s) (i.e. the reason(s) why no one wanted to write one)
>is left as an exercize for the reader.

Incorrect.  Many people (including myself) wanted to write one.
The problem was time constraints.

-simon tooke

===============================================================================
Simon Tooke  (not speaking for) SCO Canada, Inc.         Voice:  (416) 960-4041
....!scocan!simon             simon@sco.com                Fax:  (416) 922-2704
130 Bloor St. West. Suite 1001, Toronto, Ontario, Canada  M5S 1N5

Author: pete@genghis.interbase.borland.com (Pete Becker)
Date: Thu, 22 Sep 1994 23:13:25 GMT Raw View

In article <KANZE.94Sep22154411@slsvhdt.us-es.sel.de>,
James Kanze US/ESC 60/3/164 #71425 <kanze@us-es.sel.de> wrote:
>In article <CwHuyr.94v@borland.com> pete@genghis.interbase.borland.com
>(Pete Becker) writes:
>
>|>  STL does not prevent you from using hash tables to implement
>|> associative containers. It prevents you from claiming that such an
>|> implementation conforms to STL's specification.
>
>    [...]
>
>|>  STL "decrees" nothing. It SPECIFIES how its containers operate. You
>|> are free to use containers that operate differently if you want to.
>
>But in article <35po1l$bda@seattle.polstra.com> jdp@polstra.com (John
>Polstra) wrote:
>
>|> It goes beyond that.  It places requirements on associative containers
>|> which force them to be implemented as ordered containers.  Permit me to
>|> quote from the most recent (August 18, 1994) version of the STL
>|> specification, Section 8.2 "Associative Containers":
>
>|>     The fundamental property of iterators of associative containers is
>|>     that they iterate through the containers in the non-descending order
>|>     of keys where non-descending is defined by the comparison that was
>|>     used to construct them. [page 32]
>
>|>     All of them [associative containers] are parameterized on Key and an
>|>     ordering relation Compare that induces a total ordering on elements
>|>     of Key. [page 30]
>
>|> Also, look at the "Associative container requirements", Table 12, page 32.
>|> The following operations are _required_ for all associative containers:
>
>|>     a.lower_bound(k)
>|>  Returns an iterator pointing to the first element with key not
>|>  less than k.  Complexity: logarithmic.
>
>|>     a.upper_bound(k)
>|>  Returns an iterator pointing to the first element with key
>|>  greater than k.  Complexity: logarithmic.
>
>I think that John's point is valid.  STL doesn't just provide
>containers.  As I understand it, it provides a number of template
>functions to manipulate containers (and their elements).  I am a bit
>disappointed that it doesn't implement hash coding (which I generally
>prefer over tree structures), but I can accept it.  I find it much
>more serious that it cannot manipulate such containers with its
>template functions.  This isn't just saying, I don't offer hash
>tables; this is saying, I don't want you to use hash tables.

 I don't see the problem. There are, of course, some algorithms that
shouldn't be applied to hash tables. quick_sort() for one, and
random_shuffle() for antoher. But aside from that issue of appropriateness,
if a hash table provides an iterator that meets the requirements of STL,
the STL algorithms can be used on that hash table. Off the top of my head,
it seems straightforward to provide a bidirectional iterator, although it
should probably be const in all cases, since a hash table, like a sorted
container, has its own idea of where new items should go. But once you've
got a bidirectional iterator, you can copy the contents of the hash table,
look for items that meet certain criteria, etc., all using STL's implementation
of its algorithms.

>
>Wouldn't it be preferable to define two types of associative
>containers, ordered and unordered.  Although the names sound
>exclusive, I think that we can agree that the ordered associative
>containers subsumes the unordered ones.  (Every ordered container is
>also an unordered one.)  The STL is then free to offer only the
>ordered ones, but for the functions where an unordered one makes
>sense, I can use one of my own (or Booch's, for example).

 I don't see the point. Any container you want to write can be made
to work with STL's algorithms, provided you can implement the appropriate
iterators. This is independent of whether its sorted or unsorted, associative,
or whatever. If you can supply the iterator, you can use the algorithm.
 -- Pete

Author: tmb@arolla.idiap.ch (Thomas M. Breuel)
Date: 23 Sep 1994 14:06:53 GMT Raw View

|>Is there some fundamental reason (other than performance) for
|>specifying "exponential doubling?"
|
|No, but performance *is* a fundamental reason.  Of course the performance
|constraints should be more abstract and should not specify the implementation,
|but in this case I think they effectively do anyway.

Leaving implementations unspecified just for the heck of it, without a
sound technical rationale, is a really bad idea.  Probably,
programmers that do that have misunderstood something in their S/E
classes.  The only reason for leaving implementations unspecified is
if there is a reasonable expectations that the permissible choices
will allow the implementor to deliver substantially better performance
in different environments.  For generic collection classes (on serial
machines), that kind of expectation is unreasonable.

And, of course, specifying implementations doesn't prevent you from
also having abstract interfaces: you may still give me ten different
implementations of stacks, all with the same abstract interface, but
each also with minutely specified implementations.

    Thomas.

Author: pete@genghis.interbase.borland.com (Pete Becker)
Date: Fri, 23 Sep 1994 16:35:01 GMT Raw View

In article <1994Sep23.103449.7701@tid.es>,
Pascual Juan <pascual@peggy.tid.es> wrote:
>In article <CwG8wK.Ax0@borland.com>, pete@genghis.interbase.borland.com (Pete Becker) writes:
>|> In article <nagleCwFtFK.G3r@netcom.com>, John Nagle <nagle@netcom.com> wrote:
>|> >[...]
>|> >      Yes.  And those rules aren't documented.
>|>
>|>  On the contrary: they are clearly documented. See the discussion of
>|> insert iterators. The iterator you get from a container when you call begin()
>|> overwrites. If you want to insert instead, create an insert iterator from the
>|> iterator that the container gives you.
>
>I think STL has a very good design indeed, and I think it has been a good
>decision to include it on the standard, but it is far to be *clearly*
>documented. Its documentation is a *formal* (almost mathematical) description
>about the desired behaviour, not as a programmer's guide. I guess it will
>soon appear lots of more didactic documentation.
>

 That's true: the standard will not be a programmer's guide, just as
the ANSI and ISO C standards are not programmer's guides. And, yes, there
will be quite a bit written about the new language features, just as there
was when the C standard was developed.
 -- Pete

Author: pete@genghis.interbase.borland.com (Pete Becker)
Date: Fri, 23 Sep 1994 19:48:45 GMT Raw View

In article <KOHTALA.94Sep23015605@hardy.trs.ntc.nokia.com>,
Kohtala Marko <Marko.Kohtala@ntc.nokia.com> wrote:
>In article <CwHuyr.94v@borland.com> pete@genghis.interbase.borland.com (Pete Becker) writes:
>
>>  STL does not prevent you from using hash tables to implement
>> associative containers. It prevents you from claiming that such an
>> implementation conforms to STL's specification.
>
>I seem to have missed anything that supports this last claim.
>
>Correct me if I am wrong, but aren't hash tables just fine as an
>implementation for an associative container if the container is just
>ordered in some way according to the hashing method, whatever that is?
>Reading the requirements it seems that with some hashing methods only
>problem is providing a type X::key_compare which properly tells the
>order in which two items are found in the container when it is
>iterated through. If that is the problem, do not use that hashing
>method!
>


 I don't think it's quite that simple. If I understand this correctly,
the argument is that a hash table is a linear container, and we can use that
linearity to define an ordering, and treat it as an ordered container. The
first problem I see is that even with the same hash function, the order of
the elements in the hash table depends on the size of the table. For example,
if the hash function returns values from 0 to 99, a table of size 10 will
typically map an object whose hash value is 9 to its 9th slot, and an object
whose hash value is 10 to its 0th slot. If the size of the table is 20, these
go to 9 and 10 instead, so their order has changed. Which means that the
ordering relationship depends on the details of the table in addition to the
details of the objects being put into the table. Thus, we don't have an
ordering of objects, but of objects relative to a particular table. And if
the table is resized, the ordering of the objects changes. This doesn't
sound to me like something I'd like to refer to as an ordered container.
 -- Pete

Author: pascual@peggy.tid.es (Pascual Juan)
Date: Fri, 23 Sep 1994 10:34:49 GMT Raw View

In article <CwG8wK.Ax0@borland.com>, pete@genghis.interbase.borland.com (Pete Becker) writes:
|> In article <nagleCwFtFK.G3r@netcom.com>, John Nagle <nagle@netcom.com> wrote:
|> >[...]
|> >      Yes.  And those rules aren't documented.
|>
|>  On the contrary: they are clearly documented. See the discussion of
|> insert iterators. The iterator you get from a container when you call begin()
|> overwrites. If you want to insert instead, create an insert iterator from the
|> iterator that the container gives you.

I think STL has a very good design indeed, and I think it has been a good
decision to include it on the standard, but it is far to be *clearly*
documented. Its documentation is a *formal* (almost mathematical) description
about the desired behaviour, not as a programmer's guide. I guess it will
soon appear lots of more didactic documentation.

--
-------------------------------------------------------------------------------
 ||||  ##     ####                    |         Pascual Juan         |    _V_
 ||||  ## _|_ ## ##   Telefonica I+D  | E-mail: pascual@gonzo.tid.es |  _(,|,)`
 @@@@  ##  |  ## ##  (Telefonica R&D) |    Phone: +34-1-337-47-04    | | ___ ')
 @@@@  ##     ####                    |    fax:   +34-1-337-42-22    | |_|`__/
-------------------------------------------------------------------------------

Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Sat, 24 Sep 1994 07:25:29 GMT Raw View

tmb@arolla.idiap.ch (Thomas M. Breuel) writes:

>... if there is going to be a high-quality, free, portable
>implementation that every vendor can use, I don't see the point of
>mandating STL in the standard.

I don't think there was any realistic likelihood of having a
high-quality, free, portable implementation of STL until *after* the
proposal to incorporate STL in the standard.  I'm sure that was what
motivated Mike Stump to start a free implementation, and I don't think
HP would have made their implementation free if STL had been rejected
by the ANSI/ISO C++ committee.  And if the C++ committee had rejected
STL, I don't think many vendors would use it.

--
Fergus Henderson - fjh@munta.cs.mu.oz.au

Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Sat, 24 Sep 1994 07:26:56 GMT Raw View

r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:

>Essentially my point is the despite the fact hash tables don't require
>an ordering of the objects being hashed they aren't more general than a
>container class that requires an ordering of the objects.

I don't think the issue is generality - it's efficiency.

--
Fergus Henderson - fjh@munta.cs.mu.oz.au

Author: nagle@netcom.com (John Nagle)
Date: Sat, 24 Sep 1994 17:56:12 GMT Raw View

pete@genghis.interbase.borland.com (Pete Becker) writes:
>In article <nagleCwHnpC.76B@netcom.com>, John Nagle <nagle@netcom.com> wrote:
>>pete@genghis.interbase.borland.com (Pete Becker) writes:
>>>In article <nagleCwFtFK.G3r@netcom.com>, John Nagle <nagle@netcom.com> wrote:
>>>>     Shouldn't storing into the object of a non-output iterator be an
>>>>error?
>>>>>Different containers have different rules about when erase and insert
>>>>>invalidate interators.
>>>>      Yes.  And those rules aren't documented.
>>> On the contrary: they are clearly documented. See the discussion of
>>>insert iterators.
>>      Where? It's not in section 11.2.2, "Insert Iterators".  The pre- and
>>post- condition table for sequences (table 10) does not document any
>>invalidation conditions for iterators for "insert" operations.
> In my copy of the STL specification, section 11.2.2 begins with the
>following paragraph:
> To make it possible to deal with insertion in the same way as
> writing into an array, a special kind of iterator adaptors, (sic)
 ...

     Actually, it turns out that Pete Becker is right about insert
iterators, but the correct reference is section 8.2, "Associative
iterators", paragraph 6: "insert does not affect the validity of
iterators and references to the container".

     This is something you have to be careful about when writing
STL classes.  Replacement and insertion can't move objects.  So
Smalltalk-like collection implementations, with arrays, hashing, and
object moving on array expansion, are prohibited for STL.  This
is OK as long as everybody knows it.

     One assumes that the "mutating sequence operations" also don't
affect the validity of iterators and references to the container, but
the STL spec doesn't say.  That should be nailed down.

     John Nagle

Author: r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge)
Date: Wed, 21 Sep 1994 04:01:30 GMT Raw View

James Kanze US/ESC 60/3/164 #71425 <kanze@us-es.sel.de> wrote:
>What's wrong with simply:
>
> template< class Key , class Value >
> class AssocArray ...
>
>Then require the user to provide a global function `hash( Key const& )'.

That's the problem.  The user has to find a good hash function for
the particular hash table implementation being used.  This doesn't
save the user a lot of effort.

>And maybe even something along the lines of the following:
>
> template< class T >
> inline unsigned
> hashElem( unsigned h , T const& obj )
> {
>     return 2047 * h + (unsigned)obj ;
> }
>
>This makes it easy for the user to provide a good hash function; all
>he has to do is be able to cut his object up into pieces, with each
>piece convertable to an unsigned.

I'm a bit skeptical about how well this would work.

       Ross Ridge

--
 l/    Ross Ridge      //
[oo]   The Great HTMU, CSC President   [oo]
-()-        http://csclub.uwaterloo.ca/u/r-ridge/  /()/
 db           +1 519 883 4329                             //

Author: r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge)
Date: Wed, 21 Sep 1994 04:08:54 GMT Raw View

Ross Ridge <r-ridge@calum.csclub.uwaterloo.ca> wrote:
> Considering the fact that anything you can hash you can
> order, I don't think STL is missing anything crucial.  Put another way,
> ordered collections are more general purpose than hash tables.

John Polstra <jdp@polstra.com> wrote:
>So what?  Doubly-linked lists are more general than singly-linked lists,
>but that doesn't mean I never want to use singly-linked lists.  There
>are genuine and important performance considerations that you are
>ignoring entirely:
>
>    * Searching a properly-sized hash table is O(1).  Searching a tree
>      is O(log n).
>
>    * A tree has, minimum, 2 pointers worth of overhead for every node.
>      In practice, it usually has more.  (For example, a Red-black tree
>      needs a flag bit to distinguish the red nodes from the black ones.)
>      In many applications, a hash table will have lower storage overhead
>      than a tree.

I'm not ingoring them.  The problem, with respect to STL, is that you
can't guarantee any of this.

>There are valid reasons why a programmer might choose to use a hash table
>rather than a tree-based container.  The language *standard* should not
>preclude making this kind of trade-off.

It doesn't.  It leaves the programmer with probably the best way to
create a hash table, use a custom built hash table designed to work
with data being stored.

>> I also think hash tables are outside the scope of STL.  They are hard
>> to iterate over and don't provide any guarantees about performance.
>
>They are trivial to iterate over.

Not as trivial as any of the other iterators.  You need to do a two
level iteration, and again with no guarantees of performance.

>The only reason they would be hard to iterate over under STL is
>because STL decrees that all containers are ordered containers.

No.

       Ross Ridge

--
 l/    Ross Ridge      //
[oo]   The Great HTMU, CSC President   [oo]
-()-        http://csclub.uwaterloo.ca/u/r-ridge/  /()/
 db           +1 519 883 4329                             //

Author: pete@genghis.interbase.borland.com (Pete Becker)
Date: Tue, 20 Sep 1994 23:14:48 GMT Raw View

In article <35n633$6p1@seattle.polstra.com>,
John Polstra <jdp@polstra.com> wrote:
>In article <CwB1D7.70F@undergrad.math.uwaterloo.ca>,
>Ross Ridge <r-ridge@calum.csclub.uwaterloo.ca> wrote:
>> Considering the fact that anything you can hash you can
>> order, I don't think STL is missing anything crucial.  Put another way,
>> ordered collections are more general purpose than hash tables.
>
>So what?  Doubly-linked lists are more general than singly-linked lists,
>but that doesn't mean I never want to use singly-linked lists.  There
>are genuine and important performance considerations that you are
>ignoring entirely:
>
>    * Searching a properly-sized hash table is O(1).  Searching a tree
>      is O(log n).
>
>    * A tree has, minimum, 2 pointers worth of overhead for every node.
>      In practice, it usually has more.  (For example, a Red-black tree
>      needs a flag bit to distinguish the red nodes from the black ones.)
>      In many applications, a hash table will have lower storage overhead
>      than a tree.
>
>There are valid reasons why a programmer might choose to use a hash table
>rather than a tree-based container.  The language *standard* should not
>preclude making this kind of trade-off.
>

 No, it shouldn't. And if STL is adopted as currently defined, it
won't. There is nothing in STL that prevents you from using hash tables.
The issue here is not preclusion, but simplification. If STL provides hash
tables they become easier to use.

>> I also think hash tables are outside the scope of STL.  They are hard
>> to iterate over and don't provide any guarantees about performance.
>
>They are trivial to iterate over.  The only reason they would be hard
>to iterate over under STL is because STL decrees that all containers are
>ordered containers.

 STL does not "decree that all containers are ordered containers". It
provides several sequence containers that are not ordered unless you choose
to sort them. It provides several associative containers that are ordered.
 -- Pete

Author: pete@genghis.interbase.borland.com (Pete Becker)
Date: Tue, 20 Sep 1994 22:24:19 GMT Raw View

In article <nagleCwFtFK.G3r@netcom.com>, John Nagle <nagle@netcom.com> wrote:
>r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:
>>dag@control.lth.se (Dag Bruck) writes:
> [Wrong.  I (Nagle) wrote the paragraph below.]
>>>      One of the basic problems with pointers in C is that they are used
>>>both to refer to arrays and to elements of arrays.  STL iterators
>>>generalize this ambiguity to collections other than arrays.  This
>>>creates some serious problems.  Does an iterator point to an object in
>>>a container or to a slot in which an object goes?  If an iterator points
>>>to an object in a container, and you replace that object with another,
>>>does the iterator now refer to the old object or the new one?
>
>>It depends on how it's done.  If you do "*iter = x" and iter isn't a
>>output iter, then iter still points to the same place after the
>>assignment.
>     Shouldn't storing into the object of a non-output iterator be an
>error?  In some cases, this will be a compile-time error, because
>"*iter" may be const.  The STL spec is totally silent on error checking,
>an omission which will probably cause many thousands of program crashes
>over the next few years.
>

 No. Attempting to store into a const iterator should be an error.
An output iterator is clearly defined in the STL documentation. It would
make this discussion much clearer if people took the time to read the
documentation, so that they knew what these terms mean rather than guessing.

>>If you replace it by erasing it and inserting something in
>>it's place then any iterators pointing to the container may not be
>>valid anymore.
>
>>>Is that behavior the same for all containers?
>>Different containers have different rules about when erase and insert
>>invalidate interators.
>      Yes.  And those rules aren't documented.

 On the contrary: they are clearly documented. See the discussion of
insert iterators. The iterator you get from a container when you call begin()
overwrites. If you want to insert instead, create an insert iterator from the
iterator that the container gives you.
 -- Pete

Author: hall_j@sat.mot.com (Joseph Hall)
Date: Tue, 20 Sep 1994 23:30:38 GMT Raw View

Seems it was r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge) who said:
>Thomas M. Breuel <tmb@idiap.ch> wrote:
>>-- STL doesn't seem to provide hash tables at all; instead, it
>>provides red-black trees.  Now, first of all, I think hash tables are
>>crucially important (awk and perl get by essentially with just hash
>>tables), much more so than ordered collections.
>
>AWK and Perl(*) get by with just hash tables because they only hash one
>type, strings.  Considering the fact that anything you can hash you can
>order, I don't think STL is missing anything crucial.  Put another way,
>ordered collections are more general purpose than hash tables.

But lookups in an efficiently implemented hash table are several times
faster than in an efficiently implemented tree structure.  I doubt that
you will find many compiler writers who rely heavily on tree-based
symbol tables for production compilers.

>I also think hash tables are outside the scope of STL.  They are hard
>to iterate over and don't provide any guarantees about performance.
>
>>Second, even for ordered collections, red-black trees are a poor
>>choice.
>
>I don't think there is anything STL that requires that the
>implementation use red-black trees.  My own prototype implementation
>uses splay trees.

I've come to use skip lists a lot myself.  Another handy all-purpose
data structure with at least decent performance in most situations.
With node weights it comes pretty close to being the all-talking,
all-singing, all-dancing universal abstract type implementation
structure.

--
Joseph Nathan Hall | Signal to Inform, Signal to Warn, but never Signal
Software Architect | to Ask. -- When in Philly, Drive Like Someone from Jersey
Gorca Systems Inc. |                 joseph@joebloe.maple-shade.nj.us (home)
(on assignment)    | (602) 732-2549 (work)  Joseph_Hall-SC052C@email.mot.com

Author: r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge)
Date: Wed, 21 Sep 1994 04:23:48 GMT Raw View

>>dag@control.lth.se (Dag Bruck) writes:
> [Wrong.  I (Nagle) wrote the paragraph below.]

Sorry.

John Nagle <nagle@netcom.com> wrote:
>      One of the basic problems with pointers in C is that they are used
>both to refer to arrays and to elements of arrays.  STL iterators
>generalize this ambiguity to collections other than arrays.  This
>creates some serious problems.  Does an iterator point to an object in
>a container or to a slot in which an object goes?  If an iterator points
>to an object in a container, and you replace that object with another,
>does the iterator now refer to the old object or the new one?

r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:
>It depends on how it's done.  If you do "*iter = x" and iter isn't a
>output iter, then iter still points to the same place after the
>assignment.

John Nagle <nagle@netcom.com> wrote:
>     Shouldn't storing into the object of a non-output iterator be an
>error?

Non-output iterator in the sense of being a forward, bidirectional
or random accessor iterator.  You can't store into an input iterator,
and "*iter = x" has peculiar semantics with output iterators.

>In some cases, this will be a compile-time error, because "*iter" may
be const.

Sure, just like pointers.

>The STL spec is totally silent on error checking, an omission which
>will probably cause many thousands of program crashes over the next few
>years.

Again, just like pointers.

>>>Is that behavior the same for all containers?
>>Different containers have different rules about when erase and insert
>>invalidate interators.
>Yes.  And those rules aren't documented.

They are documented.  Where do you think I found out about them?

>>>This is where the generalization of the pointer metaphor breaks down.
>>I don't see how.
>No, you don't.

Please don't try to aggravate me by stating the obvious.  Care to
substaintiate your statement instead?

>      When the same operation behaves differently for different types of
>containers, the illusion of uniform behavior has broken down.

Assume the worse case then.  Still doesn't break the pointer metaphor,
if that's your point here.

Where the pointer metaphor does in fact break down is when you try do
things subtracting non-random access interators.  Iterators in fact
support a subset of pointers operations.

      Ross Ridge

--
 l/    Ross Ridge      //
[oo]   The Great HTMU, CSC President   [oo]
-()-        http://csclub.uwaterloo.ca/u/r-ridge/  /()/
 db           +1 519 883 4329                             //

Author: kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425)
Date: 22 Sep 1994 13:44:11 GMT Raw View

In article <CwHuyr.94v@borland.com> pete@genghis.interbase.borland.com
(Pete Becker) writes:

|>  STL does not prevent you from using hash tables to implement
|> associative containers. It prevents you from claiming that such an
|> implementation conforms to STL's specification.

    [...]

|>  STL "decrees" nothing. It SPECIFIES how its containers operate. You
|> are free to use containers that operate differently if you want to.

But in article <35po1l$bda@seattle.polstra.com> jdp@polstra.com (John
Polstra) wrote:

|> It goes beyond that.  It places requirements on associative containers
|> which force them to be implemented as ordered containers.  Permit me to
|> quote from the most recent (August 18, 1994) version of the STL
|> specification, Section 8.2 "Associative Containers":

|>     The fundamental property of iterators of associative containers is
|>     that they iterate through the containers in the non-descending order
|>     of keys where non-descending is defined by the comparison that was
|>     used to construct them. [page 32]

|>     All of them [associative containers] are parameterized on Key and an
|>     ordering relation Compare that induces a total ordering on elements
|>     of Key. [page 30]

|> Also, look at the "Associative container requirements", Table 12, page 32.
|> The following operations are _required_ for all associative containers:

|>     a.lower_bound(k)
|>  Returns an iterator pointing to the first element with key not
|>  less than k.  Complexity: logarithmic.

|>     a.upper_bound(k)
|>  Returns an iterator pointing to the first element with key
|>  greater than k.  Complexity: logarithmic.

I think that John's point is valid.  STL doesn't just provide
containers.  As I understand it, it provides a number of template
functions to manipulate containers (and their elements).  I am a bit
disappointed that it doesn't implement hash coding (which I generally
prefer over tree structures), but I can accept it.  I find it much
more serious that it cannot manipulate such containers with its
template functions.  This isn't just saying, I don't offer hash
tables; this is saying, I don't want you to use hash tables.

Wouldn't it be preferable to define two types of associative
containers, ordered and unordered.  Although the names sound
exclusive, I think that we can agree that the ordered associative
containers subsumes the unordered ones.  (Every ordered container is
also an unordered one.)  The STL is then free to offer only the
ordered ones, but for the functions where an unordered one makes
sense, I can use one of my own (or Booch's, for example).
--
James Kanze      Tel.: (+33) 88 14 49 00     email: kanze@lts.sel.alcatel.de
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung

Author: pete@genghis.interbase.borland.com (Pete Becker)
Date: Thu, 22 Sep 1994 15:22:17 GMT Raw View

In article <nagleCwHnpC.76B@netcom.com>, John Nagle <nagle@netcom.com> wrote:
>pete@genghis.interbase.borland.com (Pete Becker) writes:
>>In article <nagleCwFtFK.G3r@netcom.com>, John Nagle <nagle@netcom.com> wrote:
>>>     Shouldn't storing into the object of a non-output iterator be an
>>>error?  In some cases, this will be a compile-time error, because
>>>"*iter" may be const.  The STL spec is totally silent on error checking,
>>>an omission which will probably cause many thousands of program crashes
>>>over the next few years.
>>>
>
>> No. Attempting to store into a const iterator should be an error.
>>An output iterator is clearly defined in the STL documentation. It would
>>make this discussion much clearer if people took the time to read the
>>documentation, so that they knew what these terms mean rather than guessing.
>
>>>>If you replace it by erasing it and inserting something in
>>>>it's place then any iterators pointing to the container may not be
>>>>valid anymore.
>>>
>>>>>Is that behavior the same for all containers?
>>>>Different containers have different rules about when erase and insert
>>>>invalidate interators.
>>>      Yes.  And those rules aren't documented.
>
>> On the contrary: they are clearly documented. See the discussion of
>>insert iterators.
>      Where? It's not in section 11.2.2, "Insert Iterators".  The pre- and
>post- condition table for sequences (table 10) does not document any
>invalidation conditions for iterators for "insert" operations.
>

 In my copy of the STL specification, section 11.2.2 begins with the
following paragraph:

 To make it possible to deal with insertion in the same way as
 writing into an array, a special kind of iterator adaptors,
 called insert iterators, are provided in the library. With regular
 iterator classes,

  while( first != last ) *result++ = *first++;

 causes a range [first, last) to be copied into a range starting
 with result. The same code with result being an insert iterator
 will insert corresponding elements into the container. This
 device allows all of the copying algorithms in the library to
 work in the insert mode instead of the regular overwrite mode.

I don't see anything at all vague in the last sentence: all of the copying
algorithms work in overwrite mode unless you give them an insert iterator.
 That text could probably be put in a different place, which might
make it easier to find. But that does not make it vague.

>      That's one of the problems with this spec; when the going gets tough,
>the spec gets vague.

 The cited example clearly does not support this claim.
 -- Pete

Author: pete@genghis.interbase.borland.com (Pete Becker)
Date: Thu, 22 Sep 1994 15:28:59 GMT Raw View

In article <nagleCwHnpC.76B@netcom.com>, John Nagle <nagle@netcom.com> wrote:
>
>      But this is a lesser issue.  The real problems revolve around
>"off-the-end" values.  There's no way to test an iterator for being
>"off-the-end" or "singular" once one is in one of those states.
>
>      "Singular" iterators seem unnecessary.  Do we really need to
>deliberately replicate the semantics of uninitialized pointers?
>That's what default constructors exist to prevent.  See the discussion
>at the bottom of page 6.
>
>      There should be at least a predicate
>
> a.dereferenceable(p)
>
>where a is a container and p is an iterator pointing to that container.
>"a.dereferenceable(p)" should return "true" when "*p" is meaningful.
>(This might be called "valid" instead of "dereferenceable", but the STL
>spec uses the term "dereferencable".) This is easy to implement (although
>the implementation is different for different container types) and
>doesn't imply any ovehead when not used.

 And in general it's O(n) to compute. Seems like a pretty high price
to pay to determine whether you're lost. But, once again, if you think this
is a useful thing (I don't), you are free to add it to your containers. It
is simple to implement, and doesn't imply any overhead when not used.

>
>       Overall, I think the big mistake with STL was trying to replicate
>C pointer semantics, instead of providing wrappers for the built-in
>types with better semantics.
>

 This is really the crux of the issue, isn't it? If you disagree with
the fundamental design assumption of STL, you'll never like it.
 -- Pete

Author: doug@monet.ads.com (Doug Morgan)
Date: 22 Sep 94 08:23:15 Raw View

In article <KANZE.94Sep22154411@slsvhdt.us-es.sel.de> kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425) writes:
|...
|I think that John's point is valid.  STL doesn't just provide
|containers.  As I understand it, it provides a number of template
|functions to manipulate containers (and their elements).  I am a bit
|disappointed that it doesn't implement hash coding (which I generally
|prefer over tree structures), but I can accept it.  I find it much
|more serious that it cannot manipulate such containers with its
|template functions.  This isn't just saying, I don't offer hash
|tables; this is saying, I don't want you to use hash tables.

STL doesn't say quite that.  You would implement hash tables as
containers (they certainly can't be associative containers as defined
by STL) with forward iterators.  The iterators could actually be
bidirectional, but I'm not sure the extra generality would be as
useful as it would be confusing.  The many algorithm template
functions that work with forward iterators would work just fine with
your hash table iterators.  Unusual semantics would naturally occur
for algorithms with an unsatisfied sequence-ordering-is-significant
assumption (such as lexicographical_compare).

Other than that, I share both your disappointment and grudging
acceptance.

|Wouldn't it be preferable to define two types of associative
|containers, ordered and unordered.  Although the names sound
|exclusive, I think that we can agree that the ordered associative
|containers subsumes the unordered ones.  (Every ordered container is
|also an unordered one.)  The STL is then free to offer only the
|ordered ones, but for the functions where an unordered one makes
|sense, I can use one of my own (or Booch's, for example).

Sounds good.  (Note, I think your usage of "subsumes" may be inverted,
but your intentions are right since the sentence "... is also an ..."
is right.)  I would also go a step farther and say that STL's
associative containers are spec'ed out as tree-based ordered
containers, so the "tree-based" part should also show up in the name.

Doug
--------------------------------------------------------------------
Doug Morgan, doug@ads.com, (415) 960-7444
Advanced Decision Systems (a division of Booz-Allen & Hamilton Inc.)
1500 Plymouth St., Mountain View, CA 94043-1230
--------------------------------------------------------------------

Author: rjl@f111.iassf.easams.com.au (Rohan LENARD)
Date: 23 Sep 1994 06:54:50 +1000 Raw View

Hi there,

In article <MATT.94Sep21161631@physics16.berkeley.edu>,
Matt Austern <matt@physics.berkeley.edu> wrote:
 >In article <35qd4q$2mv@network.ucsd.edu> mbk@inls1.ucsd.edu (Matt Kennel) writes:
 >
 >> : Then require the user to provide a global function `hash( Key const& )'.
 >>
 >> And why then a *global* function?  All you want to to make sure is that class
 >> Key supports a 'hash' method returning int.
 >
 >Actually, that's one thing that's beginning to bother me about the
 >STL: it defines an awful lot of global functions and global class
 >names.  Some of them are harmless, because their signatures are ones
 >that users are unlikely to want for their own functions, but others of
 >them are very likely to clash with user-defined functions.  I always
 >thought that pollution of the global namespace was supposed to be
 >avoided; the STL is a pretty bad offender in that regard.

This is a non issue.  The committee has made all the other standard libraries
appear in the 'std' namespace.  I'm sure they're going to make stl appear in
a namespace, so that you don't get name clashes (remember that names included
by a using declaration are overridden by user defined names).

Regards,
 Rohan
--
----------------------------------------------------------------------------
rjl@iassf.easams.com.au | All quotes can be attributed to my automated quote
Rohan Lenard            | writing tool.  Yours for just $19.95; and if you
+61-2-367-4555          | call now you'll get a free set of steak knives ...

Author: kohtala@hardy.trs.ntc.nokia.com (Kohtala Marko)
Date: 22 Sep 1994 22:16:30 GMT Raw View

In article <35pljc$b7a@seattle.polstra.com> jdp@polstra.com (John Polstra) writes:

> I agree that the nodes should be threaded, and I think the current
> implementations, with their extra doubly-linked list pointers and parent
> pointers, are, well, hobbyist-like.  But note that you don't have to
> "ignore the constraints on iterator access" in any case.

But it should be good to remember that this is a matter of
implementation, not standard. For different purposes and resons I
might want to use different implementation at different times. Even
same compilation unit can use several different implementations, I
believe, with the use of namespaces.

Doubly linked lists are very good at times.

I think STL is correct in not forcing any certain implementation, but
only give certain assertions/conditions that are enough for the use of
any implementation that is given.

A powerfull thing to remember when designing something is that any
type satisfying the requirements are as good as the STL template
instantiated. For example in situations like making a subset A of some
set B by actually returning an instance of a class that only finds the
proper elements (read: provides a class meeting iterator requirements
to do that) in the B when necessary.

I must admit that I can not be sure if excessive use of this might
create situations where the complexity constraints can not be met. The
constraints are not very severe though if you remember that algorithm
taking c*n (constant c > 0) time instead of n is still O(n)
complexity.

Most objections to STL seem to originate from lack of familiarity with
the way of programming it brings. It is an old way, just not very
common outside academic world.
--
---
Marko Kohtala - Marko.Kohtala@ntc.nokia.com, Marko.Kohtala@hut.fi

Author: dovich@sequoia.com (Steven J. Dovich)
Date: Thu, 22 Sep 1994 23:26:58 GMT Raw View

Matt Austern <matt@physics16.berkeley.edu> writes:

   In article <35qd4q$2mv@network.ucsd.edu> mbk@inls1.ucsd.edu (Matt Kennel) writes:

   > : Then require the user to provide a global function `hash( Key const& )'.
   >
   > And why then a *global* function?  All you want to to make sure is that class
   > Key supports a 'hash' method returning int.

   Actually, that's one thing that's beginning to bother me about the
   STL: it defines an awful lot of global functions and global class
   names.  Some of them are harmless, because their signatures are ones
   that users are unlikely to want for their own functions, but others of
   them are very likely to clash with user-defined functions.  I always
   thought that pollution of the global namespace was supposed to be
   avoided; the STL is a pretty bad offender in that regard.

This is where namespaces are intended to be used. Imagine a line something
like this in your source code:

 using namespace STD::STL;

/sjd

--

--
Steven J. Dovich <dovich@sequoia.com>
                                       Sequoia Systems, Inc.
                                       400 Nickerson Rd
Phone: +1 508 480-0800 x1417           Marlborough, MA  01752

Author: kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425)
Date: 22 Sep 1994 18:30:36 GMT Raw View

In article <35qd4q$2mv@network.ucsd.edu> mbk@inls1.ucsd.edu (Matt
Kennel) writes:

|> James Kanze US/ESC 60/3/164 #71425 (kanze@us-es.sel.de) wrote:
|> : In article <35is1d$j3j@crl4.crl.com> es@crl.com (Eric Smith) writes:

|> : |> One problem with providing associative arrays in standard container
|> : |> class libraries is the need to standardize the format of the keys.  For
|> : |> example, one container class library might reqire a key to be an array
|> : |> of char, terminated by '\0', and another might require a key to be of a
|> : |> string type provided by that library.  The best way to get around that
|> : |> problem is to overload it such that any of several common
|> : |> representations of keys can be used at will.

|> : What's wrong with simply:

|> :  template< class Key , class Value >
|> :  class AssocArray ...

|> : Then require the user to provide a global function `hash( Key const& )'.

|> And why then a *global* function?  All you want to to make sure is that class
|> Key supports a 'hash' method returning int.

And how do you do this for `int' or `double' (or `char*')?

I didn't like the global function when I designed it, but it was the
only way I could figure out to be able to define it in the same way
for the built-in types.

Today, I would probably simply instantiate the template with one more
argument, the address of a function.  This would at least leave the
possibility of using a static member function.
--
James Kanze      Tel.: (+33) 88 14 49 00     email: kanze@lts.sel.alcatel.de
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung

Author: jaf3@ritz.cec.wustl.edu (John Andrew Fingerhut)
Date: 22 Sep 1994 14:20:52 -0500 Raw View

In article <MATT.94Sep21161631@physics16.berkeley.edu>,
Matt Austern <matt@physics.berkeley.edu> wrote:
:In article <35qd4q$2mv@network.ucsd.edu> mbk@inls1.ucsd.edu (Matt Kennel) writes:
:
:> : Then require the user to provide a global function `hash( Key const& )'.
:>
:> And why then a *global* function?  All you want to to make sure is that class
:> Key supports a 'hash' method returning int.
:
:Actually, that's one thing that's beginning to bother me about the
:STL: it defines an awful lot of global functions and global class
:names.  Some of them are harmless, because their signatures are ones
:that users are unlikely to want for their own functions, but others of
:them are very likely to clash with user-defined functions.  I always
:thought that pollution of the global namespace was supposed to be
:avoided; the STL is a pretty bad offender in that regard.
:--
:
:                               --matt

Isn't that the problem that namespaces solves?

Stephen Gevers
sg3235@shelob.sbc.com

Author: kohtala@hardy.trs.ntc.nokia.com (Kohtala Marko)
Date: 22 Sep 1994 22:56:05 GMT Raw View

In article <CwHuyr.94v@borland.com> pete@genghis.interbase.borland.com (Pete Becker) writes:

>  STL does not prevent you from using hash tables to implement
> associative containers. It prevents you from claiming that such an
> implementation conforms to STL's specification.

I seem to have missed anything that supports this last claim.

Correct me if I am wrong, but aren't hash tables just fine as an
implementation for an associative container if the container is just
ordered in some way according to the hashing method, whatever that is?
Reading the requirements it seems that with some hashing methods only
problem is providing a type X::key_compare which properly tells the
order in which two items are found in the container when it is
iterated through. If that is the problem, do not use that hashing
method!

(Here I use STL library for the templates provided, and STL standard
for the requirements they conform to.)

Those templates in STL library might not do it by hashing, but the
standard does not say that you have to use them to make the
container. You just use something that satisfies the requirements and
after that it is the same which class you use as a parameter to a
function template working on that kind of container.

It is then another matter if you can implement an STL library (meaning
the one with template class set which takes the Compare parameter)
which uses hashing. I reckon it is difficult to implement hash table
that actually uses the Compare parameter.

Anyhow, the STL library will be usefull when you are building the
prototype. If you later find out that you need better algorithms and
data structures, STL standard ensures that replacing the STL library
is painless (provided you have not relied on some features of the STL
library beyond STL standard).

Also, seems it is encouraging that you write all your functions taking
these STL conforming types as parameters to be template
functions. Which is not bad at all in my opinion provided you have a
compiler which efficiently instantiates them.
--
---
Marko Kohtala - Marko.Kohtala@ntc.nokia.com, Marko.Kohtala@hut.fi

Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Tue, 20 Sep 1994 06:17:27 GMT Raw View

kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425) writes:

>tmb@arolla.idiap.ch (Thomas M. Breuel) writes:
>
>|> Looking at other languages, and from my experience, I think this
>|> is what a standard template library should support:
>
>|>  -- stacks with exponential doubling and integer subscripting
>
>Just curious.  I would have thought that the "exponential doubling"
>bit would be an implementation question, not part of the standard.

Yes and no.  The standard should just specify that appending new elements
is amortized constant time (and that accessing elements via integer
subscripting is constant time, etc.).  However, I think these constraints
probable mandate an implementation which allocates memory in
exponentially increasing chunks.  (Not necessarily doubling - a
smaller factor that 2 might well be better.)

>Is there some fundamental reason (other than performance) for
>specifying "exponential doubling?"

No, but performance *is* a fundamental reason.  Of course the performance
constraints should be more abstract and should not specify the implementation,
but in this case I think they effectively do anyway.

--
Fergus Henderson - fjh@munta.cs.mu.oz.au

Author: rfg@netcom.com (Ronald F. Guilmette)
Date: Tue, 20 Sep 1994 07:46:39 GMT Raw View

In article <nagleCwE0zx.H4u@netcom.com> nagle@netcom.com (John Nagle) writes:
>>>This is what John Nagle found scary...
>
>>Well, scary or not, my point was that it's essentially a done deal.
>
>     Not necessarily.  The standard hasn't been adopted yet.
>
>     John Nagle

Over time, the committee has shown a strong propensity for `adopting' things,
and a rather marked aversion to un-adopting things.

If you want to try to get them to un-adopt STL, be my guest.  I'd like to
watch.  It ought to be rather entertaining.

P.S.  For whatever it's worth, let me just mention that STL represents one
of those truly rare aspects of the language (and the standardization process)
about which I personally have no firm opinion either way.  (Regular readers
here will perhaps be shocked and amazed to learn that there is _anything_
about which I have no strong opinion. :-)

--

-- Ron Guilmette, Sunnyvale, CA ---------- RG Consulting -------------------
---- domain addr: rfg@netcom.com ----------- Purveyors of Compiler Test ----
---- uucp addr: ...!uunet!netcom!rfg ------- Suites and Bullet-Proof Shoes -

Author: kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425)
Date: 20 Sep 1994 14:02:08 GMT Raw View

In article <35is1d$j3j@crl4.crl.com> es@crl.com (Eric Smith) writes:

|> One problem with providing associative arrays in standard container
|> class libraries is the need to standardize the format of the keys.  For
|> example, one container class library might reqire a key to be an array
|> of char, terminated by '\0', and another might require a key to be of a
|> string type provided by that library.  The best way to get around that
|> problem is to overload it such that any of several common
|> representations of keys can be used at will.

What's wrong with simply:

 template< class Key , class Value >
 class AssocArray ...

Then require the user to provide a global function `hash( Key const& )'.
Alternatively, there is the solution used in the Booch components (and
probably by many others): use the address of a hash function (and a
comparison function?) as an additional template parameter.

In either case, I would expect to find good general hash functions
provided for the frequent cases (String and const char*).  And maybe
even something along the lines of the following:

 template< class T >
 inline unsigned
 hashElem( unsigned h , T const& obj )
 {
     return 2047 * h + (unsigned)obj ;
 }

This makes it easy for the user to provide a good hash function; all
he has to do is be able to cut his object up into pieces, with each
piece convertable to an unsigned.
--
James Kanze      Tel.: (+33) 88 14 49 00     email: kanze@lts.sel.alcatel.de
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung

Author: hmiller@osiris.cso.uiuc.edu (Harry Miller)
Date: 20 Sep 1994 16:06:18 GMT Raw View

wstewart@sunterm10.nmo.gtegsc.com (Wayne Stewart) writes:

>Dag Bruck (dag@control.lth.se) wrote:

>: STL standardizes a set of operations that iterators must support, for
>: example unary * for dereferencing, ++ and -- for stepping forward and
>: backward.

>I'm curious about how someone would use this concept to traverse or iterate
>thru 'non-linear' data structures such as trees or graphs, where there may
>be several options (depending on context) for 'next' or 'previous' element.
>For example, how could I offer pre-order, in-order, post-order 'iterators'
>of a general tree container?  Do STL pointer-style iterators even make
>sense in this case?  How would one know by looking at the code which
>traversal order '++' was using?

>Thanks,
>    Wayne

The STL defines Sequences as linear data structures and their iterators as
constant time access. The set/map structures are not defined as linear and
are usually implemented as a red-black tree (from the two implementations
I have seen). Container adapters are based on the linear Sequences.

This shows up a major short comming of the STL. Iterators are defined such
that only operations that can be done in constant time are allowed. I
believe this was done so that the time order of the algorithms could be
calculated easily. Unfortunately non-linear structures such as trees have
iterators that are not accessed in constant time.

One implementation of the set/map structures I looked at used a red-black
tree for its implementation. To conform to the constant time constraint
the node contained pointer to the left and right children, the parent node,
and also next and prev pointers. That is five pointers per node (very
inefficient in terms of memory useage), I know that I would be shot for
writting such a tree structure.

My suggestion (for binary trees) would be to use threaded nodes, and ignore
the constraints on iterator access. Often algoritms that are efficient for
linear structures with constant access iterators will be less efficient for
non-linear structures. These algorithms should be overloaded for tree
structures. The use of threaded trees allows for fairly fast iteration
with the best case being constant time access for leaf nodes, and the worst
case being O(h) with h as the maximum tree height. What is nice here is
that one half of the time the best case iteration is performed, one
quarter of the time the iteration takes 2O(h), one eight of the time it takes
3 O(h) time, etc.. This leads to access times that are comparable to linear
structures while allowing overloaded algorithms to perform more efficiently.

Well just a few thought about a problem I percieve with the STL. In
general though I'm fairly impressed with it.

Harry E. Miller ----------------- just a jerk lurking on the net
hmiller@osiris.cso.uiuc.edu ----- where to send the hate mail

Author: nagle@netcom.com (John Nagle)
Date: Tue, 20 Sep 1994 16:50:08 GMT Raw View

r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:
>dag@control.lth.se (Dag Bruck) writes:
 [Wrong.  I (Nagle) wrote the paragraph below.]
>>      One of the basic problems with pointers in C is that they are used
>>both to refer to arrays and to elements of arrays.  STL iterators
>>generalize this ambiguity to collections other than arrays.  This
>>creates some serious problems.  Does an iterator point to an object in
>>a container or to a slot in which an object goes?  If an iterator points
>>to an object in a container, and you replace that object with another,
>>does the iterator now refer to the old object or the new one?

>It depends on how it's done.  If you do "*iter = x" and iter isn't a
>output iter, then iter still points to the same place after the
>assignment.
     Shouldn't storing into the object of a non-output iterator be an
error?  In some cases, this will be a compile-time error, because
"*iter" may be const.  The STL spec is totally silent on error checking,
an omission which will probably cause many thousands of program crashes
over the next few years.

>If you replace it by erasing it and inserting something in
>it's place then any iterators pointing to the container may not be
>valid anymore.

>>Is that behavior the same for all containers?
>Different containers have different rules about when erase and insert
>invalidate interators.
      Yes.  And those rules aren't documented.
>>This is where the generalization of the pointer metaphor breaks down.
>I don't see how.
      No, you don't.
      When the same operation behaves differently for different types of
containers, the illusion of uniform behavior has broken down.

      Compare Smalltalk collections, which have more uniform behavior.

     John Nagle

Author: jdp@polstra.com (John Polstra)
Date: 20 Sep 1994 10:28:03 -0700 Raw View

In article <CwB1D7.70F@undergrad.math.uwaterloo.ca>,
Ross Ridge <r-ridge@calum.csclub.uwaterloo.ca> wrote:
> Considering the fact that anything you can hash you can
> order, I don't think STL is missing anything crucial.  Put another way,
> ordered collections are more general purpose than hash tables.

So what?  Doubly-linked lists are more general than singly-linked lists,
but that doesn't mean I never want to use singly-linked lists.  There
are genuine and important performance considerations that you are
ignoring entirely:

    * Searching a properly-sized hash table is O(1).  Searching a tree
      is O(log n).

    * A tree has, minimum, 2 pointers worth of overhead for every node.
      In practice, it usually has more.  (For example, a Red-black tree
      needs a flag bit to distinguish the red nodes from the black ones.)
      In many applications, a hash table will have lower storage overhead
      than a tree.

There are valid reasons why a programmer might choose to use a hash table
rather than a tree-based container.  The language *standard* should not
preclude making this kind of trade-off.

> I also think hash tables are outside the scope of STL.  They are hard
> to iterate over and don't provide any guarantees about performance.

They are trivial to iterate over.  The only reason they would be hard
to iterate over under STL is because STL decrees that all containers are
ordered containers.
--
   John Polstra                                       jdp@polstra.com
   John D. Polstra & Co., Inc.                   Phone (206) 932-6482
   Seattle, Washington USA                         Fax (206) 935-1262
   "Self-knowledge is always bad news."                 -- John Barth

Author: nagle@netcom.com (John Nagle)
Date: Tue, 20 Sep 1994 16:53:05 GMT Raw View

rfg@netcom.com (Ronald F. Guilmette) writes:
>In article <nagleCwE0zx.H4u@netcom.com> nagle@netcom.com (John Nagle) writes:
>>>>This is what John Nagle found scary...
>>>Well, scary or not, my point was that it's essentially a done deal.
>>     Not necessarily.  The standard hasn't been adopted yet.
>>     John Nagle
>Over time, the committee has shown a strong propensity for `adopting' things,
>and a rather marked aversion to un-adopting things.
>If you want to try to get them to un-adopt STL, be my guest.  I'd like to
>watch.  It ought to be rather entertaining.

     An interesting question is whether knowingly, willfully, and
perhaps negligently adopting a standard which is unsafe can lead to
civil or criminal liability if (when?) systems based on the standard
fail in operation.

     John Nagle

Author: branzrpc@branz.org.nz (Rey Crisostomo)
Date: 18 Sep 1994 20:59:38 GMT Raw View

In article <KOHTALA.94Sep14220121@hardy.trs.ntc.nokia.com> Marko.Kohtala@ntc.nokia.com writes:
>...
>HP seems to be giving their implementation for free. They just want to
>see their copyright notice stamped on the documentation. The README
>says it compiles with Borland 4.0, Lucid 3.2, EDG and couple of IBM
>compiler. In time, I bet there will be more compilers supported.
>
>Also there is a GNU version in development.
>
>I think we will have a choise of good implementations.

Its not a matter of STL supporting more compilers in the future.
Its a matter of more compilers supporting the C++ features that
the STL uses.

Rey Crisostomo
branzrpc@branz.org.nz
CompuServe:100237,2164
Wellington, New Zealand

Author: r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge)
Date: Mon, 19 Sep 1994 00:46:56 GMT Raw View

r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:
>I also think hash tables are outside the scope of STL.

Thomas M. Breuel <tmb@idiap.ch> wrote:
>So, what _is_ the scope of STL?

General purpose iteratable containers and algorithms to use on them.

>Why does one of the key collection types, the hashtable, not fall
>within the scope of the STL?

Why did you edit out the reasons I gave?


--
 l/    Ross Ridge      //
[oo]   The Great HTMU, CSC President   [oo]
-()-        http://csclub.uwaterloo.ca/u/r-ridge/  /()/
 db           +1 519 883 4329                             //

Author: r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge)
Date: Mon, 19 Sep 1994 01:01:27 GMT Raw View

dag@control.lth.se (Dag Bruck) writes:
>      One of the basic problems with pointers in C is that they are used
>both to refer to arrays and to elements of arrays.  STL iterators
>generalize this ambiguity to collections other than arrays.  This
>creates some serious problems.  Does an iterator point to an object in
>a container or to a slot in which an object goes?  If an iterator points
>to an object in a container, and you replace that object with another,
>does the iterator now refer to the old object or the new one?

It depends on how it's done.  If you do "*iter = x" and iter isn't a
output iter, then iter still points to the same place after the
assignment.  If you replace it by erasing it and inserting something in
it's place then any iterators pointing to the container may not be
valid anymore.

>Is that behavior the same for all containers?

Different containers have different rules about when erase and insert
invalidate interators.

>This is where the generalization of the pointer metaphor breaks down.

I don't see how.

       Ross Ridge

--
 l/    Ross Ridge      //
[oo]   The Great HTMU, CSC President   [oo]
-()-        http://csclub.uwaterloo.ca/u/r-ridge/  /()/
 db           +1 519 883 4329                             //

Author: es@crl.com (Eric Smith)
Date: 18 Sep 1994 19:11:57 -0700 Raw View

In article <TMB.94Sep18201807@arolla.idiap.ch>,
Thomas M. Breuel <tmb@idiap.ch> wrote:
>In article <CwB1D7.70F@undergrad.math.uwaterloo.ca> r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:
>|I also think hash tables are outside the scope of STL.
>
>So, what _is_ the scope of STL?  Why does one of the key collection
>types, the hashtable, not fall within the scope of the STL?
>
>    Thomas.

A hash table is an implementation of an associative array.  A container
class library should have associative arrays rather than hash tables.
Iteration in an associative array can be at "random" or in key
sequence.  At "random" means the iteration will get all elements but
the order is meaningless.  All associative arrays should provide at
least some kind of iteration, if feasible.

One problem with providing associative arrays in standard container
class libraries is the need to standardize the format of the keys.  For
example, one container class library might reqire a key to be an array
of char, terminated by '\0', and another might require a key to be of a
string type provided by that library.  The best way to get around that
problem is to overload it such that any of several common
representations of keys can be used at will.

Associative array classes should be very flexible.  The implementation
can be whatever works, but the interface should be what fits in with
the overall container scheme best.  You should be able to use arrays of
associative arrays, without worrying about using excessive memory for
the empty associative arrays in the array.  One way to implement that
might be to implement all the associative arrays as one big hash table,
with part of the hash key determined by which associative array is
being accessed.  But that might have other problems, such as making
iteration more difficult or whatever.  That's why it takes work, and
why container class libraries are so useful, doing the work once and
making it available for reuse by everyone.

In addition to associative arrays keyed by character strings, other
types of keys should be supported too.  At the very least, you should
be able to use integers and/or pointers as keys.  By using a pointer as
a key, you can effectively use any object as a key, provided it remains
at a constant address.  Using an integer as a key gives you the
functionality of sparse arrays of integers.  With a 32-bit integer key,
you can implement an array of 1000000000 32-bit integers of which 100
are nonzero and the rest are zero, using an extremely tiny fraction of
the 4000000000 bytes implied.  That's good for certain kinds of
histograms, etc.

It's hard to imagine a good general purpose container class library
without associative arrays.  I don't know anything about STL, but would
be interested to know what is intended there.

Author: nagle@netcom.com (John Nagle)
Date: Mon, 19 Sep 1994 03:26:49 GMT Raw View

ark@alice.att.com (Andrew Koenig) writes:
>What finally swung the issue was that several heads of national
>delegations said that they would not approve a standard without
>at least some kind of data structure library.  Since there was no
>alternative to STL on the table, that essentially made STL mandatory.

     Oh.  I begin to see how this happened.

     John Nagle

Author: rfg@netcom.com (Ronald F. Guilmette)
Date: Mon, 19 Sep 1994 08:43:25 GMT Raw View

In article <CwAzzz.5J8@undergrad.math.uwaterloo.ca> r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:
>
>Ronald F. Guilmette <rfg@netcom.com> wrote:
>>For the benefit of those of you who are not aware of it, these arguments
>>about whether or not STL should be in or out were all essentially rendered
>>moot by the committee's decision back in July to incorporate STL into the
>>draft standard.
>
>This is what John Nagle found scary...

Well, scary or not, my point was that it's essentially a done deal.

--

-- Ron Guilmette, Sunnyvale, CA ---------- RG Consulting -------------------
---- domain addr: rfg@netcom.com ----------- Purveyors of Compiler Test ----
---- uucp addr: ...!uunet!netcom!rfg ------- Suites and Bullet-Proof Shoes -

Author: kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425)
Date: 19 Sep 1994 16:55:24 GMT Raw View

In article <MATT.94Sep16102349@physics16.berkeley.edu>
matt@physics16.berkeley.edu (Matt Austern) writes:

|> In article <DAG.94Sep16081742@bellman.control.lth.se> dag@control.lth.se (Dag Bruck) writes:

|> > STL standardizes a set of operations that iterators must support, for
|> > example unary * for dereferencing, ++ and -- for stepping forward and
|> > backward.
|> >
|> > I agree that the syntax has been borrowed from pointer arithmetic, so
|> > raw pointers can be used when needed.

|> Well, it's not just a matter of syntax; there's also an important
|> symantic question here.

|> The question is this: if you have an iterator running through some
|> sort of collection, and you want to access an element that the
|> iterator is referring to, do you do this through an iterator member
|> function, or through a collector member function that takes the
|> iterator as an argument?  That is (using the most common syntactic
|> choice for each mechanism), do you write *p, or A[i]?  It's this
|> distinction that I have in mind when I refer to the distinction
|> between pointer and index symantics.

|> Why does it matter?  Well, the most obvious answer is the distinction
|> between const and non-const collections.  If you're using index
|> symantics then you can overload the access operation on const so that
|> const correctness can be checked at compile time; if you're using
|> pointer symantics, though, then I don't really see any very good way
|> of doing anything equivalent.  That's the reason why, whenever I
|> write a collection class, I always use index symantics for its
|> associated iterator.

|> I haen't yet seen the STL proposal.  Did its authors find some clever
|> way of handling the const issue?

I've only quickly scanned the STL proposal myself.  (In fact, most of
what I know about it is from a quick explination that Fergus Henderson
gave me in a single email.)  Also, all of my iterators use use index
semantics, for the same reasons you present.

Still, I can understand why STL uses pointer semantics.  Calling
functions with iterators using index semantics can be a pain, since
you have to pass both the container *and* the iterator.  (With pointer
semantics, the iterator alone suffices.)  This is particularly a
problem with template functions, since it implies a larger compatible
interface.

However, all is not lost.  I believe it is possible to have the best
of both worlds.  (Please note, however, that the following is only a
vague idea.  I haven't even begun to try it out.)

My iterators are called cursors, and the type of the iterator is
linked to a specific type of container, e.g.: DLLCursor, etc.  Even
before STL, I had a vague sentiment that I also needed an iterator
that, 1) didn't not require the container to work, and 2) was not
bound to a a specific type of container.  My original idea was to use
inheritance to do this; the various iterators are all abstract base
classes, and these abstract base classes are what are passed to the
function.  (Thus, the same function could be called with an iterator
over a DLList or over an AVLTree, for example.)

This idea still intregues me.  The problem (then and now) is that to
furnish all of the different types of iterators (with the correct type
conversions between them) results in a very complex inheritance
lattice, to say the least.  (ForwardIterator, BidirectionalIterator,
and RandomIterator, all in both const and non-const versions.  Then
add the actual implementation classes.)

I still think that something along these lines is possible, but for a
first effort, I think that I will replace inheritance with templates.
I believe that something along the following lines should work:

    template< class Base , class Container , class Cursor >
    class Iterator
    {
    public :
                            Iterator( Container& container ,
                                      Cursor& cursor ) ;
                            operator Base*() ;
        //  Other necessary functions.
    private :
        Container&          theContainer ;
        Cursor&             theCursor ;
    } ;

    template< class Base , class Container , class Cursor >
    inline
    Iterator< Base , Container , Cursor >::Iterator( Container& container ,
                                                     Cursor& cursor )
        :   theContainer( container )
        ,   theCursor( cursor )
    {
    }

    template< class Base , class Container , class Cursor >
    inline
    Iterator< Base , Container , Cursor >::operator Base*()
    {
        return cursor.finished() ? NULL : &container[ cursor ] ;
    }

Note that the above is just to give a general idea of how I think I
would go about it.  It is far from complete, but I think that
something along these lines should work, at least for a start.  (If
the STL requires assignment and copy of interators, the member
variables will have to be pointers.)

Since (at least in my case) the cursor class already contains a
pointer to the container it is associated with, an alternative
solution might be to add a function to the cursor class to make this
pointer accessible.  In this case, no direct references to the
container would be necessary at the iterator interface.
--
James Kanze      Tel.: (+33) 88 14 49 00     email: kanze@lts.sel.alcatel.de
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung

Author: kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425)
Date: 19 Sep 1994 17:11:18 GMT Raw View

In article <CwB1D7.70F@undergrad.math.uwaterloo.ca>
r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:

|> Considering the fact that anything you can hash you can
|> order[...]

I am curious about this statement.  Do you claim that there is
something in hash that implies order?  If so, could you explain what?

Author: kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425)
Date: 19 Sep 1994 17:15:53 GMT Raw View

In article <TMB.94Sep16193709@arolla.idiap.ch> tmb@arolla.idiap.ch
(Thomas M. Breuel) writes:

|> Looking at other languages, and from my experience, I think this
|> is what a standard template library should support:

|>  -- stacks with exponential doubling and integer subscripting

Just curious.  I would have thought that the "exponential doubling"
bit would be an implementation question, not part of the standard.

Is there some fundamental reason (other than performance) for
specifying "exponential doubling?"
--
James Kanze      Tel.: (+33) 88 14 49 00     email: kanze@lts.sel.alcatel.de
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung

Author: nagle@netcom.com (John Nagle)
Date: Mon, 19 Sep 1994 17:38:20 GMT Raw View

rfg@netcom.com (Ronald F. Guilmette) writes:
>In article <CwAzzz.5J8@undergrad.math.uwaterloo.ca> r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:
>>
>>Ronald F. Guilmette <rfg@netcom.com> wrote:
>>>For the benefit of those of you who are not aware of it, these arguments
>>>about whether or not STL should be in or out were all essentially rendered
>>>moot by the committee's decision back in July to incorporate STL into the
>>>draft standard.
>>
>>This is what John Nagle found scary...

>Well, scary or not, my point was that it's essentially a done deal.

     Not necessarily.  The standard hasn't been adopted yet.

     John Nagle

Author: mav@gaia.cc.gatech.edu (Maurizio Vitale)
Date: 19 Sep 1994 18:50:00 GMT Raw View

>>>>> "Matt" == Matt Kennel <mbk@inls1.ucsd.edu> writes:

  Matt> Wayne Stewart (wstewart@sunterm10.nmo.gtegsc.com) wrote:
  Matt> : Dag Bruck (dag@control.lth.se) wrote:

  Matt> : I'm curious about how someone would use this concept to
  Matt> : traverse or iterate thru 'non-linear' data structures such
  Matt> : as trees or graphs, where there may be several options
  Matt> : (depending on context) for 'next' or 'previous' element.
  Matt> : For example, how could I offer pre-order, in-order,
  Matt> : post-order 'iterators' of a general tree container?  Do STL
  Matt> : pointer-style iterators even make sense in this case?  How
  Matt> : would one know by looking at the code which traversal order
  Matt> : '++' was using?

  Matt> My opinion is one would want to have multiple kinds of
  Matt> iterators, presumably with different names, none of which is
  Matt> "++" or "--". E.g. routines like

What I'm using in a graph library I'm building on top of STL is
something similar to IO manipulators. You can say any of the
following:

for (it << bfs << graph; it; it++); // everything in breadth first order
for (it << bfs (it1); it; it++);    // only things reachable from it1
for (it << succ (it1); it; it++);   // successors of a node



--
    Maurizio Vitale
 _______________
|        _      |\   e-mail: mav@cc.gatech.edu     | How many times can
|  /|/| '_) | ) | |  voice:  (404) 881-6083 (home) | a man turn his head,
| | | |_(_|_|/  | |          (404) 853-9382 (work) | and pretend that he
|_______________| |                                | just doesn't see ?
 \_______________\|  fax:    (404) 853-9378        |  - Bob Dylan

Author: jason@cygnus.com (Jason Merrill)
Date: Mon, 19 Sep 1994 18:44:04 GMT Raw View

>>>>> Kohtala Marko <kohtala@hardy.trs.ntc.nokia.com> writes:

> Also there is a GNU version in development.

Actually, development of this version stopped when HP released their
implementation, as it would have taken a lot of needless work to finish it.

Jason

Author: r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge)
Date: Mon, 19 Sep 1994 23:31:12 GMT Raw View

r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:
>Considering the fact that anything you can hash you can order[...]

James Kanze US/ESC 60/3/164 #71425 <kanze@us-es.sel.de> wrote:
>I am curious about this statement.  Do you claim that there is
>something in hash that implies order?

No.

>From a practical point of view, I believe that it is possible to
>define an arbitrary complete ordering for just about anything.

This is what I mean.  Just use the hashing function F, where F(x) = x,
and use the result to order the values.

Essentially my point is the despite the fact hash tables don't require
an ordering of the objects being hashed they aren't more general than a
container class that requires an ordering of the objects.

      Ross ridge

--
 l/    Ross Ridge      //
[oo]   The Great HTMU, CSC President   [oo]
-()-        http://csclub.uwaterloo.ca/u/r-ridge/  /()/
 db           +1 519 883 4329                             //

Author: tmb@arolla.idiap.ch (Thomas M. Breuel)
Date: 16 Sep 1994 21:09:11 GMT Raw View

I wrote:
|I can't compile HP's STL implementation (it only compiles with a few
|compilers), and the documentation that comes with it is a bit sparse.
|However, from an (admittedly) brief glance at the code, I'm also not
|so sure whether STL is a really practical library:
|
| -- Stacks are a "secondary" datatype in STL, implemented via an
|    adaptor; however, stacks with exponential doubling
|    are a very useful datatype for C++, often (though not
|    always) more appropriate than lists.

It seems like the "vector" type actually happens to double as a stack
with exponential doubling.

    Thomas.

Author: tmb@arolla.idiap.ch (Thomas M. Breuel)
Date: 16 Sep 1994 17:37:05 GMT Raw View

In article <DAG.94Sep16081742@bellman.control.lth.se> dag@control.lth.se (Dag Bruck) writes:
|STL standardizes a set of operations that iterators must support, for
|example unary * for dereferencing, ++ and -- for stepping forward and
|backward.
|
|I agree that the syntax has been borrowed from pointer arithmetic, so
|raw pointers can be used when needed.
|
|However, there is nothing in STL that says an iterator must use
|pointer arithmetic, or that any operation must be unchecked. If you
|think safe iterators are appropriate (and I would agree), use them.

Let me make clear again: my gripe with standardizing on STL is not
that it uses notation like pointer arithmetic. I also appreciate the
design of STL and think that it may be a very useful library for some
applications. I think it just doesn't address the most pressing needs
in terms of a set of standard collection types.

What we need is a small, simple set of collection types that is easy
to understand, easy to implement (possibly with compiler support), and
that sets standards for how to pass fixed and variable sized
collections of objects among different libraries. The value of such a
standard lies in its restraint and judicious choice of primitives, not
in trying to provide a wide range of functionality or policies.

Looking at other languages, and from my experience, I think this
is what a standard template library should support:

-- stacks with exponential doubling and integer subscripting
-- 1D through 4D arrays
-- hash tables with a simple forward iterator
-- an option to disable range checking
-- reference counting

(I actually have a library of these datatypes. I'll see whether I can
put it together in some distributable form with documentation.)

Again, the important thing, I believe is to standardize just this so
that everybody will be encouraged to use a common set of collection
types in interfaces. In this case, providing more functionality is

I can't compile HP's STL implementation (it only compiles with a few
compilers), and the documentation that comes with it is a bit sparse.
However, from an (admittedly) brief glance at the code, I'm also not
so sure whether STL is a really practical library:

-- Stacks are a "secondary" datatype in STL, implemented via an
adaptor; however, stacks with exponential doubling
are a very useful datatype for C++, often (though not
always) more appropriate than lists.
-- STL doesn't seem to provide hash tables at all; instead,
it provides red-black trees. Now, first of all,
I think hash tables are crucially important (awk
and perl get by essentially with just hash tables),
much more so than ordered collections. Second,
even for ordered collections, red-black trees are
a poor choice.

It seems like a lot of thought and care has gone into the abstractions
in STL. That's good, and it would be nice to have such a library
available optionally. But, again, I don't think it makes a good
standard for a library of template classes in C++.

Thomas.

Author: rfg@netcom.com (Ronald F. Guilmette)
Date: Fri, 16 Sep 1994 21:59:31 GMT Raw View

In article <nagleCw6M8F.7Kr@netcom.com> nagle@netcom.com (John Nagle) writes:
>tmb@arolla.idiap.ch (Thomas M. Breuel) writes:
>>|In article <nagleCvvFu2.Gw7@netcom.com> nagle@netcom.com (John Nagle) writes:
>>|
>>|>       That's scary.  STL is a clever implementation of a radical concept.
>>|
>>Well, some general observations about STL.  Designing something like
>>STL is great fun, and it looks really neat.  However, I think in
>>practice, it's probably not a good choice to standardize on as part of
>>the ANSI C++ standard.
>
>       Agreed.  Standardizing on something like STL is premature...

For the benefit of those of you who are not aware of it, these arguments
about whether or not STL should be in or out were all essentially rendered
moot by the committee's decision back in July to incorporate STL into the
draft standard.

--

-- Ron Guilmette, Sunnyvale, CA ---------- RG Consulting -------------------
---- domain addr: rfg@netcom.com ----------- Purveyors of Compiler Test ----
---- uucp addr: ...!uunet!netcom!rfg ------- Suites and Bullet-Proof Shoes -

Author: doug@monet.ads.com (Doug Morgan)
Date: 16 Sep 94 15:29:54 Raw View

In article <MATT.94Sep16102349@physics16.berkeley.edu> matt@physics16.berkeley.edu (Matt Austern) writes:
|In article <DAG.94Sep16081742@bellman.control.lth.se> dag@control.lth.se (Dag Bruck) writes:
|
|> STL standardizes a set of operations that iterators must support, for
|> example unary * for dereferencing, ++ and -- for stepping forward and
|> backward.
|>
|> I agree that the syntax has been borrowed from pointer arithmetic, so
|> raw pointers can be used when needed.
|
|Well, it's not just a matter of syntax; there's also an important
|symantic question here.
|
|The question is this: if you have an iterator running through some
|sort of collection, and you want to access an element that the
|iterator is referring to, do you do this through an iterator member
|function, or through a collector member function that takes the
|iterator as an argument?  That is (using the most common syntactic
|choice for each mechanism), do you write *p, or A[i]?  It's this
|distinction that I have in mind when I refer to the distinction
|between pointer and index symantics.

Some thoughts on this issue:

The key semantic point of STL is not the use of *p vs. A[i].  It is
the use of both an iterator object and an end-test object (It just so
happens that in STL, both are full-fledged iterators).  This is the
key that opens the possibility of STL using register allocatable
objects (pointers or simple classes with just a pointer member).

When the end-test is built into a single iterator (or
iterator-container combination), it is difficult-to-impossible for a
compiler to see what could go into a register for the end test.  With
a separate end-test object, the quantity is in plain sight for any
compiler to stuff in a register.

It is possible to achieve the same efficiency as STL (with a little
bit more generality) using A[i].  However, it still demands a separate
end-test object.  Also, if i has member functions in its iterface
(e.g., i.next() instead of next(A,i)), you have to have a compiler
smart enough to put objects of small classes into registers.  Since
then, i must be a class object, but could be just a single pointer
wrapped in a class.  A[i] is clearly a little more general than *p
(since a template operator[] can always expand A[i] to *i, but you can
never get the A out of *p).

Finally, I don't know about your compiler, but g++2.6.0 misses obvious
opportunities to register allocate reference arguments to in-line
functions and, in worst case, a loop involving  an inlined A[i] can
run almost an order of magitude slower than an exactly equivalent loop
involving *p++ (where p is a pointer, the compiler being smart enough
to do the register allocation in this case).

In summary,

 O In principle, A[i] is a bit more general (actions on a true pointer
   object can still get to A if they have to) and can be just as
   efficient.  This gives index syntax an in-principle edge.
 O A separate end-test object is alway a must for the efficiency of
   register allocated pointers.
 O With some of today's compilers (at least one) *p++ is definitely
   faster.

|Why does it matter?  Well, the most obvious answer is the distinction
|between const and non-const collections.  If you're using index
|symantics then you can overload the access operation on const so that
|const correctness can be checked at compile time; if you're using
|pointer symantics, though, then I don't really see any very good way
|of doing anything equivalent.  That's the reason why, whenever I
|write a collection class, I always use index symantics for its
|associated iterator.
|
|I haen't yet seen the STL proposal.  Did its authors find some clever
|way of handling the const issue?

When you get an iterator from a container (e.g. using begin() or
end()), you get a const_iterator for const constainers.  They just
overloaded the functions that hand you iterators.

|      --matt

Doug
--------------------------------------------------------------------
Doug Morgan, doug@ads.com, (415) 960-7444
Advanced Decision Systems (a division of Booz-Allen & Hamilton Inc.)
1500 Plymouth St., Mountain View, CA 94043-1230
--------------------------------------------------------------------

Author: kanze@us-es.sel.de (James Kanze US/ESC 60/3/164 #71425)
Date: 16 Sep 1994 12:24:53 GMT Raw View

In article <nagleCw6M8F.7Kr@netcom.com> nagle@netcom.com (John Nagle)
writes:

|> tmb@arolla.idiap.ch (Thomas M. Breuel) writes:
|> >|In article <nagleCvvFu2.Gw7@netcom.com> nagle@netcom.com (John Nagle) writes:

|> >|>       That's scary.  STL is a clever implementation of a radical concept.

|> >Well, some general observations about STL.  Designing something like
|> >STL is great fun, and it looks really neat.  However, I think in
|> >practice, it's probably not a good choice to standardize on as part of
|> >the ANSI C++ standard.

|>        Agreed.  Standardizing on something like STL is premature.  The
|> committee should make an implementation available and let the community
|> of C++ users have a year or so of experience before nailing it into the
|> standard.

|>        Some of the problem could be alleviated if it weren't called the
|> "Standard Template Library".  Calling it, say, the "Pointer Arithmetic
|> Container Template Library" would be less controversial.
|> Then we could have other libraries, such as the "Safe Array Container
|> Template Library" (the Detlefs/Ellis proposal), on equal terms.
|> Standardizing on pointer-arithmetic based iterators is an aggressive
|> proposal, and one which represents a high-risk decision.  It's quite
|> possible that programs using this approach will have serious reliability
|> problems, and more time is needed to evaluate that.

I think that there is some misunderstanding.  The Standard Template
Library defines the interface syntax of a set of iterators.  To invoke
a function of the library with an iterator, it must be conform to this
syntax.

It makes no statement with regards to the semantics of the iterator
(which is provided by the user to instantiate the template).  There is
absolutely nothing preventing an iterator from using pointer `syntax'
and being safe.
--
James Kanze      Tel.: (+33) 88 14 49 00     email: kanze@lts.sel.alcatel.de
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung

Author: pete@genghis.interbase.borland.com (Pete Becker)
Date: Sat, 17 Sep 1994 03:35:31 GMT Raw View

In article <MATT.94Sep16102349@physics16.berkeley.edu>,
Matt Austern <matt@physics.berkeley.edu> wrote:
>In article <DAG.94Sep16081742@bellman.control.lth.se> dag@control.lth.se (Dag Bruck) writes:
>
>> STL standardizes a set of operations that iterators must support, for
>> example unary * for dereferencing, ++ and -- for stepping forward and
>> backward.
>>
>> I agree that the syntax has been borrowed from pointer arithmetic, so
>> raw pointers can be used when needed.
>
>Well, it's not just a matter of syntax; there's also an important
>symantic question here.
>
>The question is this: if you have an iterator running through some
>sort of collection, and you want to access an element that the
>iterator is referring to, do you do this through an iterator member
>function, or through a collector member function that takes the
>iterator as an argument?  That is (using the most common syntactic
>choice for each mechanism), do you write *p, or A[i]?  It's this
>distinction that I have in mind when I refer to the distinction
>between pointer and index symantics.
>
>Why does it matter?  Well, the most obvious answer is the distinction
>between const and non-const collections.  If you're using index
>symantics then you can overload the access operation on const so that
>const correctness can be checked at compile time; if you're using
>pointer symantics, though, then I don't really see any very good way
>of doing anything equivalent.  That's the reason why, whenever I
>write a collection class, I always use index symantics for its
>associated iterator.
>
>I haen't yet seen the STL proposal.  Did its authors find some clever
>way of handling the const issue?
>--
>
>                               --matt


 template <class T> class Container
 {
 public:
  typedef whatever iterator;
  typedef const whatever const_iterator;

  iterator begin();
  const_iterator begin() const;
 };

 The type 'iterator' provides an operator * that returns a T&. The
type 'const_iterator' provides an operator * that returns a const T&.
 -- Pete

Author: dag@control.lth.se (Dag Bruck)
Date: 17 Sep 1994 08:44:44 GMT Raw View

>>>>> "W" == Wayne Stewart <wstewart@sunterm10.nmo.gtegsc.com> writes:

W> For example, how could I offer
W> pre-order, in-order, post-order 'iterators' of a general tree
W> container?  Do STL pointer-style iterators even make sense in this
W> case?  How would one know by looking at the code which traversal
W> order '++' was using?

I would implement iterator types for each of the traversals, and you
would have to look at the type of the iterator to find out what '++'
means.

     -- Dag

Author: dag@control.lth.se (Dag Bruck)
Date: 17 Sep 1994 08:51:17 GMT Raw View

>>>>> "J" == John Nagle <nagle@netcom.com> writes:

J>       A few months back, members of the committee were saying that
J> the draft C++ standard was essentially frozen in terms of new
J> features.  Then, out of the blue, this wierd STL library appears.
J> What happened?

(1)  STL has been discussed in the committee for about a year now.

(2)  If I remember correctly, members of the committee have for while
said the the C++ language is essentially frozen.  The standard C++
library is less so, as demonstrated by the incorporation of STL.

(3)  I think you will find STL less weird if you studied it closer.


      -- Dag Bruck

Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Sat, 17 Sep 1994 08:53:33 GMT Raw View

matt@physics16.berkeley.edu (Matt Austern) writes:

[Regarding STL using pointer syntax:]

>Well, it's not just a matter of syntax; there's also an important
>symantic question here.
[...]
>Why does it matter?  Well, the most obvious answer is the distinction
>between const and non-const collections.  If you're using index
>symantics then you can overload the access operation on const so that
>const correctness can be checked at compile time; if you're using
>pointer symantics, though, then I don't really see any very good way
>of doing anything equivalent.  That's the reason why, whenever I
>write a collection class, I always use index symantics for its
>associated iterator.
>
>I haen't yet seen the STL proposal.  Did its authors find some clever
>way of handling the const issue?

In C there are two different sorts of pointers, e.g.  `char *' and
`const char *'; similarly in STL there are iterators and
const_iterators.

--
Fergus Henderson - fjh@munta.cs.mu.oz.au

Author: ua302aa@sun2.lrz-muenchen.de (Kurt Watzka)
Date: 17 Sep 1994 09:51:41 GMT Raw View

wstewart@sunterm10.nmo.gtegsc.com (Wayne Stewart) writes:

>Dag Bruck (dag@control.lth.se) wrote:

>: STL standardizes a set of operations that iterators must support, for
>: example unary * for dereferencing, ++ and -- for stepping forward and
>: backward.

>I'm curious about how someone would use this concept to traverse or iterate
>thru 'non-linear' data structures such as trees or graphs, where there may
>be several options (depending on context) for 'next' or 'previous' element.
>For example, how could I offer pre-order, in-order, post-order 'iterators'
>of a general tree container?

By using explicit stacks in the iterators. This works quite well for
in-order and pre-order iterators. For post-order iterators it will
(more or less) transform the tree into a list for iteration purposes.

>Do STL pointer-style iterators even make
>sense in this case?  How would one know by looking at the code which
>traversal order '++' was using?

Author: matt@physics16.berkeley.edu (Matt Austern)
Date: 16 Sep 1994 17:23:49 GMT Raw View

In article <DAG.94Sep16081742@bellman.control.lth.se> dag@control.lth.se (Dag Bruck) writes:

> STL standardizes a set of operations that iterators must support, for
> example unary * for dereferencing, ++ and -- for stepping forward and
> backward.
>
> I agree that the syntax has been borrowed from pointer arithmetic, so
> raw pointers can be used when needed.

Well, it's not just a matter of syntax; there's also an important
symantic question here.

The question is this: if you have an iterator running through some
sort of collection, and you want to access an element that the
iterator is referring to, do you do this through an iterator member
function, or through a collector member function that takes the
iterator as an argument?  That is (using the most common syntactic
choice for each mechanism), do you write *p, or A[i]?  It's this
distinction that I have in mind when I refer to the distinction
between pointer and index symantics.

Why does it matter?  Well, the most obvious answer is the distinction
between const and non-const collections.  If you're using index
symantics then you can overload the access operation on const so that
const correctness can be checked at compile time; if you're using
pointer symantics, though, then I don't really see any very good way
of doing anything equivalent.  That's the reason why, whenever I
write a collection class, I always use index symantics for its
associated iterator.

I haen't yet seen the STL proposal.  Did its authors find some clever
way of handling the const issue?
--

                               --matt

Author: wstewart@sunterm10.nmo.gtegsc.com (Wayne Stewart)
Date: 16 Sep 1994 16:40:32 GMT Raw View

Dag Bruck (dag@control.lth.se) wrote:

: STL standardizes a set of operations that iterators must support, for
: example unary * for dereferencing, ++ and -- for stepping forward and
: backward.

I'm curious about how someone would use this concept to traverse or iterate
thru 'non-linear' data structures such as trees or graphs, where there may
be several options (depending on context) for 'next' or 'previous' element.
For example, how could I offer pre-order, in-order, post-order 'iterators'
of a general tree container?  Do STL pointer-style iterators even make
sense in this case?  How would one know by looking at the code which
traversal order '++' was using?

(Please excuse if you've seen this question before - our outgoing postings
have been unreliable lately.)

Thanks,
    Wayne

Author: doug@monet.ads.com (Doug Morgan)
Date: 16 Sep 94 16:35:43 Raw View

In article <TMB.94Sep16193709@arolla.idiap.ch> tmb@arolla.idiap.ch (Thomas M. Breuel) writes:
|...
|Let me make clear again: my gripe with standardizing on STL is not
|that it uses notation like pointer arithmetic.  I also appreciate the
|design of STL and think that it may be a very useful library for some
|applications.  I think it just doesn't address the most pressing needs
|in terms of a set of standard collection types.

After having strongly endorsed STL early in this thread, let me first
agree completely with the last statment.  STL definitely does not
define enough standard collections.  However, it is the first library
I have seen that defines what it does define without introducing flat
out obnoxious constraints that unduly restrict what new collections
you can define and implement to be "compatible" with the existing
library code.  Some libraries either give you classes with no virtual
functions and no room for expansion (e.g., AT&T, COOL) or they give
you virtual functions which run so slowly, you alway get hackers
by-passing the library for efficiency.  STL takes a route that: can
give you the highest efficiency, is wide open for arbitrary use of
inheritance, reuses lines of code (and abstract interface definitions)
like mad.

|...
|Looking at other languages, and from my experience, I think this
|is what a standard template library should support:
|
|    -- stacks with exponential doubling and integer subscripting
|    -- 1D through 4D arrays
|    -- hash tables with a simple forward iterator
|    -- an option to disable range checking
|    -- reference counting

Definitely.

|...
|I can't compile HP's STL implementation (it only compiles with a few
|compilers), and the documentation that comes with it is a bit sparse.
|However, from an (admittedly) brief glance at the code, I'm also not
|so sure whether STL is a really practical library:
|
|    -- Stacks are a "secondary" datatype in STL, implemented via an
|       adaptor; however, stacks with exponential doubling
|       are a very useful datatype for C++, often (though not
|       always) more appropriate than lists.

I think you get this behavior with stack<vector<T> >.  In any case, at
some level of detail, no library will give everyone the exact class
they want.  However, with STL, even if stack<vector<T> > did not
exist, you could write one and it would work with global template
function that expect a stack.

|    -- STL doesn't seem to provide hash tables at all; instead,
|       it provides red-black trees.  Now, first of all,
|       I think hash tables are crucially important (awk
|       and perl get by essentially with just hash tables),
|       much more so than ordered collections.  Second,
|       even for ordered collections, red-black trees are
|       a poor choice.

I also think this is a serious (almost fatal) oversight --- hash
tables are well within what should be available in a standard.  I have
suggested that set should be renamed tree_set, placed under a
collection interface definition for ordered sets which is itself under
a set interface.  hash_set (and its cousins) would fall under the set
interface.  Similarly for map, so that hash_table would fall under
(unordered, non-tree) map.  I don't know the status of this
suggestion.  Although, by my argument above I could say that
hash_table could be added by the user, hash tables are just too common
to leave out (just as you say).

|It seems like a lot of thought and care has gone into the abstractions
|in STL.  That's good, and it would be nice to have such a library
|available optionally.  But, again, I don't think it makes a good
|standard for a library of template classes in C++.

You might be right, but STL seems so far beyond the norm of C++
language developments that I'm inclined to jump at it (partly for fear
of what will otherwise come, and when).

|       Thomas.

Doug
--------------------------------------------------------------------
Doug Morgan, doug@ads.com, (415) 960-7444
Advanced Decision Systems (a division of Booz-Allen & Hamilton Inc.)
1500 Plymouth St., Mountain View, CA 94043-1230
--------------------------------------------------------------------

Author: ark@alice.att.com (Andrew Koenig)
Date: Sat, 17 Sep 1994 15:33:12 GMT Raw View

In article <nagleCw9G5t.H9C@netcom.com> nagle@netcom.com (John Nagle) writes:

>       That seems unusual procedure.  ANSI normally acts to codify existing
> practice, not to define new technology.

>       A few months back, members of the committee were saying that the
> draft C++ standard was essentially frozen in terms of new features.
> Then, out of the blue, this wierd STL library appears.  What happened?

Out of the blue?  The proposal was first made, informally, at the
November 1993 meeting.  It was revised and presented formally at the
March 1994 meeting, then revised again, presented, and approved
at the July 1994 meeting.

The library itself is hardly new.  At the time of its proposal, it
had already been implemented and used for some time.  Moreover, it
was based on an earlier library by the same author that has been
commercially available since 1990 or so.
--
    --Andrew Koenig
      ark@research.att.com

Author: ark@alice.att.com (Andrew Koenig)
Date: Sat, 17 Sep 1994 15:38:43 GMT Raw View

In article <DAG.94Sep17105117@bellman.control.lth.se> dag@control.lth.se (Dag Bruck) writes:

> (2)  If I remember correctly, members of the committee have for while
> said the the C++ language is essentially frozen.  The standard C++
> library is less so, as demonstrated by the incorporation of STL.

In the July 1993 meeting in Munich, the committee decided that
the November 1993 meeting would be the last one at which substantive
technical changes could be proposed.  STL was proposed, albeit
informally, at the November 1993 meeting.  Because of the `albeit
informally' part, most of the debate in the committee centered
not on its technical merits but on whether it was too late to adopt
it, regardless of its merits.

What finally swung the issue was that several heads of national
delegations said that they would not approve a standard without
at least some kind of data structure library.  Since there was no
alternative to STL on the table, that essentially made STL mandatory.
--
    --Andrew Koenig
      ark@research.att.com

Author: nagle@netcom.com (John Nagle)
Date: Sat, 17 Sep 1994 06:17:53 GMT Raw View

rfg@netcom.com (Ronald F. Guilmette) writes:
>For the benefit of those of you who are not aware of it, these arguments
>about whether or not STL should be in or out were all essentially rendered
>moot by the committee's decision back in July to incorporate STL into the
>draft standard.

      That seems unusual procedure.  ANSI normally acts to codify existing
practice, not to define new technology.

      A few months back, members of the committee were saying that the
draft C++ standard was essentially frozen in terms of new features.
Then, out of the blue, this wierd STL library appears.  What happened?

      John Nagle

Author: mbk@inls1.ucsd.edu (Matt Kennel)
Date: 18 Sep 1994 00:56:52 GMT Raw View

Wayne Stewart (wstewart@sunterm10.nmo.gtegsc.com) wrote:
: Dag Bruck (dag@control.lth.se) wrote:

: : STL standardizes a set of operations that iterators must support, for
: : example unary * for dereferencing, ++ and -- for stepping forward and
: : backward.

: I'm curious about how someone would use this concept to traverse or iterate
: thru 'non-linear' data structures such as trees or graphs, where there may
: be several options (depending on context) for 'next' or 'previous' element.
: For example, how could I offer pre-order, in-order, post-order 'iterators'
: of a general tree container?  Do STL pointer-style iterators even make
: sense in this case?  How would one know by looking at the code which
: traversal order '++' was using?

My opinion is one would want to have multiple kinds of iterators, presumably
with different names, none of which is "++" or "--". E.g. routines like

in_order_next()
post_order_previous().

: Thanks,
:     Wayne

--
-Matt Kennel    mbk@inls1.ucsd.edu
-Institute for Nonlinear Science, University of California, San Diego
-*** AD: Archive for nonlinear dynamics papers & programs: FTP to
-***     lyapunov.ucsd.edu, username "anonymous".

Author: r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge)
Date: Sun, 18 Sep 1994 02:23:58 GMT Raw View

nagle@netcom.com (John Nagle) writes:
>That's scary.  STL is a clever implementation of a radical concept.

tmb@arolla.idiap.ch (Thomas M. Breuel) writes:
>Well, some general observations about STL.  Designing something like
>STL is great fun, and it looks really neat.  However, I think in
>practice, it's probably not a good choice to standardize on as part of
>the ANSI C++ standard.

nagle@netcom.com (John Nagle) writes:
>Agreed.  Standardizing on something like STL is premature...

Ronald F. Guilmette <rfg@netcom.com> wrote:
>For the benefit of those of you who are not aware of it, these arguments
>about whether or not STL should be in or out were all essentially rendered
>moot by the committee's decision back in July to incorporate STL into the
>draft standard.

This is what John Nagle found scary.  These one sided "arguments" were
about whether or not the committee's decision was a wise one.

I think the STL looks solid, and I think C++ needs a standard component
library like this but unfortunately it's untested.  I think it was big
gamble for the committee to incorporate it into the draft standard.

       Ross Ridge

--
 l/    Ross Ridge      //
[oo]   The Great HTMU, CSC President   [oo]
-()-        http://csclub.uwaterloo.ca/u/r-ridge/  /()/
 db           +1 519 883 4329                             //

Author: r-ridge@calum.csclub.uwaterloo.ca (Ross Ridge)
Date: Sun, 18 Sep 1994 02:53:31 GMT Raw View

Thomas M. Breuel <tmb@idiap.ch> wrote:
>-- STL doesn't seem to provide hash tables at all; instead, it
>provides red-black trees.  Now, first of all, I think hash tables are
>crucially important (awk and perl get by essentially with just hash
>tables), much more so than ordered collections.

AWK and Perl(*) get by with just hash tables because they only hash one
type, strings.  Considering the fact that anything you can hash you can
order, I don't think STL is missing anything crucial.  Put another way,
ordered collections are more general purpose than hash tables.

I also think hash tables are outside the scope of STL.  They are hard
to iterate over and don't provide any guarantees about performance.

>Second, even for ordered collections, red-black trees are a poor
>choice.

I don't think there is anything STL that requires that the
implementation use red-black trees.  My own prototype implementation
uses splay trees.

       Ross Ridge

(*) I actually know nothing about Perl.
--
 l/    Ross Ridge      //
[oo]   The Great HTMU, CSC President   [oo]
-()-        http://csclub.uwaterloo.ca/u/r-ridge/  /()/
 db           +1 519 883 4329                             //

Author: nagle@netcom.com (John Nagle)
Date: Sun, 18 Sep 1994 17:37:54 GMT Raw View

dag@control.lth.se (Dag Bruck) writes:
>>>>>> "J" == John Nagle <nagle@netcom.com> writes:
>J>       A few months back, members of the committee were saying that
>J> the draft C++ standard was essentially frozen in terms of new
>J> features.  Then, out of the blue, this wierd STL library appears.
>J> What happened?

>(1)  STL has been discussed in the committee for about a year now.
>(2)  If I remember correctly, members of the committee have for while
>said the the C++ language is essentially frozen.  The standard C++
>library is less so, as demonstrated by the incorporation of STL.
>(3)  I think you will find STL less weird if you studied it closer.

      Actually, the more I look at it, the wierder it looks.

      One of the basic problems with pointers in C is that they are used
both to refer to arrays and to elements of arrays.  STL iterators
generalize this ambiguity to collections other than arrays.  This
creates some serious problems.  Does an iterator point to an object in
a container or to a slot in which an object goes?  If an iterator points
to an object in a container, and you replace that object with another,
does the iterator now refer to the old object or the new one?  Is
that behavior the same for all containers?

      I suspect that the behavior will differ depending on whether the
container is implemented as an array or as a linked structure.  So
algorithms can break if the underlying container type is changed.

      This is where the generalization of the pointer metaphor breaks down.

      John Nagle

Author: pete@genghis.interbase.borland.com (Pete Becker)
Date: Sun, 18 Sep 1994 17:59:18 GMT Raw View

In article <35g38k$mth@network.ucsd.edu>,
Matt Kennel <mbk@inls1.ucsd.edu> wrote:
>Wayne Stewart (wstewart@sunterm10.nmo.gtegsc.com) wrote:
>: Dag Bruck (dag@control.lth.se) wrote:
>
>: : STL standardizes a set of operations that iterators must support, for
>: : example unary * for dereferencing, ++ and -- for stepping forward and
>: : backward.
>
>: I'm curious about how someone would use this concept to traverse or iterate
>: thru 'non-linear' data structures such as trees or graphs, where there may
>: be several options (depending on context) for 'next' or 'previous' element.
>: For example, how could I offer pre-order, in-order, post-order 'iterators'
>: of a general tree container?  Do STL pointer-style iterators even make
>: sense in this case?  How would one know by looking at the code which
>: traversal order '++' was using?
>
>My opinion is one would want to have multiple kinds of iterators, presumably
>with different names, none of which is "++" or "--". E.g. routines like
>
>in_order_next()
>post_order_previous().
>

 This misses the point of STL. An iterator is an object, not an
operation. You ask the container for an iterator and you use the iterator to
get at the elements. If there are several different sensible ways to iterate
through a container, with externally visible differences (like in-order vs.
post-order), the container provides multiple interators, one for each
iteration technique.
 -- Pete

Author: pete@genghis.interbase.borland.com (Pete Becker)
Date: Sun, 18 Sep 1994 18:14:11 GMT Raw View

In article <nagleCwC6B6.ICE@netcom.com>, John Nagle <nagle@netcom.com> wrote:
>dag@control.lth.se (Dag Bruck) writes:
>>(3)  I think you will find STL less weird if you studied it closer.
>
>      Actually, the more I look at it, the wierder it looks.
>
>      One of the basic problems with pointers in C is that they are used
>both to refer to arrays and to elements of arrays.  STL iterators
>generalize this ambiguity to collections other than arrays.  This
>creates some serious problems.  Does an iterator point to an object in
>a container or to a slot in which an object goes?  If an iterator points
>to an object in a container, and you replace that object with another,
>does the iterator now refer to the old object or the new one?  Is
>that behavior the same for all containers?
>
>      I suspect that the behavior will differ depending on whether the
>container is implemented as an array or as a linked structure.  So
>algorithms can break if the underlying container type is changed.
>
>      This is where the generalization of the pointer metaphor breaks down.
>

 As with any metaphor, you can't apply it to everything. What STL does
is define an iterator as a type that has a certain set of operations
defined on it. It says nothing about how you implement those operations and,
in particular, it says nothing about what data that type has to contain.
 It's certainly true that an iterator for a linked list and an iterator
for a char array will implement operator++() differently. That doesn't matter.
What's important is that once you've applied operator++() to an iterator,
operator*() will give you the next item in the container. That must be true,
because that is the definition of an iterator. Any correctly implemented
algorithm that relies on that behavior will work correctly on any
container that provides a valid iterator.
 -- Pete

Author: kohtala@hardy.trs.ntc.nokia.com (Kohtala Marko)
Date: 14 Sep 1994 19:01:21 GMT Raw View

In article <TMB.94Sep14130425@arolla.idiap.ch> tmb@arolla.idiap.ch (Thomas M. Breuel) writes:

> On
> the other hand, if there is going to be a high-quality, free, portable
> implementation that every vendor can use, I don't see the point of
> mandating STL in the standard.

HP seems to be giving their implementation for free. They just want to
see their copyright notice stamped on the documentation. The README
says it compiles with Borland 4.0, Lucid 3.2, EDG and couple of IBM
compiler. In time, I bet there will be more compilers supported.

Also there is a GNU version in development.

I think we will have a choise of good implementations.

--
---
Marko Kohtala - Marko.Kohtala@ntc.nokia.com, Marko.Kohtala@hut.fi

Author: tmb@arolla.idiap.ch (Thomas M. Breuel)
Date: 14 Sep 1994 11:04:23 GMT Raw View

|In article <nagleCvvFu2.Gw7@netcom.com> nagle@netcom.com (John Nagle) writes:
|
|>       That's scary.  STL is a clever implementation of a radical concept.
|
|Iterators are hardly radical.  Iterators with interfaces that look
|like pointers are downright conservative.  Why the fear?

Well, some general observations about STL.  Designing something like
STL is great fun, and it looks really neat.  However, I think in
practice, it's probably not a good choice to standardize on as part of
the ANSI C++ standard.

In other languages, I have observed that complex "standard" libraries
like STL have a habit of not being implemented very well by vendors.
I expect that the less-used features in STL are going to be buggy
and/or implemented inefficiently.  Any serious and experienced user
will likely end up having to reimplement that functionality anyway.
If vendors are each going to implement their own implementations of
STL, I see little reason why it should be any different for STL.  On
the other hand, if there is going to be a high-quality, free, portable
implementation that every vendor can use, I don't see the point of
mandating STL in the standard.

I suspect that standardizing as part of ANSI C++ a minimal library of
concrete standard collection classes (nD-array, stack, hashtable)
would be best for the C++ community.  As for iterators, arrays and
stacks can be accessed by integer indexes, and hashtables by a simple
forward iterator.

Such an approach would set a sufficient standard for exchanging
collections of objects (as arguments and return values) among
different libraries, and it would result in simple, high-quality
implementations that could even be enhanced by specific compiler
support (e.g., the compiler might know about range checking for
arrays).  More ambitious libraries could be left to free and
commercial implementations outside the standard.

But it's probably too late now given that STL has already been
accepted.  Well, we'll have to see how things will work out...

    Thomas.

Author: nagle@netcom.com (John Nagle)
Date: Thu, 15 Sep 1994 17:36:14 GMT Raw View

tmb@arolla.idiap.ch (Thomas M. Breuel) writes:
>|In article <nagleCvvFu2.Gw7@netcom.com> nagle@netcom.com (John Nagle) writes:
>|
>|>       That's scary.  STL is a clever implementation of a radical concept.
>|
>Well, some general observations about STL.  Designing something like
>STL is great fun, and it looks really neat.  However, I think in
>practice, it's probably not a good choice to standardize on as part of
>the ANSI C++ standard.

       Agreed.  Standardizing on something like STL is premature.  The
committee should make an implementation available and let the community
of C++ users have a year or so of experience before nailing it into the
standard.

       Some of the problem could be alleviated if it weren't called the
"Standard Template Library".  Calling it, say, the "Pointer Arithmetic
Container Template Library" would be less controversial.
Then we could have other libraries, such as the "Safe Array Container
Template Library" (the Detlefs/Ellis proposal), on equal terms.
Standardizing on pointer-arithmetic based iterators is an aggressive
proposal, and one which represents a high-risk decision.  It's quite
possible that programs using this approach will have serious reliability
problems, and more time is needed to evaluate that.

       Has the Taligent architecture team commented on STL?  They publish
some good papers on C++ program architecture.

     John Nagle

Author: dag@control.lth.se (Dag Bruck)
Date: 16 Sep 1994 06:17:41 GMT Raw View

>>>>> "JN" == John Nagle <nagle@netcom.com> writes:

JN> Standardizing on pointer-arithmetic
JN> based iterators is an aggressive proposal, and one which
JN> represents a high-risk decision.

STL standardizes a set of operations that iterators must support, for
example unary * for dereferencing, ++ and -- for stepping forward and
backward.

I agree that the syntax has been borrowed from pointer arithmetic, so
raw pointers can be used when needed.

However, there is nothing in STL that says an iterator must use
pointer arithmetic, or that any operation must be unchecked.  If you
think safe iterators are appropriate (and I would agree), use them.


      -- Dag Bruck