Thread

Topic: No lower_bound or upper_bound for hash containers

Author: David Abrahams <dave@boost-consulting.com>
Date: Sat, 4 Nov 2006 20:29:06 CST Raw View

"Greg Herlihy" <greghe@pacbell.net> writes:

> Moreover the mere existence of lower_bound() and upper_bound() methods
> for unordered container class would likely prove to be very misleading.
> The client is apt to expect performance comparable to the ordered
> container versions of the same routines - but in reality the
> performance will be orders of magnitude worse.

I agree that it would be misleading, but not because of performance.
The problem is that in i = c.lower_bound(x), i isn't just an iterator
to the first element equal to x.  It also induces a partition on c so
that for all positions p that precede i in c, *p < x, and for all other
positions q in c, !(x < *q).  You can't provide those invariants with
any lower_bound that can be implemented for unordered containers as
specified in the TR (**).

(**) The Dinkum-style unordered container is an exception, but getting
lower_bound in that design imposes more requirements on users, namely
that they supply an ordering relation, and IIRC there are also larger
internal costs (an extra back-link per node, or something).

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: usenet@aristeia.com (Scott Meyers)
Date: Sun, 5 Nov 2006 06:26:04 GMT Raw View

David Abrahams wrote:
> The problem is that in i = c.lower_bound(x), i isn't just an iterator
> to the first element equal to x.  It also induces a partition on c so
> that for all positions p that precede i in c, *p < x, and for all other
> positions q in c, !(x < *q).

Nonsense.  This is how it's specified for ordered containers, but these
are UNordered containers.  All we know about the elements that precede
lower_bound(val) in a begin()-to-end() traversal is that they are not
equivalent to val.  I don't understand why so many people seem to want
to assume that what's true of an ordered container must also be true of
an unordered one.  The elements in an unordered container are NOT
ORDERED BY VALUE, except that all elements with equivalent values must
be adjacent to one another, because without that constraint, you can't
implement equal_range.  Implementing that constraint imposes an
insertion overhead for multi-containers, because they can't just dump a
new element at the front of the bucket.  Instead, they must search the
bucket for the first equivalent value (or the end of the bucket).

> (**) The Dinkum-style unordered container is an exception, but getting
> lower_bound in that design imposes more requirements on users, namely
> that they supply an ordering relation, and IIRC there are also larger
> internal costs (an extra back-link per node, or something).

It's just as easy to implement without the back-link.  You hash to the
correct bucket and search for what you're looking for.  If you find it,
you return an iterator to the first one you find.  If you reach the end
of the bucket and don't find it, you return end().  There's no ordering
relation, no required back-link, nothing.

Scott

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "P.J. Plauger" <pjp@dinkumware.com>
Date: Sun, 5 Nov 2006 14:18:17 CST Raw View

"Scott Meyers" <usenet@aristeia.com> wrote in message
news:12kqlmfqukpt2fe@corp.supernews.com...

> David Abrahams wrote:
>> The problem is that in i = c.lower_bound(x), i isn't just an iterator
>> to the first element equal to x.  It also induces a partition on c so
>> that for all positions p that precede i in c, *p < x, and for all other
>> positions q in c, !(x < *q).
>
> Nonsense.  This is how it's specified for ordered containers, but these
> are UNordered containers.  All we know about the elements that precede
> lower_bound(val) in a begin()-to-end() traversal is that they are not
> equivalent to val.  I don't understand why so many people seem to want to
> assume that what's true of an ordered container must also be true of an
> unordered one.  The elements in an unordered container are NOT ORDERED BY
> VALUE, except that all elements with equivalent values must be adjacent to
> one another, because without that constraint, you can't implement
> equal_range.  Implementing that constraint imposes an insertion overhead
> for multi-containers, because they can't just dump a new element at the
> front of the bucket.  Instead, they must search the bucket for the first
> equivalent value (or the end of the bucket).

Right.

>> (**) The Dinkum-style unordered container is an exception, but getting
>> lower_bound in that design imposes more requirements on users, namely
>> that they supply an ordering relation, and IIRC there are also larger
>> internal costs (an extra back-link per node, or something).

You don't need to supply an ordering relation to make lower_bound
and upper_bound work right. We also accept operator== (or its moral
equivalent). We permit operator< and operator> to speed up searches a
bit, particularly in heavily loaded or unbalanced containers, but
they're not required.

> It's just as easy to implement without the back-link.  You hash to the
> correct bucket and search for what you're looking for.  If you find it,
> you return an iterator to the first one you find.  If you reach the end of
> the bucket and don't find it, you return end().  There's no ordering
> relation, no required back-link, nothing.

Right again. We provide back links because:

1) that makes it easier use hash_map or unordered_map as a drop-in
replacement for map (because it supports bidirectional iterators

2) it improves performance by making it easier to rearrange nodes
during a rehash, and by supporting searches from either end of
the bucket for things like equal_range

But as Scott Meyers correctly observes, neither of these features
is required to implement lower_bound and upper_bound on hash
tables, for a suitably weakened notion of what they mean.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "James Kanze" <james.kanze@gmail.com>
Date: Sun, 5 Nov 2006 23:02:23 CST Raw View

Scott Meyers wrote:
> David Abrahams wrote:
> > The problem is that in i = c.lower_bound(x), i isn't just an iterator
> > to the first element equal to x.  It also induces a partition on c so
> > that for all positions p that precede i in c, *p < x, and for all other
> > positions q in c, !(x < *q).

> Nonsense.  This is how it's specified for ordered containers, but these
> are UNordered containers.

But that's not the point.  Obviously, you can specify anything,
any way you wish.  But the semantics that David described are
what a normal English speaker (at least one even vaguely
familiar with mathematical English) would assume for a function
named "lower_bounds".  Defining it as something else for some
sort of containers is counter-intuitive.  And just because we
already have some examples (e.g. std::remove) in the standard
doesn't mean that we should continue this policy.

> All we know about the elements that precede
> lower_bound(val) in a begin()-to-end() traversal is that they are not
> equivalent to val.

So why the name "lower_bound"?

> I don't understand why so many people seem to want
> to assume that what's true of an ordered container must also be true of
> an unordered one.

Because we're not thinking in terms of type of container, but in
terms of what the function name says.

> The elements in an unordered container are NOT
> ORDERED BY VALUE, except that all elements with equivalent values must
> be adjacent to one another, because without that constraint, you can't
> implement equal_range.  Implementing that constraint imposes an
> insertion overhead for multi-containers, because they can't just dump a
> new element at the front of the bucket.  Instead, they must search the
> bucket for the first equivalent value (or the end of the bucket).

They have to do that anyway, since they have to know whether the
element already exists or not.

--
James Kanze (Gabi Software)            email: james.kanze@gmail.com
Conseils en informatique orient   e objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place S   mard, 78210 St.-Cyr-l'   cole, France, +33 (0)1 30 23 00 34


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "Greg Herlihy" <greghe@pacbell.net>
Date: Sun, 5 Nov 2006 23:46:50 CST Raw View

P.J. Plauger wrote:
> But as Scott Meyers correctly observes, neither of these features
> is required to implement lower_bound and upper_bound on hash
> tables, for a suitably weakened notion of what they mean.

So to answer the original question: the reason why lower_bound() and
upper_bound() are not members of tr1's unordered container class
interface, is due to the fact that both of these methods would have to
be so watered-down, that neither method would be able provide any
information to the program that the program could not have obtained
(and obtained just as efficiently) had it called an existing member of
the interface.

After all, there is little reason why an unordered container needs a
lower_bound() method, when a call to find() already returns the same
result just as quickly. And there is even less of a case to be made for
upper_bound(): because - when compared with a call to equal_range() - a
call to upper_bound() takes just as much time to return only half as
much information.

Greg

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: dave@boost-consulting.com (David Abrahams)
Date: Mon, 6 Nov 2006 17:22:35 GMT Raw View

usenet@aristeia.com (Scott Meyers) writes:

> David Abrahams wrote:
>> The problem is that in i = c.lower_bound(x), i isn't just an iterator
>> to the first element equal to x.  It also induces a partition on c so
>> that for all positions p that precede i in c, *p < x, and for all other
>> positions q in c, !(x < *q).
>
> Nonsense.

Excuse me?

> This is how it's specified for ordered containers, but these are
> UNordered containers.

Yes.  So you might change your expectations for what lower_bound means
in the context of unordered containers.  However, it is reasonable to
assume that some of what people expect will be based on existing
contexts.  Everywhere else in the standard library, lower_bound means
exactly what I said it means.  Whether such a shift in meaning in this
new context is desirable is debatable.  I don't happen to have a
strong opinion at the moment, although I do see why it makes sense,
especially if there is an equal_range that has also made such a
contextual leap.

>> (**) The Dinkum-style unordered container is an exception, but getting
>> lower_bound in that design imposes more requirements on users, namely
>> that they supply an ordering relation, and IIRC there are also larger
>> internal costs (an extra back-link per node, or something).
>
> It's just as easy to implement without the back-link.

Not with the usual std::lower_bound semantics.  If you want to talk
about a semantic shift for unordered containers, that's a different
matter.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: Scott Meyers <usenet@aristeia.com>
Date: Mon, 6 Nov 2006 11:28:41 CST Raw View

James Kanze wrote:
> But that's not the point.  Obviously, you can specify anything,
> any way you wish.  But the semantics that David described are
> what a normal English speaker (at least one even vaguely
> familiar with mathematical English) would assume for a function
> named "lower_bounds".

Maybe.  To me, lower_bound has always meant "the first of the range of
equivalent values in a begin()-to-end() traversal."  If there's another
meaning for it, it's news to me.  I'd never heard of it before I
encountered the STL.

>> The elements in an unordered container are NOT
>> ORDERED BY VALUE, except that all elements with equivalent values must
>> be adjacent to one another, because without that constraint, you can't
>> implement equal_range.  Implementing that constraint imposes an
>> insertion overhead for multi-containers, because they can't just dump a
>> new element at the front of the bucket.  Instead, they must search the
>> bucket for the first equivalent value (or the end of the bucket).
>
> They have to do that anyway, since they have to know whether the
> element already exists or not.

Not in a multi-container.

Scott

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: dave@boost-consulting.com (David Abrahams)
Date: Mon, 6 Nov 2006 19:10:20 GMT Raw View

Scott Meyers <usenet@aristeia.com> writes:

> James Kanze wrote:
>> But that's not the point.  Obviously, you can specify anything,
>> any way you wish.  But the semantics that David described are
>> what a normal English speaker (at least one even vaguely
>> familiar with mathematical English) would assume for a function
>> named "lower_bounds".
>
> Maybe.  To me, lower_bound has always meant "the first of the range of
> equivalent values in a begin()-to-end() traversal."  If there's
> another meaning for it, it's news to me.  I'd never heard of it before
> I encountered the STL.

It means more than that, even in the STL, because it has meaning even
when there is no equivalent value to the one being used as a key.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: usenet@aristeia.com (Scott Meyers)
Date: Mon, 6 Nov 2006 20:50:15 GMT Raw View

David Abrahams wrote:
> It means more than that, even in the STL, because it has meaning even
> when there is no equivalent value to the one being used as a key.

Yes, but the question is whether that is inherent or accidental.  To be
more complete, my understanding of lower_bound has always been "the
beginning of the range of equivalent values, or, if such a range is
empty, the insertion location where such a value would go if it were to
be inserted."  If you happen to know that the range you are searching is
  ordered, then you can draw additional conclusions, but, to me, that
has always been accidental, not inherent, perhaps because I learned that
Dinkumware's implementation of hash_set et al supported lower_bound et
al at about the same time I learned of the existence of these functions.
    Having never heard of lower_bound et al before encountering the STL
and knowing that DW supported them on hashed containers, it never
occurred to me that some people might view ordering as an inherent
aspect of their semantics.

Perhaps Stepanov meant lower_bound and upper_bound to have meanings
inherently tied to ordering.  I don't know.  I do know that it's very
strange to have equal_range be defined for the standard associative
containers as returning the pair consisting of lower_bound and
upper_bound, to offer equal_range on unordered containers, and then to
argue that lower_bound and upper_bound make no sense for unordered
containers.  It's consistent with C++'s long history of making things as
irregular and complicated as possible, but still.

Scott

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: Alec Ross <alec@arlross.demon.co.uk>
Date: Mon, 6 Nov 2006 16:06:45 CST Raw View

In message <12kv6pb354uh7ad@corp.supernews.com>, Scott Meyers
<usenet@aristeia.com> writes
>David Abrahams wrote:
>> It means more than that, even in the STL, because it has meaning even
>> when there is no equivalent value to the one being used as a key.
>
>Yes, but the question is whether that is inherent or accidental.  To be
>more complete, my understanding of lower_bound has always been "the
>beginning of the range of equivalent values, or, if such a range is
>empty, the insertion location where such a value would go if it were to
>be inserted."
.
>Perhaps Stepanov meant lower_bound and upper_bound to have meanings
>inherently tied to ordering.  I don't know.
Nor do I.  But FWIW I first came across the term in math, see e.g.

http://en.wikipedia.org/wiki/Lower_bound

(redirected ;))

.
--
Alec Ross

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: Scott Meyers <usenet@aristeia.com>
Date: Mon, 6 Nov 2006 16:35:38 CST Raw View

Alec Ross wrote:
> Nor do I.  But FWIW I first came across the term in math, see e.g.
>
> http://en.wikipedia.org/wiki/Lower_bound

As I've mentioned, I'd never heard the term used in a mathematical
sense.  I will note, however, that the definition of upper_bound at that
URL does not correspond to the STL semantics of the uppper_bound
algorithm or member function.

Scott

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: dave@boost-consulting.com (David Abrahams)
Date: Tue, 7 Nov 2006 00:12:42 GMT Raw View

usenet@aristeia.com (Scott Meyers) writes:

> David Abrahams wrote:
>> It means more than that, even in the STL, because it has meaning even
>> when there is no equivalent value to the one being used as a key.
>
> Yes, but the question is whether that is inherent or accidental.

It's certainly not accidental.

     v.insert(std::lower_bound(v.begin(), v.end(), x), x)

is guaranteed to insert x in sorted order if v is sorted, and that's
an important guarantee.

> To be more complete, my understanding of lower_bound has always been
> "the beginning of the range of equivalent values, or, if such a
> range is empty, the insertion location where such a value would go
> if it were to be inserted."

Exactly.  That part after "or" is important.  It is still potentially
meaningful for "unordered" containers (which really have an order --
misnomer?), i.e. one might reasonably expect a traversal before
insertion to be the same as a traversal after insertion except that
the new element would appear just before the position indicated by
lower_bound.  However, I don't think an implementation of the TR1
containers can give that guarantee without adding overhead.

> If you happen to know that the range you are searching is ordered,
> then you can draw additional conclusions,

No, you're not even allowed to call lower_bound unless the range is
sorted, and if it isn't sorted you can't expect it to even give you a
sensible answer when the range of equivalent values wouldn't be empty.

> Perhaps Stepanov meant lower_bound and upper_bound to have meanings
> inherently tied to ordering.  I don't know.  I do know that it's very
> strange to have equal_range be defined for the standard associative
> containers as returning the pair consisting of lower_bound and
> upper_bound, to offer equal_range on unordered containers, and then to
> argue that lower_bound and upper_bound make no sense for unordered
> containers.

I agree that it is strange.

> It's consistent with C++'s long history of making things as
> irregular and complicated as possible, but still.

but still?

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: usenet@aristeia.com (Scott Meyers)
Date: Tue, 7 Nov 2006 06:44:07 GMT Raw View

David Abrahams wrote:
> No, you're not even allowed to call lower_bound unless the range is
> sorted

Technically, you're right, but you can refer to equal_range(key).first
for unordered containers, and one might be forgiven for assuming that
that's akin to referring to lower_bound(key) on that container (if
lower_bound existed), since equal_range(key).first is defined to be the
same as lower_bound(key) on the standard associative containers.

There's no right or wrong here.  To me, it's just plain bizarre that
equal_range is offered without also offering lower_bound and
upper_bound, but that's because I think of the bounds functions as
fundamentally returning the first and second components of equal_range.
  The fact that you can use them for other things on sorted ranges is,
to me, coincidental.  My understanding is that you view the bounds
functions as fundamentally making sense only for sorted ranges, and it's
a simple coincidence that they happen to correspond to the result of
calling equal_range on such containers.  From that point of view, it's
just plain bizarre that anybody would think they could be used on
unordered containers.

The standard, always eager to be helpful, defines equal_range in terms
of lower_bound and upper_bound for the standard associative containers
(that backs my notion, IMO: equal_range is DEFINED in terms of
lower_bound and upper_bound), but it defines the equal_range algorithm
without referring to the bounds functions at all (which backs your
notion, IMO:  if equal_range happens to return the same values as
lower_- and upper_bound, that's just a coincidence).

Scott

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "James Kanze" <james.kanze@gmail.com>
Date: Tue, 7 Nov 2006 10:24:20 CST Raw View

Scott Meyers wrote:

    [...]
> The standard, always eager to be helpful, defines equal_range in terms
> of lower_bound and upper_bound for the standard associative containers
> (that backs my notion, IMO: equal_range is DEFINED in terms of
> lower_bound and upper_bound), but it defines the equal_range algorithm
> without referring to the bounds functions at all (which backs your
> notion, IMO:  if equal_range happens to return the same values as
> lower_- and upper_bound, that's just a coincidence).

Nothing more than my intuition to go on, but I have always
viewed this from the opposite point of view.  The standard
defines two basic functions, lower_bound and upper_bound, based
on an ordering relationship.  It then offers equal_range as a
convenience, mathematically, because it is possible to define an
equivalence relationship from a strict ordering relationship.
And practically, because it is convenient.

Along comes a new associative container type, which is not based
on an ordering relationship, but requires an explicit
equivalence relationship.  My first take would be to expect none
of the three functions; a little bit of reflection, however,
suggests that since equal_range is based on the notion of
equivalent, it is reasonable and convenient (but its absense
wouldn't have shocked me).

In actual practice, I don't think I've ever actually used any of
these functions as members of an associative container.  I've
used the free function lower_bound a lot, on the other hand,
when inserting into an std::vector, in order to maintain it
sorted.  The result is that, like Dave, I've very sensiblized to
its ordering functionality.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orient   e objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place S   mard, 78210 St.-Cyr-l'   cole, France, +33 (0)1 30 23 00 34

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "James Kanze" <james.kanze@gmail.com>
Date: Tue, 7 Nov 2006 10:22:03 CST Raw View

David Abrahams wrote:
> usenet@aristeia.com (Scott Meyers) writes:

    [...]
> > Perhaps Stepanov meant lower_bound and upper_bound to have meanings
> > inherently tied to ordering.  I don't know.  I do know that it's very
> > strange to have equal_range be defined for the standard associative
> > containers as returning the pair consisting of lower_bound and
> > upper_bound, to offer equal_range on unordered containers, and then to
> > argue that lower_bound and upper_bound make no sense for unordered
> > containers.

> I agree that it is strange.

Yes and no.  Intuitively, to me at least, equal_range suggests a
relationship of equality, and upper and lower_bound one of
order.  (I know that the STL defined equal_range in terms of
order, but I would expect that any reasonable class which
defined < also defined ==, in a compatible manner.)

Since the unordered containers do require the elements to
support and equivalence relationship, equal_range seems
appropriate.  Since they don't require an ordering relationship,
upper and lower_bound don't.

>From the point of view of the semantics of the name, of course.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orient   e objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place S   mard, 78210 St.-Cyr-l'   cole, France, +33 (0)1 30 23 00 34


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "James Kanze" <james.kanze@gmail.com>
Date: Tue, 7 Nov 2006 11:53:52 CST Raw View

Scott Meyers wrote:
> James Kanze wrote:
> > But that's not the point.  Obviously, you can specify anything,
> > any way you wish.  But the semantics that David described are
> > what a normal English speaker (at least one even vaguely
> > familiar with mathematical English) would assume for a function
> > named "lower_bounds".

> Maybe.  To me, lower_bound has always meant "the first of the range of
> equivalent values in a begin()-to-end() traversal."  If there's another
> meaning for it, it's news to me.  I'd never heard of it before I
> encountered the STL.

Really?  What little math I have studied, beyond high school
algebra, has been in French, so I might be mistaken about the
English vocabulary, but I've always understood that "lower
bound" was the English equivalent of "borne inf   rieur", and that
it was frequently used when speaking about functions, etc.
(E.g. the lower bound of sin(x) is -1, and the upper bound 1.)
Have I got something wrong in translation, or my use of English?

(Of course, this doesn't mean that the expression must have the
same meaning in C++.  Programming languages do tend to impose
very specialized meanings.  But in this case, the meaning more
or less fits as long as it is applied to ordered sequences.)

> >> The elements in an unordered container are NOT
> >> ORDERED BY VALUE, except that all elements with equivalent values must
> >> be adjacent to one another, because without that constraint, you can't
> >> implement equal_range.  Implementing that constraint imposes an
> >> insertion overhead for multi-containers, because they can't just dump a
> >> new element at the front of the bucket.  Instead, they must search the
> >> bucket for the first equivalent value (or the end of the bucket).

> > They have to do that anyway, since they have to know whether the
> > element already exists or not.

> Not in a multi-container.

I'm not sure.  Independantly of a lower or upper_bound
function, about the only way I can see of visiting all of the
elements with an equivalent value is for the iterator returned
by find() to point to one of them, and incrementing it to point
to the others.  Something like:

    Container::iterator iter = c.find( myKey ) ;
    while ( iter != c.end() && *iter == myKey ) {
        //  ...
        ++ iter ;
    }

Practically speaking, this also requires keeping the equivalent
elements together.  I don't see how you could make a usable
multi-container otherwise.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orient   e objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place S   mard, 78210 St.-Cyr-l'   cole, France, +33 (0)1 30 23 00 34

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "Greg Herlihy" <greghe@pacbell.net>
Date: Tue, 7 Nov 2006 11:54:49 CST Raw View

David Abrahams wrote:
> usenet@aristeia.com (Scott Meyers) writes:

> > Perhaps Stepanov meant lower_bound and upper_bound to have meanings
> > inherently tied to ordering.  I don't know.  I do know that it's very
> > strange to have equal_range be defined for the standard associative
> > containers as returning the pair consisting of lower_bound and
> > upper_bound, to offer equal_range on unordered containers, and then to
> > argue that lower_bound and upper_bound make no sense for unordered
> > containers.
>
> I agree that it is strange.

Both ordered and unordered containers guarantee that elements of
equivalent value will reside at consecutive positions within the
container. So it makes sense that both container classes would also
have the same method, equal_range(), to enable a client to find out the
location and extent of a group of items (if any) within the container
that are all equivalent in value to a specified key.

Now, in a sorted container, it will be true that each group of
equal-valued elements will fall between the lower and upper bound
positions for that group's key value. But this observation is true
simply because the elements in an ordered container reside in sorted
order. Furthermore, an unordered container proves that it is indeed
possible to group items of equivalent value within a container without
having to sort all of the elements as well. In other words
grouping-by-value does not necessitate a sorting-by-value.

So for a container that merely groups elements by value and does not
sort them, describing equal_range's operation in terms of an upper or
lower bound no longer is meaningful - since there may be no such thing
as lower-valued keys or higher-valued keys anywhere in existence -
either inside or outside - of that container. So if anything it is the
description of equal_range for an ordered container that is causing
some confusion.

And as a practical matter, an upper_bound() or lower_bound() for an
unordered container class would still be misleading, and for this
reason: inserting an item into an unordered container can change the
number of buckets, change the elements within each bucket, and change
the order of items therein, Since none of these consequences are at all
possible when inserting an item into an an ordered container, the mere
existence of these methods would imply a stable order of items that an
unordered container does not in fact provide.

Greg

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: usenet@aristeia.com (Scott Meyers)
Date: Tue, 7 Nov 2006 20:24:51 GMT Raw View

James Kanze wrote:
> I'm not sure.  Independantly of a lower or upper_bound
> function, about the only way I can see of visiting all of the
> elements with an equivalent value is for the iterator returned
> by find() to point to one of them, and incrementing it to point
> to the others.  Something like:
>
>     Container::iterator iter = c.find( myKey ) ;
>     while ( iter != c.end() && *iter == myKey ) {
>         //  ...
>         ++ iter ;
>     }
>
> Practically speaking, this also requires keeping the equivalent
> elements together.  I don't see how you could make a usable
> multi-container otherwise.

Just use a bucket_iterator over the bucket identified by
container.bucket(key).  Walk the bucket looking for all values
equivalent to key.  Ignore the ones that aren't equivalent.

BTW, the call to find above isn't guaranteed to return the *first*
element with a given value in a multi-container.  For that you need, um,
lower_bound or equal_range or, if you don't have them (and in the TR1
containers, you have equal_range), you can do the bucket iteration.  Or
you can just walk the entire container from begin() to end(), if you
want to take the scenic route.

Scott

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: usenet@aristeia.com (Scott Meyers)
Date: Tue, 31 Oct 2006 19:54:49 GMT Raw View

James Kanze wrote:
> And what should the behavior be?  At least in my mind, names
> like upper_bound and lower_bound suggest an ordering criteria.

I can think of a couple of possible behaviors.  Remember that
equal_range is already supported.  So if the key is found, lower_bound
could return equal_range(key).first.  For consistency with the standard
associative containers, a failed lower_bound search would return the
correct insertion point should the value be added.  Or it could do what
equal_range does and return end() if a search fails.

It seems tautological to me that if equal_range can be specified and
implemented, so can lower- and upper_bound.  So my question is simply
why only equal_range was provided.

Scott

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: usenet@aristeia.com (Scott Meyers)
Date: Tue, 31 Oct 2006 19:55:18 GMT Raw View

Greg Herlihy wrote:
> A client could implement their own unordered container lower_bound() or
> upper_bound() routines that would be every bit as efficient as any that
> the unordered container class could offer itself.

I must be overlooking something, because I don't see how.  The container
can hash the key and then search the appropriate bucket.  This is
essentially O(1), assuming a decent distribution of values.  Presumably
this is how equal_range is implemented. (The complexity guarantee
essentially says so.)  I can write code to search a bucket myself, and I
can hash a key to get a size_t, but I don't think I have a way to map
from the result of the hash function to the bucket to search, so I don't
see how I can implement this as efficiently as the container
implementer.  Or am I supposed to assume that the bucket to search is
hash(key)%container.bucket_count()?

Scott

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "P.J. Plauger" <pjp@dinkumware.com>
Date: Tue, 31 Oct 2006 15:09:49 CST Raw View

"Greg Herlihy" <greghe@pacbell.net> wrote in message
news:1162256109.292731.207040@e3g2000cwe.googlegroups.com...

> Scott Meyers wrote:
>> The unordered containers specified in TR1 include no support for the
>> member functions upper_bound and lower_bound, though they do include
>> support for equal_range.  The behavior of lower_bound and upper_bound in
>> the standard makes sense only for sorted ranges, but this could easily
>> have been changed for the unordered containers.  After all, the C++03
>> definition of equal_range is that it returns a pair containing the
>> values of calls to lower_bound and upper_bound, but this was modified
>> for the unordered containers: calls to equal_range that find nothing
>> return a pair of end iterators.
>
>> I'm not asking for anything to change, just for somebody to explain why
>> these functions were omitted.  FWIW, it seems to me that they'd be easy
>> to implement, and the only change to the spec would be that if either
>> search fails, the iterator returned is the one that would be used to
>> insert the value that is currently not present.  This isn't how C++03
>> defines the return value of these functions for failed searches, but it
>> is behaviorally identical, I believe.
>
> A client could implement their own unordered container lower_bound() or
> upper_bound() routines that would be every bit as efficient as any that
> the unordered container class could offer itself. And if there is
> nothing to be gained by implementing an external routine as a class
> method, then the routine should not be made a class method.
>
> Moreover the mere existence of lower_bound() and upper_bound() methods
> for unordered container class would likely prove to be very misleading.
> The client is apt to expect performance comparable to the ordered
> container versions of the same routines - but in reality the
> performance will be orders of magnitude worse. After all, the only way
> to find the lower or upper bound for a key in an unordered container is
> to search the entire container - which of course leads to another
> question: if the client needs to find an upper or lower bound for a
> value in a container, why on earth would they be using an unordered
> container in the first place?

We've provided lower_bound and upper_bound in our hash_map/set for years.
And since we use the same base class for these and the unordered_map/set
of TR1 they're still lurking about. They use effectively the same
algorithm as equal_range, so they are not "orders of magnitude worse".
All of these functions have to do a linear search on the bucket chosen
by the hashed key, so all decay to O(1) for a bad hash function (just
like darned near everything else in hash tables). But if the bucket
doesn't contain too much other stuff besides elements in the sequence
delimited by equal_range, all can be pretty fast.

The essential requirement for any of these functions to make sense is
that all elements with equivalent ordering in an unordered_multimap/set
be adjacent within a bucket, and hence form a proper subsequence within
the container. All that lower_bound then promises is that it designates
the first element in that subsequence, if it's non-empty. And all that
upper_bound promises is that it designates the first element after that
subsequence in the container. If they return end() when the subsequence
is empty, so much the better for tidiness.

Our common engine takes an ordering function that behaves like any of
operator<, operator>, or operator==. That matches the requirements of
unordered_map/set, but also makes it easy to use our hash_map/set or
unordered_map/set as a drop-in replacement for map/set. So the bottom
line is, Scott Meyers is right to question to absence of lower_bound and
upper_bound, even though it was a good and proper decision to omit these
from unordered_map/set.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: pjp@dinkumware.com ("P.J. Plauger")
Date: Tue, 31 Oct 2006 23:50:29 GMT Raw View

"Scott Meyers" <usenet@aristeia.com> wrote in message
news:12kf7q26o2tq2ed@corp.supernews.com...

> Greg Herlihy wrote:
>> A client could implement their own unordered container lower_bound() or
>> upper_bound() routines that would be every bit as efficient as any that
>> the unordered container class could offer itself.
>
> I must be overlooking something, because I don't see how.  The container
> can hash the key and then search the appropriate bucket.  This is
> essentially O(1), assuming a decent distribution of values.  Presumably
> this is how equal_range is implemented. (The complexity guarantee
> essentially says so.)  I can write code to search a bucket myself, and I
> can hash a key to get a size_t, but I don't think I have a way to map from
> the result of the hash function to the bucket to search, so I don't see
> how I can implement this as efficiently as the container implementer.  Or
> am I supposed to assume that the bucket to search is
> hash(key)%container.bucket_count()?

Try bucket(key). In particular, you can use begin(bucket(key)) and
end(bucket(key)) to delimit the range of elements in the bucket
selected by key.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: Scott Meyers <usenet@aristeia.com>
Date: Tue, 31 Oct 2006 22:36:47 CST Raw View

P.J. Plauger wrote:
> The essential requirement for any of these functions to make sense is
> that all elements with equivalent ordering in an unordered_multimap/set
> be adjacent within a bucket, and hence form a proper subsequence within
> the container. All that lower_bound then promises is that it designates
> the first element in that subsequence, if it's non-empty. And all that
> upper_bound promises is that it designates the first element after that
> subsequence in the container. If they return end() when the subsequence
> is empty, so much the better for tidiness.

But not for consistency with other standard associative containers,
where failed lower- and upper_bound searches return an iterator to the
element that would follow what you searched for if what you searched for
was there.  In general, this is not container.end(), and I don't see how
  the change in semantics makes things any more or less tidy.

> line is, Scott Meyers is right to question to absence of lower_bound and
> upper_bound, even though it was a good and proper decision to omit these
> from unordered_map/set.

I'm not questioning the decision, I'm trying to find out why the
decision was made.  Given your view that it was good and proper and your
doubtless participation in the discussions that took place during
development of TR1, can you lend insight into why lower_bound and
upper_bound were omitted?

Thanks,

Scott

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: Scott Meyers <usenet@aristeia.com>
Date: Tue, 31 Oct 2006 22:38:00 CST Raw View

P.J. Plauger wrote:
> The essential requirement for any of these functions to make sense is
> that all elements with equivalent ordering in an unordered_multimap/set
> be adjacent within a bucket, and hence form a proper subsequence within
> the container.

This suggests that maybe I've been asking the wrong question.  Given
that it's an *unordered* container, there is no a priori reason to
expect that all elements with equivalent keys will be next to one
another, and for the multi containers, it seems to me that satisfying
this constraint imposes a cost that would otherwise not be there.  (For
example, insertion into a multi container can't just insert the element
at the beginning of the bucket without looking at the value of any
elements that happen to already be there.)

So a better question is probably this:  why do the unordered containers
support equal_range?

Scott

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "Greg Herlihy" <greghe@pacbell.net>
Date: Tue, 31 Oct 2006 22:37:34 CST Raw View

P.J. Plauger wrote:
> "Greg Herlihy" <greghe@pacbell.net> wrote in message
> news:1162256109.292731.207040@e3g2000cwe.googlegroups.com...
>
> Our common engine takes an ordering function that behaves like any of
> operator<, operator>, or operator==. That matches the requirements of
> unordered_map/set, but also makes it easy to use our hash_map/set or
> unordered_map/set as a drop-in replacement for map/set. So the bottom
> line is, Scott Meyers is right to question to absence of lower_bound and
> upper_bound, even though it was a good and proper decision to omit these
> from unordered_map/set.

If so, it seems all the more curious that the current interfaces of
std::unordered_map and std::unordered_set in the Dinkumware C++ Library
do not offer lower_bound() and upper_bound() methods as extensions.
After all, the Standard allows an implementation to add their own
(non-virtual) methods to a Standard Library class interface. And if it
really were possible to implement these two (quite useful) routines in
a reasonably efficient manner for the unordered containers, then what
possible reason could there be for leaving them out?

As it turns out, there is a very good reason why the unordered
containers have no lower_bound() and upper_bound methods() - and why
they never will - not the Dinkumware library, and not in any other
Standard C++ library. The obstacle is not so much the egregious
performance cost of exhaustively searching the container to find the
best matching key (except on those rare occasions when the method
"lucks out" and finds a key exactly at the boundary of the requested
range or finds a key with the closest possible value to the boundary
before the exhaustive search completes). But the reality is that the
vast majority of the time, a call to lower_bound() or upper_bound() for
an unordered container would necessitate nothing less than checking
every key in the container. And that is actually the good news. The
not-so-good news - and the primary stumbling block - is that actually
implementing lower_bound() or upper_bound() for an unordered container
is likely to prove impossible.

Unlike the keys in a hash_map (which follow a strict weak ordering),
the keys in an unordered container are "equivalently ordered". In other
words, while it is possible to test whether two keys are equivalent, it
may not be possible to test whether one key is less than or greater
than another. In fact no sorted order may even exist for an unordered
container's keys. Therefore trying to find the upper bound or lower
bound for a key value without any way to to rank the keys - is a task
that is certain not only to take a long time to complete - but is in
fact a task that is certain to take forever before it is finished.

Greg

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: pjp@dinkumware.com ("P.J. Plauger")
Date: Wed, 1 Nov 2006 15:19:51 GMT Raw View

"Greg Herlihy" <greghe@pacbell.net> wrote in message
news:1162341590.392186.241190@m73g2000cwd.googlegroups.com...

> P.J. Plauger wrote:
>> "Greg Herlihy" <greghe@pacbell.net> wrote in message
>> news:1162256109.292731.207040@e3g2000cwe.googlegroups.com...
>>
>> Our common engine takes an ordering function that behaves like any of
>> operator<, operator>, or operator==. That matches the requirements of
>> unordered_map/set, but also makes it easy to use our hash_map/set or
>> unordered_map/set as a drop-in replacement for map/set. So the bottom
>> line is, Scott Meyers is right to question to absence of lower_bound and
>> upper_bound, even though it was a good and proper decision to omit these
>> from unordered_map/set.
>
> If so, it seems all the more curious that the current interfaces of
> std::unordered_map and std::unordered_set in the Dinkumware C++ Library
> do not offer lower_bound() and upper_bound() methods as extensions.

But they do. We don't flaunt them, because every time we mention a
conforming extension someone starts flaming us about vendor lockin.

> After all, the Standard allows an implementation to add their own
> (non-virtual) methods to a Standard Library class interface. And if it
> really were possible to implement these two (quite useful) routines in
> a reasonably efficient manner for the unordered containers, then what
> possible reason could there be for leaving them out?

None.

> As it turns out, there is a very good reason why the unordered
> containers have no lower_bound() and upper_bound methods() - and why
> they never will - not the Dinkumware library, and not in any other
> Standard C++ library. The obstacle is not so much the egregious
> performance cost of exhaustively searching the container to find the
> best matching key (except on those rare occasions when the method
> "lucks out" and finds a key exactly at the boundary of the requested
> range or finds a key with the closest possible value to the boundary
> before the exhaustive search completes). But the reality is that the
> vast majority of the time, a call to lower_bound() or upper_bound() for
> an unordered container would necessitate nothing less than checking
> every key in the container. And that is actually the good news. The
> not-so-good news - and the primary stumbling block - is that actually
> implementing lower_bound() or upper_bound() for an unordered container
> is likely to prove impossible.
>
> Unlike the keys in a hash_map (which follow a strict weak ordering),
> the keys in an unordered container are "equivalently ordered". In other
> words, while it is possible to test whether two keys are equivalent, it
> may not be possible to test whether one key is less than or greater
> than another. In fact no sorted order may even exist for an unordered
> container's keys. Therefore trying to find the upper bound or lower
> bound for a key value without any way to to rank the keys - is a task
> that is certain not only to take a long time to complete - but is in
> fact a task that is certain to take forever before it is finished.

Your logic is impeccable, but wrong. In an earlier posting, I tried to
describe how the searches work for lower_bound, upper_bound, and
equal_range (which *is* required by TR1). Evidently I failed to get
the point across that they all are O(1) for a well balanced hash
table and O(N) for an ill balanced one. Try rereading it, or just
read the code in our (widely available) header xhash. Microsoft has
been shipping it since V7.1.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: pjp@dinkumware.com ("P.J. Plauger")
Date: Wed, 1 Nov 2006 15:20:11 GMT Raw View

"Scott Meyers" <usenet@aristeia.com> wrote in message
news:12kg6vteb2ha9f0@corp.supernews.com...

> P.J. Plauger wrote:
>> The essential requirement for any of these functions to make sense is
>> that all elements with equivalent ordering in an unordered_multimap/set
>> be adjacent within a bucket, and hence form a proper subsequence within
>> the container.
>
> This suggests that maybe I've been asking the wrong question.  Given that
> it's an *unordered* container, there is no a priori reason to expect that
> all elements with equivalent keys will be next to one another, and for the
> multi containers, it seems to me that satisfying this constraint imposes a
> cost that would otherwise not be there.  (For example, insertion into a
> multi container can't just insert the element at the beginning of the
> bucket without looking at the value of any elements that happen to already
> be there.)

Correct. You have to scan the bucket to look for a key with equivalent
ordering to ensure that all such elements are adjacent. Note that you
have to do this anyway for unordered_map and unordered_set, to ensure
that no such key is already present, so that doesn't cost more. The
extra cost is only for unordered_multimap and unordered_multiset.

> So a better question is probably this:  why do the unordered containers
> support equal_range?

Because the committee felt that it was useful to be able to report
all elements with equivalent ordering simply.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "Denise Kleingeist" <denise.kleingeist@googlemail.com>
Date: Wed, 1 Nov 2006 11:06:19 CST Raw View

Hello Scott!
Scott Meyers wrote:
> The unordered containers specified in TR1 include no support for the
> member functions upper_bound and lower_bound, though they do include
> support for equal_range.

Keys in a hash container have to provide an equivalence
relation, especially if it is not a multimap version: the
hash value of the key only selects the bucket, to identify
the correct key an equivalence relation is needed. Based on
this relation, it is reasonable to provide an equal_range(),
equivalent objects have to be put into the same bucket anyway.

lower_bound() and upper_bound() require a partial order on
the key and to be efficient some order of the elements in the
container: if a < b holds, lower_bound(b) sbould be reachable
from lower_bound(a); similarily for upper_bound(). Requiring or
even just suggesting a corresponding ordering on the elements
of an "unordered *" object seems somewhat strange to me.

Basically, for a hash being based on equivalence relationship
an equal_range() makes sense. For ordered associative
containers based on a partial order, lower_bound() and
upper_bound() are reasonable. Since a partial order also
implies an equivalence relationship for its elements,
equal_range() also makes sense for the ordered associative
containers.

Good luck, Denise.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: pjp@dinkumware.com ("P.J. Plauger")
Date: Wed, 1 Nov 2006 17:57:34 GMT Raw View

"Scott Meyers" <usenet@aristeia.com> wrote in message
news:12kfqlo3cuv1sae@corp.supernews.com...

> P.J. Plauger wrote:
>> The essential requirement for any of these functions to make sense is
>> that all elements with equivalent ordering in an unordered_multimap/set
>> be adjacent within a bucket, and hence form a proper subsequence within
>> the container. All that lower_bound then promises is that it designates
>> the first element in that subsequence, if it's non-empty. And all that
>> upper_bound promises is that it designates the first element after that
>> subsequence in the container. If they return end() when the subsequence
>> is empty, so much the better for tidiness.
>
> But not for consistency with other standard associative containers, where
> failed lower- and upper_bound searches return an iterator to the element
> that would follow what you searched for if what you searched for was
> there.  In general, this is not container.end(), and I don't see how the
> change in semantics makes things any more or less tidy.

The so-called unordered containers do have a bit of ordering:

-- all elements whose keys hash to the same bucket reside in that
bucket (a subsequence of the total controlled sequence)

-- all elements that have equivalent ordering are adjacent within
the same bucket

There is no meaning to the concept of "the element that would follow
what you searched for if what you searched for was there" because
this weak ordering doesn't define what that element might be. What
matters, therefore, is that lower_bound and upper_bound have
consistent behavior when the desired subsequence is present. And
so long as they compare equal when the subsequence is not present,
they are quite as useful as for the more strongly ordered associative
containers such as map and set.

>> line is, Scott Meyers is right to question to absence of lower_bound and
>> upper_bound, even though it was a good and proper decision to omit these
>> from unordered_map/set.
>
> I'm not questioning the decision, I'm trying to find out why the decision
> was made.  Given your view that it was good and proper and your doubtless
> participation in the discussions that took place during development of
> TR1, can you lend insight into why lower_bound and upper_bound were
> omitted?

IIRC, other members of the committee didn't feel as strongly as I
that these two functions were useful. And I agreed that equal_range
is adequate for most uses.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: Scott Meyers <usenet@aristeia.com>
Date: Wed, 1 Nov 2006 11:59:45 CST Raw View

Greg Herlihy wrote:
> As it turns out, there is a very good reason why the unordered
> containers have no lower_bound() and upper_bound methods() - and why
> they never will - not the Dinkumware library, and not in any other
> Standard C++ library. The obstacle is not so much the egregious
> performance cost of exhaustively searching the container to find the
> best matching key

There is no need to find the "best matching key." They could return
container.end() on a failed search (as equal_range does on a failed
search).  But see my post questioning the existence of equal_range on
unordered containers in this thread.

> Unlike the keys in a hash_map (which follow a strict weak ordering),
> the keys in an unordered container are "equivalently ordered". In other
> words, while it is possible to test whether two keys are equivalent, it
> may not be possible to test whether one key is less than or greater
> than another. In fact no sorted order may even exist for an unordered
> container's keys. Therefore trying to find the upper bound or lower
> bound for a key value without any way to to rank the keys - is a task
> that is certain not only to take a long time to complete - but is in
> fact a task that is certain to take forever before it is finished.

Which is why the specified behavior for equal_range for a failed search
for unordered containers is different from that for the sorted
associative containers.

Scott

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "James Kanze" <james.kanze@gmail.com>
Date: Wed, 1 Nov 2006 14:13:11 CST Raw View

Scott Meyers wrote:
> James Kanze wrote:
> > And what should the behavior be?  At least in my mind, names
> > like upper_bound and lower_bound suggest an ordering criteria.

> I can think of a couple of possible behaviors.  Remember that
> equal_range is already supported.

And elements of unordered containers are required to support
equality.  They aren't required to have an ordering
relationship.

> So if the key is found, lower_bound
> could return equal_range(key).first.  For consistency with the standard
> associative containers, a failed lower_bound search would return the
> correct insertion point should the value be added.

The whole point of unordered containers is that there is no
specified ("correct") insertion point.  The even the notion of
insertion point has no sense, viewed from the outside.

> Or it could do what
> equal_range does and return end() if a search fails.

> It seems tautological to me that if equal_range can be specified and
> implemented, so can lower- and upper_bound.

I don't agree.  It's the opposite which is true: if lower- and
upper_bound can be specified and implemented, so can
equal_range, even if the contained objects don't otherwise
support a relationship of equality.  It is possible to define a
relationship of equality based on order (a == b <==> !(a < b) &&
!(b < a)), but how do you specify a relationship of order based
only on a relationship of equality?

--
James Kanze (Gabi Software)            email: james.kanze@gmail.com
Conseils en informatique orient   e objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place S   mard, 78210 St.-Cyr-l'   cole, France, +33 (0)1 30 23 00 34

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "James Kanze" <james.kanze@gmail.com>
Date: Wed, 1 Nov 2006 14:24:24 CST Raw View

Scott Meyers wrote:
> Greg Herlihy wrote:
> > A client could implement their own unordered container lower_bound() or
> > upper_bound() routines that would be every bit as efficient as any that
> > the unordered container class could offer itself.

> I must be overlooking something, because I don't see how.  The container
> can hash the key and then search the appropriate bucket.

But how does this determine a "lower bound"?  What does it
return if the appropriate bucket is empty?  For that matter,
what does it return if the bucket isn't empty?

I don't think that it is even a question of implementation or
efficiency.  What does "lower bound" mean for a collection of
elements which don't support an ordering relationship?  Given:

    typedef std::complex< double >
                        Complex ;
    std::unordered_set< Complex >
                        setOfZ ;
    setOfZ.insert( Complex( 1.2, 3.4 ) ) ;
    setOfZ.insert( Complex( -1.2, 3.4 ) ) ;
    setOfZ.insert( Complex( 1.2, -3.4 ) ) ;
    std::unordered_set::const_iterator
                        pos = setOfZ.lower_bound( Complex( -1.2, -3.4 )
) ;

Where should pos point?  Replace Complex by double, and
regardless of the values involved, the answer is always
obvious.  Note, however, that this "obvious" answer cannot
easily be found in an unordered container; two values very close
to one another may hash to radically different buckets, and two
very distant values may hash to the same bucket.

> This is essentially O(1), assuming a decent distribution of
> values.  Presumably this is how equal_range is implemented.
> (The complexity guarantee essentially says so.)

The reason equal_range is possible is because it doesn't require
establishing an ordering relationship, only finding the
element(s).

> I can write code to search a bucket myself, and I
> can hash a key to get a size_t, but I don't think I have a way to map
> from the result of the hash function to the bucket to search, so I don't
> see how I can implement this as efficiently as the container
> implementer.

I don't think that you can, but I fail to see any use of a
function which would somehow return an iterator to an element
which had no relationship to the element given as a key.  The
only relationship you are proposing is that they hash the same,
but this isn't an ordering in any reasonable sense, and it
doesn't address the problem of what to return if the bucket is
empty.

> Or am I supposed to assume that the bucket to search is
> hash(key)%container.bucket_count()?

I don't think you can suppose that.

But I still don't understand what lower_bound is supposed to
return.  You're arguing that the function should be there, but
I, for one, don't know what it should do.  And not knowing what
it should do, it's hard to argue on any other grounds: I can't
say that what it should do doesn't make sense.

For starters, for example, what is the answer to the question
above: to what element should pos point?

--
James Kanze (Gabi Software)            email: james.kanze@gmail.com
Conseils en informatique orient   e objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place S   mard, 78210 St.-Cyr-l'   cole, France, +33 (0)1 30 23 00 34

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "James Kanze" <james.kanze@gmail.com>
Date: Wed, 1 Nov 2006 14:35:35 CST Raw View

Scott Meyers wrote:
> P.J. Plauger wrote:
> > The essential requirement for any of these functions to make sense is
> > that all elements with equivalent ordering in an unordered_multimap/set
> > be adjacent within a bucket, and hence form a proper subsequence within
> > the container.

> This suggests that maybe I've been asking the wrong question.  Given
> that it's an *unordered* container, there is no a priori reason to
> expect that all elements with equivalent keys will be next to one
> another, and for the multi containers, it seems to me that satisfying
> this constraint imposes a cost that would otherwise not be there.  (For
> example, insertion into a multi container can't just insert the element
> at the beginning of the bucket without looking at the value of any
> elements that happen to already be there.)

Well, you've got to iterate through the bucket anyway, since for
the non-multi containers, you have to find out if the element is
already there or not.  The only thing that changes is what you
do if you find a match: in one case, you insert at that
location, and in the other, you don't.

> So a better question is probably this:  why do the unordered containers
> support equal_range?

Because it's useful, at least with the multi containers.  (I
can't imagine anyone using it, rather than find, with the
non-multi containers.)

--
James Kanze (Gabi Software)            email: james.kanze@gmail.com
Conseils en informatique orient   e objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place S   mard, 78210 St.-Cyr-l'   cole, France, +33 (0)1 30 23 00 34


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: cbarron413@adelphia.net (Carl Barron)
Date: Wed, 1 Nov 2006 22:36:51 GMT Raw View

In article <5N-dncw_19hrOtXYnZ2dnUVZ_vmdnZ2d@giganews.com>, P.J.
Plauger <pjp@dinkumware.com> wrote:

> [snip]
>
> The so-called unordered containers do have a bit of ordering:
>
> -- all elements whose keys hash to the same bucket reside in that
> bucket (a subsequence of the total controlled sequence)
>
   OK
> -- all elements that have equivalent ordering are adjacent within
> the same bucket
>
   Huh, where is that said?  I only see that equivalent keys are in the
same bucket, not that there is any order there in.

according to N2009=06-0079 [is there a newer document ??] it says
in sec. 23.1.3
-begin quote-
The elements of an unordered associative container are organized into
buckets. Keys with the same hash code appear in
the same bucket. The number of buckets is automatically increased as
elements are added to an unordered associative
container, so that the average number of elements per bucket is kept
below a bound. Rehashing invalidates iterators,
changes ordering between elements, and changes which buckets elements
appear in, but does not invalidate pointers or
references to elements.
-end quote-

I see no special requirements in sec 23.4 either, regarding order in a
bucket.

so if we have an unordered_multiset for simplicity, with three keys
that are in different equivalence classes, but hash to the same value
[and hence are in the same bucket] then any of the following orders
in the bucket.
   a,a,b,c
   a,b,c,a
 if the hash is good there are few entries in the bucket and then in
terms of total # of entries in the unordered_multi_set is O(1) but in
the worst case everything is in the bucket and it is O(N). as specified
via tables in 23.1.3.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "James Kanze" <james.kanze@gmail.com>
Date: Wed, 1 Nov 2006 16:32:37 CST Raw View

Scott Meyers wrote:
> P.J. Plauger wrote:
> > The essential requirement for any of these functions to make sense is
> > that all elements with equivalent ordering in an unordered_multimap/set
> > be adjacent within a bucket, and hence form a proper subsequence within
> > the container. All that lower_bound then promises is that it designates
> > the first element in that subsequence, if it's non-empty. And all that
> > upper_bound promises is that it designates the first element after that
> > subsequence in the container. If they return end() when the subsequence
> > is empty, so much the better for tidiness.

> But not for consistency with other standard associative containers,
> where failed lower- and upper_bound searches return an iterator to the
> element that would follow what you searched for if what you searched for
> was there.

And what is the element that would follow?  How do you define
this if the elements don't define an ordering relationship?

If I understand Plauger's remarks, their definition is the
ordering within a single bucket, and they consider elements in
different buckets unordered.  Which works, and is implementable,
but I'll admit that off-hand, I don't see any use for it.  In
particular, I think (and Plauger will correct me if I'm wrong)
that as the table grows, elements that were once ordered become
unordered; i.e. at one point in time, use of lower- and
upper_bound may show that x <= y < z (i.e. if I iterator from
lower_bound(x) to upper_bound(z), I will find y), and at a later
time no.

--
James Kanze (Gabi Software)            email: james.kanze@gmail.com
Conseils en informatique orient   e objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place S   mard, 78210 St.-Cyr-l'   cole, France, +33 (0)1 30 23 00 34

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: usenet@aristeia.com (Scott Meyers)
Date: Thu, 2 Nov 2006 00:22:15 GMT Raw View

James Kanze wrote:
> Scott Meyers wrote:
>> But not for consistency with other standard associative containers,
>> where failed lower- and upper_bound searches return an iterator to the
>> element that would follow what you searched for if what you searched for
>> was there.
>
> And what is the element that would follow?  How do you define
> this if the elements don't define an ordering relationship?

I should rephrase.  Rather than saying "the element that would follow" I
should say "an element such that the iterator would be an accurate
insertion hint."  This is both consistent with the behavior of the
sorted containers and implementable, but now that I think of it, not
obviously implementable in a reasonable fashion.  (It can be done by
putting a dummy element in each empty bucket, but that wastes space and
makes many things more complicated.)  So having upper- and lower_bound
return end() on failed searches looks better.  Note that the notion of
"the element that would follow" is well-defined for successful searches,
because each element in the container (except for the last one) has a
specific successor in a begin()-to-end() traversal of the container.

Anyway, I've since received mail from Matt Austern (author of the
proposal that led to unordered containers being in TR1) telling me that
the original hash table interface proposed by Musser et al around 1995
had equal_range but not lower- or upper_bound, so this interface
decision goes back a long time.

Scott

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: pjp@dinkumware.com ("P.J. Plauger")
Date: Thu, 2 Nov 2006 00:30:21 GMT Raw View

"James Kanze" <james.kanze@gmail.com> wrote in message
news:1162412753.554555.243640@e3g2000cwe.googlegroups.com...

> Scott Meyers wrote:
>> P.J. Plauger wrote:
>> > The essential requirement for any of these functions to make sense is
>> > that all elements with equivalent ordering in an unordered_multimap/set
>> > be adjacent within a bucket, and hence form a proper subsequence within
>> > the container. All that lower_bound then promises is that it designates
>> > the first element in that subsequence, if it's non-empty. And all that
>> > upper_bound promises is that it designates the first element after that
>> > subsequence in the container. If they return end() when the subsequence
>> > is empty, so much the better for tidiness.
>
>> But not for consistency with other standard associative containers,
>> where failed lower- and upper_bound searches return an iterator to the
>> element that would follow what you searched for if what you searched for
>> was there.
>
> And what is the element that would follow?  How do you define
> this if the elements don't define an ordering relationship?
>
> If I understand Plauger's remarks, their definition is the
> ordering within a single bucket, and they consider elements in
> different buckets unordered.  Which works, and is implementable,
> but I'll admit that off-hand, I don't see any use for it.  In
> particular, I think (and Plauger will correct me if I'm wrong)
> that as the table grows, elements that were once ordered become
> unordered; i.e. at one point in time, use of lower- and
> upper_bound may show that x <= y < z (i.e. if I iterator from
> lower_bound(x) to upper_bound(z), I will find y), and at a later
> time no.

Think of lower_bound and upper_bound as each returning half of
what equal_range returns as a pair. All three functions continue
to return the iterators delimiting a subrange with equivalent order
to the search key even as the container grows and shrinks. As a
kindness, if the subrange is empty, all such iterators equal end().

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: pjp@dinkumware.com ("P.J. Plauger")
Date: Thu, 2 Nov 2006 00:30:31 GMT Raw View

"Carl Barron" <cbarron413@adelphia.net> wrote in message
news:011120061556517609%cbarron413@adelphia.net...

> In article <5N-dncw_19hrOtXYnZ2dnUVZ_vmdnZ2d@giganews.com>, P.J.
> Plauger <pjp@dinkumware.com> wrote:
>
>> [snip]
>>
>> The so-called unordered containers do have a bit of ordering:
>>
>> -- all elements whose keys hash to the same bucket reside in that
>> bucket (a subsequence of the total controlled sequence)
>>
>   OK
>> -- all elements that have equivalent ordering are adjacent within
>> the same bucket
>>
>   Huh, where is that said?  I only see that equivalent keys are in the
> same bucket, not that there is any order there in.

They have to be to make equal_range workable.

> according to N2009=06-0079 [is there a newer document ??] it says
> in sec. 23.1.3
> -begin quote-
> The elements of an unordered associative container are organized into
> buckets. Keys with the same hash code appear in
> the same bucket. The number of buckets is automatically increased as
> elements are added to an unordered associative
> container, so that the average number of elements per bucket is kept
> below a bound. Rehashing invalidates iterators,
> changes ordering between elements, and changes which buckets elements
> appear in, but does not invalidate pointers or
> references to elements.
> -end quote-
>
> I see no special requirements in sec 23.4 either, regarding order in a
> bucket.

Right, it's not stated there but implied by the constraints of
equal_range.

> so if we have an unordered_multiset for simplicity, with three keys
> that are in different equivalence classes, but hash to the same value
> [and hence are in the same bucket] then any of the following orders
> in the bucket.
>   a,a,b,c
>   a,b,c,a
> if the hash is good there are few entries in the bucket and then in
> terms of total # of entries in the unordered_multi_set is O(1) but in
> the worst case everything is in the bucket and it is O(N). as specified
> via tables in 23.1.3.

Nope. The 'a' elements have to be adjacent. equal_range is a const
function, so it can't reorder the controlled sequence.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "James Kanze" <james.kanze@gmail.com>
Date: Thu, 2 Nov 2006 09:36:40 CST Raw View

"P.J. Plauger" wrote:
> "James Kanze" <james.kanze@gmail.com> wrote in message
> news:1162412753.554555.243640@e3g2000cwe.googlegroups.com...

> > Scott Meyers wrote:
> >> P.J. Plauger wrote:
> >> > The essential requirement for any of these functions to make sense is
> >> > that all elements with equivalent ordering in an unordered_multimap/set
> >> > be adjacent within a bucket, and hence form a proper subsequence within
> >> > the container. All that lower_bound then promises is that it designates
> >> > the first element in that subsequence, if it's non-empty. And all that
> >> > upper_bound promises is that it designates the first element after that
> >> > subsequence in the container. If they return end() when the subsequence
> >> > is empty, so much the better for tidiness.

> >> But not for consistency with other standard associative containers,
> >> where failed lower- and upper_bound searches return an iterator to the
> >> element that would follow what you searched for if what you searched for
> >> was there.

> > And what is the element that would follow?  How do you define
> > this if the elements don't define an ordering relationship?

> > If I understand Plauger's remarks, their definition is the
> > ordering within a single bucket, and they consider elements in
> > different buckets unordered.  Which works, and is implementable,
> > but I'll admit that off-hand, I don't see any use for it.  In
> > particular, I think (and Plauger will correct me if I'm wrong)
> > that as the table grows, elements that were once ordered become
> > unordered; i.e. at one point in time, use of lower- and
> > upper_bound may show that x <= y < z (i.e. if I iterator from
> > lower_bound(x) to upper_bound(z), I will find y), and at a later
> > time no.

> Think of lower_bound and upper_bound as each returning half of
> what equal_range returns as a pair. All three functions continue
> to return the iterators delimiting a subrange with equivalent order
> to the search key even as the container grows and shrinks. As a
> kindness, if the subrange is empty, all such iterators equal end().

If I understand correctly, I can (or could) use lower_bound to
get the starting point for iterating through a set of equal
keys, and upper_bound for getting the end point.  But using a
lower_bound with one key, and an upper_bound with another, can
only result in undefined behavior.

I still find that the names suggest an ordering relationship
that just isn't there.  (Whereas equal_range implies an
equivalence relationship that is present.)

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orient   e objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place S   mard, 78210 St.-Cyr-l'   cole, France, +33 (0)1 30 23 00 34


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: Carl Barron <cbarron413@adelphia.net>
Date: Thu, 2 Nov 2006 11:06:14 CST Raw View

In article <-6CdnY_zyaGGptTYnZ2dnUVZ_tGdnZ2d@giganews.com>, P.J.
Plauger <pjp@dinkumware.com> wrote:

> "Carl Barron" <cbarron413@adelphia.net> wrote in message
> news:011120061556517609%cbarron413@adelphia.net...
>
> > In article <5N-dncw_19hrOtXYnZ2dnUVZ_vmdnZ2d@giganews.com>, P.J.
> > Plauger <pjp@dinkumware.com> wrote:
> >
> >> [snip]
> >>
> >> The so-called unordered containers do have a bit of ordering:
> >>
> >> -- all elements whose keys hash to the same bucket reside in that
> >> bucket (a subsequence of the total controlled sequence)
> >>
> >   OK
> >> -- all elements that have equivalent ordering are adjacent within
> >> the same bucket
> >>
> >   Huh, where is that said?  I only see that equivalent keys are in the
> > same bucket, not that there is any order there in.
>
> They have to be to make equal_range workable.
>
> > according to N2009=06-0079 [is there a newer document ??] it says
> > in sec. 23.1.3
> > -begin quote-
> > The elements of an unordered associative container are organized into
> > buckets. Keys with the same hash code appear in
> > the same bucket. The number of buckets is automatically increased as
> > elements are added to an unordered associative
> > container, so that the average number of elements per bucket is kept
> > below a bound. Rehashing invalidates iterators,
> > changes ordering between elements, and changes which buckets elements
> > appear in, but does not invalidate pointers or
> > references to elements.
> > -end quote-
> >
> > I see no special requirements in sec 23.4 either, regarding order in a
> > bucket.
>
> Right, it's not stated there but implied by the constraints of
> equal_range.
>
> > so if we have an unordered_multiset for simplicity, with three keys
> > that are in different equivalence classes, but hash to the same value
> > [and hence are in the same bucket] then any of the following orders
> > in the bucket.
> >   a,a,b,c
> >   a,b,c,a
> > if the hash is good there are few entries in the bucket and then in
> > terms of total # of entries in the unordered_multi_set is O(1) but in
> > the worst case everything is in the bucket and it is O(N). as specified
> > via tables in 23.1.3.
>
> Nope. The 'a' elements have to be adjacent. equal_range is a const
> function, so it can't reorder the controlled sequence.
>
  Mutable inards, avoid this problem, but create another.  I assume
[don't see any contradictory evidence] that equal_range() does not
invalidate iterators.  I would think that a lookup operation does not
invalidate iterators, and so the data must be adjacent if the
pair returned represents a valid range and the validity of an iterator
is not changed via equal_range.

If equal_range can invalidate iterators, then storage in a bucket can be
random, otherwise equivalent items must be adjacent.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: usenet@aristeia.com (Scott Meyers)
Date: Mon, 30 Oct 2006 17:45:19 GMT Raw View

[This is a resend, as the original has not appeared in well over a day
since submission. --Scott]

The unordered containers specified in TR1 include no support for the
member functions upper_bound and lower_bound, though they do include
support for equal_range.  The behavior of lower_bound and upper_bound in
the standard makes sense only for sorted ranges, but this could easily
have been changed for the unordered containers.  After all, the C++03
definition of equal_range is that it returns a pair containing the
values of calls to lower_bound and upper_bound, but this was modified
for the unordered containers: calls to equal_range that find nothing
return a pair of end iterators.

I can't find any rationale for omitting lower_bound and upper_bound from
the unordered containers in the proposal document
(http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1456.html) or
in Pete Becker's TR1 book.   Does anybody know why they were excluded?

I'm not asking for anything to change, just for somebody to explain why
these functions were omitted.  FWIW, it seems to me that they'd be easy
to implement, and the only change to the spec would be that if either
search fails, the iterator returned is the one that would be used to
insert the value that is currently not present.  This isn't how C++03
defines the return value of these functions for failed searches, but it
is behaviorally identical, I believe.

Thanks for any info about the reason behind these missing functions for
the unordered containers,

Scott

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: AlbertoBarbati@libero.it (Alberto Ganesh Barbati)
Date: Mon, 30 Oct 2006 19:33:55 GMT Raw View

Scott Meyers ha scritto:
> I can't find any rationale for omitting lower_bound and upper_bound from
> the unordered containers in the proposal document
> (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1456.html) or
> in Pete Becker's TR1 book.   Does anybody know why they were excluded?
>
> I'm not asking for anything to change, just for somebody to explain why
> these functions were omitted.  FWIW, it seems to me that they'd be easy
> to implement, and the only change to the spec would be that if either
> search fails, the iterator returned is the one that would be used to
> insert the value that is currently not present.  This isn't how C++03
> defines the return value of these functions for failed searches, but it
> is behaviorally identical, I believe.

Well, it's not really a rationale, but I just can't find any use-case
where it would make sense to have the lower_bound but not also the
upper_bound (or vice-versa) for an unordered container. If I always need
both and equal_range gives me both, why should I bother with
lower_bound/upper_bound?

Suppose we defined lower_bound as you suggest, that is you get the
iterator that would be used to insert the value on failure. What use
such iterator might have? You can feed it to insert() as the hint
parameter, although the implementation is allowed to ignore the hint.
For all other uses, it's just as good as an iterator to an element
chosen at random. Practically useless.

Just my opinion,

Ganesh

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "James Kanze" <james.kanze@gmail.com>
Date: Tue, 31 Oct 2006 12:00:17 CST Raw View

Scott Meyers wrote:

> The unordered containers specified in TR1 include no support for the
> member functions upper_bound and lower_bound, though they do include
> support for equal_range.  The behavior of lower_bound and upper_bound in
> the standard makes sense only for sorted ranges, but this could easily
> have been changed for the unordered containers.  After all, the C++03
> definition of equal_range is that it returns a pair containing the
> values of calls to lower_bound and upper_bound, but this was modified
> for the unordered containers: calls to equal_range that find nothing
> return a pair of end iterators.

And what should the behavior be?  At least in my mind, names
like upper_bound and lower_bound suggest an ordering criteria.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orient   e objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place S   mard, 78210 St.-Cyr-l'   cole, France, +33 (0)1 30 23 00 34


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "Greg Herlihy" <greghe@pacbell.net>
Date: Tue, 31 Oct 2006 12:26:33 CST Raw View

Scott Meyers wrote:
> The unordered containers specified in TR1 include no support for the
> member functions upper_bound and lower_bound, though they do include
> support for equal_range.  The behavior of lower_bound and upper_bound in
> the standard makes sense only for sorted ranges, but this could easily
> have been changed for the unordered containers.  After all, the C++03
> definition of equal_range is that it returns a pair containing the
> values of calls to lower_bound and upper_bound, but this was modified
> for the unordered containers: calls to equal_range that find nothing
> return a pair of end iterators.

> I'm not asking for anything to change, just for somebody to explain why
> these functions were omitted.  FWIW, it seems to me that they'd be easy
> to implement, and the only change to the spec would be that if either
> search fails, the iterator returned is the one that would be used to
> insert the value that is currently not present.  This isn't how C++03
> defines the return value of these functions for failed searches, but it
> is behaviorally identical, I believe.

A client could implement their own unordered container lower_bound() or
upper_bound() routines that would be every bit as efficient as any that
the unordered container class could offer itself. And if there is
nothing to be gained by implementing an external routine as a class
method, then the routine should not be made a class method.

Moreover the mere existence of lower_bound() and upper_bound() methods
for unordered container class would likely prove to be very misleading.
The client is apt to expect performance comparable to the ordered
container versions of the same routines - but in reality the
performance will be orders of magnitude worse. After all, the only way
to find the lower or upper bound for a key in an unordered container is
to search the entire container - which of course leads to another
question: if the client needs to find an upper or lower bound for a
value in a container, why on earth would they be using an unordered
container in the first place?

Greg

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]