Topic: Heterogeneous Comparisons in Binary Search [was: a small holiday gift, I hope]


Author: kanze@gabi-soft.de
Date: Sat, 6 Jan 2001 02:02:40 GMT
Raw View
"David Abrahams" <abrahams@mediaone.net> writes:

|>  P.S. Even though I know many implementations where it works, I don't
|>  know of any implementation on which x < y is not undefined behavior
|>  when x and y are pointers. Do you?

It's undefined behavior on all implementations.  On most
implementations, the undefined behavior actually does what you'd
intuitively expect.  At least on some Intel 16 bit implementations, it
doesn't.  No crashes, or anything, but you can get two different
pointers for which a<b, b<a and a=3D=3Db are all false.

--=20
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orient=E9e objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelh=FCttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]





Author: "David Abrahams" <abrahams@mediaone.net>
Date: Wed, 3 Jan 2001 20:20:08 GMT
Raw View
"John D. Hickin" <hickin@cam.org> wrote in message
news:z_K36.1148$0k1.32072@wagner.videotron.net...
>
> ----- Original Message -----
> From: "David Abrahams" <abrahams@mediaone.net>
>
> >
> > First, I note that the standard contains a requirement on the element
> > type which was surely unintended: even when using an explicit
> > comparison function, the standard requires that the elements be
> > LessThanComparable. This means, for example, that the element types
> > can't be pointers, even if you use std::less<T> as the comparison
> > object.
> >
>
>
> Perhaps it was intended to be this way so that map<K,V>, for K a pointer
> type, could be efficiently implemented.

I can't understand what this could possibly have to do with map<>. You can't
use the binary search algorithms to implement std::map<> because linear
containers don't meet the efficiency constraints for insertion, and
node-based containers (e.g. rb-trees) have bidirectional iterators and thus
don't meet the efficiency constraints for, e.g., map<>::lower_bound when
used with binary search algorithms.

Binary searching over pointer types already works efficiently, except for my
point above. std::less<T*>(T*x, T*y) is usually just implemented (inline) in
terms of x < y.

> In such a case one could take the
> view that Standard C++ cannot be implemented on all hardware
architectures.
> This would be nothing outrageous, IMO, as POSIX Threads already imposes
mild
> constraints on hardware. Imposing LessThanComparable for pointer types may
> represent a more invasive constraint on hardware than the constraints
> imposed by POSIX Threads but my gut feeling is that few processor/OS
> combinations today would be affected by such a constraint.

Maybe so, but there has never been a consensus in the comittee that this was
an acceptable course to take. I still doubt that the restriction was
intentional.

P.S. Even though I know many implementations where it works, I don't know of
any implementation on which x < y is not undefined behavior when x and y are
pointers. Do you?



---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]





Author: "David Abrahams" <abrahams@mediaone.net>
Date: Sat, 30 Dec 2000 14:03:11 GMT
Raw View
In an article posted last week, I began a discussion of how to write
the standard requirements for lower_bound, upper_bound, equal_range,
and binary_search so that heterogeneous comparisons are
possible. Today I'm going to propose replacement standard text.

First, I note that the standard contains a requirement on the element
type which was surely unintended: even when using an explicit
comparison function, the standard requires that the elements be
LessThanComparable. This means, for example, that the element types
can't be pointers, even if you use std::less<T> as the comparison
object.

Now for the proposed changes to standard text:

---

Change 25.3 [lib.alg.sorting] paragraph 3 from:

  3 For all algorithms that take Compare, there is a version that uses
  operator< instead. That is, comp(*i, *j) != false defaults to *i <
  *j != false. For the algorithms to work correctly, comp has to
  induce a strict weak ordering on the values.

to:

  3 For all algorithms that take Compare, there is a version that uses
  operator< instead. That is, comp(*i, *j) != false defaults to *i <
  *j != false. For algorithms not described in lib.alg.binary.search
  (25.3.3) to work correctly, comp has to induce a strict weak
  ordering on the values.

---

Add the following paragraph after 25.3 [lib.alg.sorting] paragraph 5:

  -6- A sequence [start, finish) is partitioned with respect to an
  expression f(e) if there exists a non-negative integer n such that
  for all 0 <= i < distance(start, finish), f(*(begin+i)) is true if
  and only if i < n.

---


Change 25.3.3 [lib.alg.binary.search] paragraph 1 from:

  -1- All of the algorithms in this section are versions of binary
   search and assume that the sequence being searched is in order
   according to the implied or explicit comparison function. They work
   on non-random access iterators minimizing the number of
   comparisons, which will be logarithmic for all types of
   iterators. They are especially appropriate for random access
   iterators, because these algorithms do a logarithmic number of
   steps through the data structure. For non-random access iterators
   they execute a linear number of steps.

to:

   -1- All of the algorithms in this section are versions of binary
    search and assume that the sequence being searched is partitioned
    with respect to an expression formed by binding the search key to
    an argument of the implied or explicit comparison function. They
    work on non-random access iterators minimizing the number of
    comparisons, which will be logarithmic for all types of
    iterators. They are especially appropriate for random access
    iterators, because these algorithms do a logarithmic number of
    steps through the data structure. For non-random access iterators
    they execute a linear number of steps.

---

Change 25.3.3.1 [lib.lower.bound] paragraph 1 from:

   -1- Requires: Type T is LessThanComparable
    (lib.lessthancomparable).

to:

   -1- Requires: The elements e of [first, last) are partitioned with
   respect to the expression e < value or comp(e, value)

---

Remove 25.3.3.1 [lib.lower.bound] paragraph 2:

   -2- Effects: Finds the first position into which value can be
    inserted without violating the ordering.

---

Change 25.3.3.2 [lib.upper.bound] paragraph 1 from:

  -1- Requires: Type T is LessThanComparable (lib.lessthancomparable).

to:

   -1- Requires: The elements e of [first, last) are partitioned with
   respect to the expression !(value < e) or !comp(value, e)

---

Remove 25.3.3.2 [lib.upper.bound] paragraph 2:

   -2- Effects: Finds the furthermost position into which value can be
    inserted without violating the ordering.

---

Change 25.3.3.3 [lib.equal.range] paragraph 1 from:

   -1- Requires: Type T is LessThanComparable
    (lib.lessthancomparable).

to:

   -1- Requires: The elements e of [first, last) are partitioned with
   respect to the expressions e < value and !(value < e) or
   comp(e, value) and !comp(value, e).

Optionally add the following to the end of the proposed text above,
which allows library implementors to make a small optimization at the
cost of slightly complexifying the standard text. The idea is that we
want to ensure that the partition point which defines the upper_bound
is no earlier in the sequence than the partion point which defines the
lower_bound, so that the implementor can do one of the searches over a
subrange:

   Also, for all elements e of [first, last), e < value implies
   !(value < e) or comp(e, value) implies !comp(value, e)

Note also that if we don't add the above, the result of equal_range()
might be an invalid range.

---

Change 25.3.3.3 [lib.equal.range] paragraph 2 from:

   -2- Effects: Finds the largest subrange [i, j) such that the value
    can be inserted at any iterator k in it without violating the
    ordering. k satisfies the corresponding conditions: !(*k < value)
    && !(value < *k) or comp(*k, value) == false && comp(value, *k) ==
    false.

to:

   -2- Returns:
         make_pair(lower_bound(first, last, value),
                   upper_bound(first, last, value))
       or
         make_pair(lower_bound(first, last, value, comp),
                   upper_bound(first, last, value, comp))

Note that the original text did not say whether the first element of
the return value was the beginning or end of the range, or something
else altogether. The proposed text is both more precise and general
enough to accomodate heterogeneous comparisons.

---

Change 25.3.3.3 [lib.binary.search] paragraph 1 from:

   -1- Requires: Type T is LessThanComparable
    (lib.lessthancomparable).

to:

   -1- Requires: The elements e of [first, last) are partitioned with
   respect to the expressions e < value and !(value < e) or comp(e,
   value) and !comp(value, e). Also, for all elements e of [first,
   last), e < value implies !(value < e) or comp(e, value) implies
   !comp(value, e)

===================================

I have posted binary searching code to the boost files area at
http://www.egroups.com/files/boost/Binary+Search/. The .zip file contains:

boost/boost/detail/iterator.hpp, containing:
  boost::detail::iterator_traits<>
     std::iterator_traits, even for nonconforming compilers (as long as
     you only care about difference_type and iterator_category ;-) )
  boost::detail::distance()
     std::distance(), even for nonconforming compilers

boost/boost/binary_search.hpp
  a derivative of the binary search functions in SGI STL, but using cleaner
  workarounds for nonconforming compilers

boost/libs/utility/binary_search_test.hpp
  Test code for verifying the above functionality

The text files, binary_search1.txt and binary_search2.txt are derived from
my postings on the subject and will form the basis for the formal
documentation.

-Dave
Regards,
Dave


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]





Author: "John D. Hickin" <hickin@cam.org>
Date: Tue, 2 Jan 2001 20:55:39 GMT
Raw View
----- Original Message -----
From: "David Abrahams" <abrahams@mediaone.net>

>
> First, I note that the standard contains a requirement on the element
> type which was surely unintended: even when using an explicit
> comparison function, the standard requires that the elements be
> LessThanComparable. This means, for example, that the element types
> can't be pointers, even if you use std::less<T> as the comparison
> object.
>


Perhaps it was intended to be this way so that map<K,V>, for K a pointer
type, could be efficiently implemented. In such a case one could take the
view that Standard C++ cannot be implemented on all hardware architectures.
This would be nothing outrageous, IMO, as POSIX Threads already imposes mild
constraints on hardware. Imposing LessThanComparable for pointer types may
represent a more invasive constraint on hardware than the constraints
imposed by POSIX Threads but my gut feeling is that few processor/OS
combinations today would be affected by such a constraint.

Regards, John.
>

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]