Thread

Topic: Feature Request: Fast "find" for sorted random access containers

Author: remove.haberg@matematik.su.se (Hans Aberg)
Date: Fri, 5 Jan 2001 17:08:26 GMT Raw View

In article <3A53B49D.3D70967A@wizard.net>, James Kuyper
<kuyper@wizard.net> wrote:
>> In the context we discussed here, it was said that one didn't implement
 operator +, -, [], ... on the "list" class because it one thought people
 might using the without being aware of that is inefficient.
 So I got the impression it was a kind of informational statement to the
 effect that the algorithm is inefficient. But if one is using such an
 algorithm, there should be a way to turn off the diagnostic message. <<

>Sorry - I didn't mean to imply that the diagnostic would contain any
>useful information about the reason for the diagnostic; that's purely a
>QoI issue.
>All I meant was that any code which requires random access to a
>container will almost always trigger a mandatory diagnostic when passed
>a container type (such as list<T>) which doesn't support random access.
>However, there's no requrement that the diagnostic contain any useful
>information.

Anyway, it seems me that one should implement operator +, -, [], ... on
the "list" class and other containers where possible, merely jotting down
the complexities in the new revision.

If one does not like the prospect that people may write inefficient code,
one should figure out a warning system for that:

I think that a suitable warning system is a profiler, which will tell you
which portions of the code that will be needed to be rewritten for any
particular program.

The idea of using an "advance" class just causes one to move away from the
original idea of name overloading.

  Hans Aberg      * Anti-spam: remove "remove." from email address.
                  * Email: Hans Aberg <remove.haberg@member.ams.org>
                  * Home Page: <http://www.matematik.su.se/~haberg/>
                  * AMS member listing: <http://www.ams.org/cml/>

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: James Kuyper <kuyper@wizard.net>
Date: Wed, 3 Jan 2001 23:27:45 GMT Raw View

Hans Aberg wrote:
>
> In article <91t6j2$7cf$1@nnrp1.deja.com>, Niklas Mellin
> <nimel@my-deja.com> wrote:
> >Isn't an error reported by the compiler the same thing as a diagnostic
> >in most cases?
>
> I do not know what you mean with a "diagnostic". For a compiler, an

You might want to re-read the standard, it's less than perfectly clear
:-) about the matter, but it does specify what a "diagnostic" is.

> "error" is a message that aborts the compiler output, whereas a "warning"
> is a message that still produces the object code output.

This is a distinction that has nothing to do with the standard. The
standard specifies circumstances where a diagnostic is required; it
never prohibits a diagnostic. The standard specifies circumstances where
a conforming implementation is required to correctly execute a program;
it never prohibits an implementation from executing a program (of
course, when execution is not required, there's no such concept as
"correct execution"). There's an anti-correlation between "diagnostic
required" and "execution required", but even that anti-correlation is
less than perfect.

In other words, the standard never distinguishes between error messages
and warning messages.

> In the context we discussed here, it was said that one didn't implement
> operator +, -, [], ... on the "list" class because it one thought people
> might using the without being aware of that is inefficient.
> So I got the impression it was a kind of informational statement to the
> effect that the algorithm is inefficient. But if one is using such an
> algorithm, there should be a way to turn off the diagnostic message.

Sorry - I didn't mean to imply that the diagnostic would contain any
useful information about the reason for the diagnostic; that's purely a
QoI issue.
All I meant was that any code which requires random access to a
container will almost always trigger a mandatory diagnostic when passed
a container type (such as list<T>) which doesn't support random access.
However, there's no requrement that the diagnostic contain any useful
information.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: Niklas Mellin <nimel@my-deja.com>
Date: Thu, 21 Dec 2000 15:17:20 GMT Raw View

In article
<remove.haberg-2012002256280001@du148-226.ppp.su-anst.tninet.se>,
  remove.haberg@matematik.su.se (Hans Aberg) wrote:
> In article <91qnl7$86m$1@nnrp1.deja.com>, Niklas Mellin
> <nimel@my-deja.com> wrote:
>
> >> I have not seen this; please give an example when this diagnostic
> >message is issued.
> >
> >int main()
> >{
> >  list<int> int_list;
> >
> >  //...
> >  int_list.begin() + 7; // Will generate diagnostic
> >  int_list.end() - int_list().begin(); // Will also result in
diagnostic
> >}
>
> This just generates an error on my computer (removing "()" of
int_list).

(The parentheses was a typo).

Isn't an error reported by the compiler the same thing as a diagnostic
in most cases?

/Niklas Mellin


Sent via Deja.com
http://www.deja.com/

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: remove.haberg@matematik.su.se (Hans Aberg)
Date: Thu, 21 Dec 2000 12:24:59 CST Raw View

In article <91t6j2$7cf$1@nnrp1.deja.com>, Niklas Mellin
<nimel@my-deja.com> wrote:
>Isn't an error reported by the compiler the same thing as a diagnostic
>in most cases?

I do not know what you mean with a "diagnostic". For a compiler, an
"error" is a message that aborts the compiler output, whereas a "warning"
is a message that still produces the object code output.

In the context we discussed here, it was said that one didn't implement
operator +, -, [], ... on the "list" class because it one thought people
might using the without being aware of that is inefficient.

So I got the impression it was a kind of informational statement to the
effect that the algorithm is inefficient. But if one is using such an
algorithm, there should be a way to turn off the diagnostic message.

-- Perhaps this is a good suggestion, that one can put in macro directives like
  #warning list::operator[] is inefficient
or
  #info Have a nice day!

  Hans Aberg      * Anti-spam: remove "remove." from email address.
                  * Email: Hans Aberg <remove.haberg@member.ams.org>
                  * Home Page: <http://www.matematik.su.se/~haberg/>
                  * AMS member listing: <http://www.ams.org/cml/>

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: remove.haberg@matematik.su.se (Hans Aberg)
Date: Thu, 21 Dec 2000 18:56:17 GMT Raw View

In article <memo.20001221144633.48433A@a.btinternet.com>,
brangdon@cix.compulink.co.uk wrote:
>> One reason for having a type ordering = { less, equal, greater } is
>> that such switch staments can be fully checked [...]
>
>No. That is what I just said doesn't work. The reason is that enums have
>a range which includes unnamed values.

Who has said that a type "ordering" should be implemented as an enum? -- I
didn't. :-) After all, the "bool" type is not an enum (3.9.1#6), so why
should an "ordering" be an enum.

>    The range of an enumeration holds all the enumeration values
>    rounded up to the nearest larger binary power minus 1.
...
>    enum order {
>        less_than = -1, equal_to = 0, greater_than = 1
>    };

One might though have a type "ordering" with values in the _set_ { less,
equal, greater, unrelated}. Then there is another wholly different story
if one should allow type coercion to integral types; if one decides that
should be allowed, one could coerce "unrelated" to -2.

>> There are no jumps needed, if the CPU has an instruction to extract the
>> sign of an int. -- I do not know if that is common though.
>
>I would write it as:
>
>     order compare( const S &a, const S &b ) {
>         if (a.x != b.x)
>             return (a.x < b.x) ? less_than : greater_than;
>         if (a.y != b.y)
>             return (a.y < b.y) ? less_than : greater_than;
>         return equal_to;
>     }

Well, if x is a signed integer, its sign can be written (in pseudo-C++) as
  sgn x := (x > 0) - (x < 0)
and there is not anything here that forces a jump. If type coercion rules
are allowed, one just writes ordering(sgn(x)) to get the right ordering
value.

>Although the full compare() would be useful in some algorithms, much of
>the time it would be used like:
>    if (compare( a, b ) == less_than) ...
in which case the straight-forward:
     bool operator<( const S &a, const S &b ) {
         return (a.x == b.x) ? (a.y < b.y) : (a.x < b.x);
     }
>will be maximally efficient, especially if it can be inlined. This is
>what the STL uses. I wouldn't be surprised if a good optimiser could
>reduce the common subexpressions even in the 3-way case.

I think the main point is that when using lexicographical orderings on
containers, etc., the full compare values { less, equal, greater } are
gotten for free. Thus one does not loose anything is such cases but one
gets the benefit of not having to do two comparisons.

  Hans Aberg      * Anti-spam: remove "remove." from email address.
                  * Email: Hans Aberg <remove.haberg@member.ams.org>
                  * Home Page: <http://www.matematik.su.se/~haberg/>
                  * AMS member listing: <http://www.ams.org/cml/>

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: sirwillard@my-deja.com
Date: Fri, 22 Dec 2000 11:25:25 GMT Raw View

In article <remove.haberg-2112001922500001@du158-226.ppp.su-
anst.tninet.se>,
  remove.haberg@matematik.su.se (Hans Aberg) wrote:
> In article <91t6j2$7cf$1@nnrp1.deja.com>, Niklas Mellin
> <nimel@my-deja.com> wrote:
> >Isn't an error reported by the compiler the same thing as a
diagnostic
> >in most cases?
>
> I do not know what you mean with a "diagnostic". For a compiler, an
> "error" is a message that aborts the compiler output, whereas
a "warning"
> is a message that still produces the object code output.

>From the standard:

[defns.diagnostic] 1.3.2 diagnostic message
a message belonging to an implementation-defined subset of the
implementation_s output messages.

Both warnings and errors are diagnostics.

> In the context we discussed here, it was said that one didn't
implement
> operator +, -, [], ... on the "list" class because it one thought
people
> might using the without being aware of that is inefficient.

Yes.  Though I'd also go as far as to say it's not only inefficient,
but wrong.  Using operator[] denotes random access, which is something
you can't do with a list.

> So I got the impression it was a kind of informational statement to
the
> effect that the algorithm is inefficient. But if one is using such an
> algorithm, there should be a way to turn off the diagnostic message.

There is... use the alternative interface "std::advance".

> -- Perhaps this is a good suggestion, that one can put in macro
directives like
>   #warning list::operator[] is inefficient
> or
>   #info Have a nice day!

This simply wouldn't work.  You only want a diagnostic to be given if
the user attempts to use them.  A #warning would generate a diagnostic
simply by including the header.

--
William E. Kempf
Software Engineer, MS Windows Programmer


Sent via Deja.com
http://www.deja.com/

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: brangdon@cix.compulink.co.uk (Dave Harris)
Date: Fri, 22 Dec 2000 11:26:22 GMT Raw View

remove.haberg@matematik.su.se (Hans Aberg) wrote (abridged):
> Who has said that a type "ordering" should be implemented as an enum?
> -- I didn't. :-) After all, the "bool" type is not an enum
> (3.9.1#6), so why should an "ordering" be an enum.

Why else could it be? It can't be a class if we want switch() to work. It
hardly seems worth adding a new primitive type (and hence reserved name)
to the language.


> One might though have a type "ordering" with values in the _set_ {
> less, equal, greater, unrelated}.

If you have 4 values in the enum, the problem goes away, but now all code
has to deal with the "unrelated" case - often even if it is not possible.

  Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
      brangdon@cix.co.uk      |   And close your eyes with holy dread,
                              |  For he on honey dew hath fed
 http://www.bhresearch.co.uk/ |   And drunk the milk of Paradise."

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: remove.haberg@matematik.su.se (Hans Aberg)
Date: Sat, 23 Dec 2000 13:10:19 GMT Raw View

In article <91u1uu$18$1@nnrp1.deja.com>, sirwillard@my-deja.com wrote:
>> In the context we discussed here, it was said that one didn't
>implement
>> operator +, -, [], ... on the "list" class because it one thought
>people
>> might using the without being aware of that is inefficient.
>
>Yes.  Though I'd also go as far as to say it's not only inefficient,
>but wrong.  Using operator[] denotes random access, which is something
>you can't do with a list.

I think it would be silly to say that operator[] says anything about the
underlying implementation. Besides it is standard in other languages such
as Haskell.

>> So I got the impression it was a kind of informational statement to
>the
>> effect that the algorithm is inefficient. But if one is using such an
>> algorithm, there should be a way to turn off the diagnostic message.
>
>There is... use the alternative interface "std::advance".

There is no reason to use "advance" in a library when + is available --
"advance" just clutters the namespace.

>> -- Perhaps this is a good suggestion, that one can put in macro
>directives like
>>   #warning list::operator[] is inefficient
>> or
>>   #info Have a nice day!
>
>This simply wouldn't work.  You only want a diagnostic to be given if
>the user attempts to use them.  A #warning would generate a diagnostic
>simply by including the header.

So one might figure out something else.

Personally, I think one should just implement operators +, - and [] onto
lists, and not bothering about trying to police how people should program.
-- There is no way to ensure that any kind of computer language isn't
abused, so there is no point in trying to police in this case either.

What one can do is helping people to program correctly; giving the
complexity in the standard should suffice, as it suffices in other cases.

  Hans Aberg      * Anti-spam: remove "remove." from email address.
                  * Email: Hans Aberg <remove.haberg@member.ams.org>
                  * Home Page: <http://www.matematik.su.se/~haberg/>
                  * AMS member listing: <http://www.ams.org/cml/>

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: remove.haberg@matematik.su.se (Hans Aberg)
Date: Sat, 23 Dec 2000 13:11:45 GMT Raw View

In article <memo.20001222103314.5813B@a.btinternet.com>,
brangdon@cix.compulink.co.uk wrote:
>> Who has said that a type "ordering" should be implemented as an enum?
>> -- I didn't. :-) After all, the "bool" type is not an enum
>> (3.9.1#6), so why should an "ordering" be an enum.
>
>Why else could it be? It can't be a class if we want switch() to work. It
>hardly seems worth adding a new primitive type (and hence reserved name)
>to the language.

It would be a new type, just as "bool", adding new keywords as you say.

Orderings are very fundamental in computers, it seems: Every logical
structure can be given a total order, induced by some underlying binary
implementation, and then such a total order can be used to speed up
sorting & searching.

So I think it is worth having such a type.

>> One might though have a type "ordering" with values in the _set_ {
>> less, equal, greater, unrelated}.
>
>If you have 4 values in the enum, the problem goes away, but now all code
>has to deal with the "unrelated" case - often even if it is not possible.

That is the drawback, that the extra case must be considered in switches.

But if one is not exclusively dealing with total orders, it may be worth
giving it a consideration.

For example, if one defines "x => y" to be "x is derived from y", then
some elements are not comparable. -- In math, this is a quite common
situation, "lattice theory", etc.

So one must wiegh the drawback of having to consider the extra case in
swicth statements when using total orders against the advantage of being
able to use the structure for any type of order, not simply total orders.

  Hans Aberg      * Anti-spam: remove "remove." from email address.
                  * Email: Hans Aberg <remove.haberg@member.ams.org>
                  * Home Page: <http://www.matematik.su.se/~haberg/>
                  * AMS member listing: <http://www.ams.org/cml/>

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: Valentin.Bonnard@free.fr (Valentin Bonnard)
Date: Mon, 25 Dec 2000 00:50:58 GMT Raw View

> we cannot have a range of just 3 values.

The quoted text from the standard just says that we can.

--

Valentin Bonnard

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: remove.haberg@matematik.su.se (Hans Aberg)
Date: Mon, 18 Dec 2000 17:53:42 GMT Raw View

Summary: I suggest that the C++ std library is augmented with a
logarithmic time "find" function for use with sorted random access
containers.

Description: The C++ std library contains the "find" template function,
which finds an element of a container in linear time: If n is the number
of elements in the container, the maximum number of comparisons needed in
order to find the element is n.
  By contrast, a random access container in which the elements are known
to be sorted with respect to the comparison function can be searched in
logarithmic time, by successively halving intervals. One template  version
is the following algorithm:
  // compare(a, b) = { less_than = -1   <=> a < b
  //                   equal = 0        <=> a == b
  //                   greater_than = 1 <=> a > b }
template <class RandomAccessIterator, class T, class Compare>
inline RandomAccessIterator
find_sorted(RandomAccessIterator first, RandomAccessIterator last,
  const T& value, Compare compare)
{
  RandomAccessIterator high = last;
    while (first != high) {
      RandomAccessIterator i = first + (high - first)/2;
    switch (compare(value, *i)) {
      case -1: high = i; break;
      case 1: first = i + 1; break;
      default: return i;
    }
  }
    return last;
}

This function, if it finds an iterator  i  in the semi-open interval
[first, last)  such that  *i == value, it returns i; otherwise, it returns
"last".

Note that the comparison function "compare" above takes values in the set
{-1, 0, 1}. Using a comparison functions with "bool" values would increase
the number of comparisons, so it seems to be no point in implementing a
find function using such in a library.

If n is the number of elements in the container, the maximum number of
comparisons by find_sorted needed for a find above is the integral part of
1 + log_2 n. (The proof is left as an exercise to the reader.)

Background and motivation: I came across the problem while working with
the compiler-compiler Bison which generates lookup tables which are said
to be slow for a search. These lookup tables are written out in advance of
compile time and are unsorted, and may contain hundreds of names. It
appears to me to be a simple overlook to not also write out a sorted
lookup table, given that the names are all known in advance, and then use
the algorithm given above, to thereby attain faster lookups.
  Thus, my conclusion is that including the algorithm above in the
standard library, and perhaps others as well, would encourage programmers
to generate faster look-ups in cases where possible.

  Hans Aberg      * Anti-spam: remove "remove." from email address.
                  * Email: Hans Aberg <remove.haberg@member.ams.org>
                  * Home Page: <http://www.matematik.su.se/~haberg/>
                  * AMS member listing: <http://www.ams.org/cml/>

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: Herb Sutter <hsutter@peerdirect.com>
Date: Mon, 18 Dec 2000 19:00:49 GMT Raw View

remove.haberg@matematik.su.se (Hans Aberg) writes:
>Summary: I suggest that the C++ std library is augmented with a
>logarithmic time "find" function for use with sorted random access
>containers.

I don't disagree that this is a good idea, but they already exist --
binary_search, lower_bound, upper_bound, and equal_range. They do what you
want, but they're even slicker, to wit:

>  By contrast, a random access container in which the elements are known
>to be sorted with respect to the comparison function can be searched in
>logarithmic time, by successively halving intervals

The nice thing about these functions is that they behave as you want them to
if the iterators are indeed random-access, but if you happen to supply
forward or bidirectional iterators the algorithms will automatically switch
to linear mode. Interestingly, in both cases they always use O(logN)
comparisons -- the only thing that depends on the iterator category is O(N)
vs. O(logN) steps through the range.

Herb

---
Herb Sutter (mailto:hsutter@peerdirect.com)

CTO, PeerDirect Inc. (http://www.peerdirect.com)
Contributing Editor, C/C++ Users Journal (http://www.cuj.com)

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: remove.haberg@matematik.su.se (Hans Aberg)
Date: Mon, 18 Dec 2000 22:46:52 GMT Raw View

In article <gums3t4chvsdcuaipte3atg51f5l16cvn2@4ax.com>, Herb Sutter
<hsutter@peerdirect.com> wrote:
>>Summary: I suggest that the C++ std library is augmented with a
>>logarithmic time "find" function for use with sorted random access
>>containers.
>
>I don't disagree that this is a good idea, but they already exist --
>binary_search, lower_bound, upper_bound, and equal_range. They do what you
>want, but they're even slicker, to wit:

Sorry, I will check the standard or making a regular post first then in
the future. :-) -- To my defence, I intended to make a really simple
feature request, before trying something more complicated.

  Hans Aberg      * Anti-spam: remove "remove." from email address.
                  * Email: Hans Aberg <remove.haberg@member.ams.org>
                  * Home Page: <http://www.matematik.su.se/~haberg/>
                  * AMS member listing: <http://www.ams.org/cml/>

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: remove.haberg@matematik.su.se (Hans Aberg)
Date: Tue, 19 Dec 2000 10:44:07 GMT Raw View

In article <gums3t4chvsdcuaipte3atg51f5l16cvn2@4ax.com>, Herb Sutter
<hsutter@peerdirect.com> wrote:
>I don't disagree that this is a good idea, but they already exist --
>binary_search, lower_bound, upper_bound, and equal_range. They do what you
>want, but they're even slicker, to wit:

On these one does not get to know if the iterator is the element sought,
probably great if the idea is to keep the list sorted, and one only want
to find the place where to insert new elements; but otherwise, one will
have to throw in an extra equality test with these.

>The nice thing about these functions is that they behave as you want them to
>if the iterators are indeed random-access, but if you happen to supply
>forward or bidirectional iterators the algorithms will automatically switch
>to linear mode. Interestingly, in both cases they always use O(logN)
>comparisons -- the only thing that depends on the iterator category is O(N)
>vs. O(logN) steps through the range.

For some reason, one has not bothered implementing +, - on containers such
as list and their iterators with respect to the iterator difference_type;
if one had done that, this feature would have been automatic. That is, one
could have written

template <class ForwardIterator, class T>
ForwardIterator
lower_bound(ForwardIterator first, ForwardIterator last, const T& value)
{
  iterator_traits<ForwardIterator>::difference_type
    len = last - first, half;
  ForwardIterator middle;
  while (len > 0)
  {
    middle = first + len / 2;
    if (*middle < value)
    {
      first = middle + 1;
      len = len - half - 1;
    }
    else
      len = half;
  }
  return first;
}

also for say lists.

-- While at it, I think that comparison functions should return a value in
a type { less_than, equal, greater_than } which could be equal t { -1, 0,
1 }.

Right now there seems to be two types of comparison functions: First those
like string::compare which one does not get to know which value they
return, only >0 or <0, which means that they cannot be used directly in a
switch statement. And second the STL comparisons functions which does not
allow one to determine equality in one comparison.

It is a bit inconsistent.

  Hans Aberg      * Anti-spam: remove "remove." from email address.
                  * Email: Hans Aberg <remove.haberg@member.ams.org>
                  * Home Page: <http://www.matematik.su.se/~haberg/>
                  * AMS member listing: <http://www.ams.org/cml/>

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: James Kuyper <kuyper@wizard.net>
Date: Tue, 19 Dec 2000 18:06:27 GMT Raw View

Hans Aberg wrote:
...
> For some reason, one has not bothered implementing +, - on containers such
> as list and their iterators with respect to the iterator difference_type;

That was quite deliberate. Operator overloads, by their very nature,
have a tendency to get used inadvertently. It was decided to implement
those operators only on the containers where they have inexpensive
implementations. If you try to use an algorithm that requires +/-
difference_type on a non-reversible container, you'll get a diagnostic
message that will warn you that the algorithm may be significantly more
expensive than you thought (unless of course the algorithm already has
different specializations for forward and bi-directional iterators, such
as for example, std::advance<>()).

However, anytime you're certain you're certain that you're willing to
pay the cost, std::advance<>() and std::distance<>() are the appropriate
methods to use, rather than iterator+difference_type or
iterator-iterator, respectively.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: "Homer Meyer" <homer@cqg.com>
Date: Tue, 19 Dec 2000 19:18:22 GMT Raw View

"Hans Aberg" <remove.haberg@matematik.su.se> wrote in message
news:remove.haberg-1812002251330001@du134-226.ppp.su-anst.tninet.se...
> In article <gums3t4chvsdcuaipte3atg51f5l16cvn2@4ax.com>, Herb Sutter
> <hsutter@peerdirect.com> wrote:
> >I don't disagree that this is a good idea, but they already exist --
> >binary_search, lower_bound, upper_bound, and equal_range. They do what
you
> >want, but they're even slicker, to wit:
>
> On these one does not get to know if the iterator is the element sought,
> probably great if the idea is to keep the list sorted, and one only want
> to find the place where to insert new elements; but otherwise, one will
> have to throw in an extra equality test with these.

It's pretty easy to write it yourself:

template<class FwdIter, class T>
inline FwdIter binary_find(
  FwdIter start,
  FwdIter finish,
  const T& elem
) {
  FwdIter result = std::lower_bound(start, finish, elem);
  return (result == finish || elem < *result) ? finish : result;
}

> >The nice thing about these functions is that they behave as you want them
to
> >if the iterators are indeed random-access, but if you happen to supply
> >forward or bidirectional iterators the algorithms will automatically
switch
> >to linear mode. Interestingly, in both cases they always use O(logN)
> >comparisons -- the only thing that depends on the iterator category is
O(N)
> >vs. O(logN) steps through the range.
>
> For some reason, one has not bothered implementing +, - on containers such
> as list and their iterators with respect to the iterator difference_type;
> if one had done that, this feature would have been automatic. That is, one
> could have written

<SNIP>

You should check out the std::advance() algorithm.  I imagine the reason
that + and - are not implemented on bidirectional iterators is that one
could be led to believe that the operators were efficient when they are not.

> -- While at it, I think that comparison functions should return a value in
> a type { less_than, equal, greater_than } which could be equal t { -1, 0,
> 1 }.

Again, you can implement this easily if you want it.

> Right now there seems to be two types of comparison functions: First those
> like string::compare which one does not get to know which value they
> return, only >0 or <0, which means that they cannot be used directly in a
> switch statement.

enum comparison_t { less_than = -1, equal = 0, greater_than = 1 };
inline comparison_t comparison_switch( int v ) {
  return (v > 0) ? greater_than : ((v == 0) ? equal : less_than);
}

...

switch( comparison_switch(result_of_comparison) ) {
  case less_than:
    ...
  case equal:
    ...
  case greater_than:
    ...
}

Although, I cannot imagine how this gives you any significant advantages
over just using a series of if-else statments.

>And second the STL comparisons functions which does not
> allow one to determine equality in one comparison.

I'm not sure what you mean by this.  Both the algorithm std::equal() and the
functor std::equal_to() return a bool indicating whether or not the
arguments are equal.




---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: James Kuyper <kuyper@wizard.net>
Date: Wed, 20 Dec 2000 16:16:32 GMT Raw View

Homer Meyer wrote:
>
> "Hans Aberg" <remove.haberg@matematik.su.se> wrote in message
> news:remove.haberg-1812002251330001@du134-226.ppp.su-anst.tninet.se...
...
> >And second the STL comparisons functions which does not
> > allow one to determine equality in one comparison.
>
> I'm not sure what you mean by this.  Both the algorithm std::equal() and the
> functor std::equal_to() return a bool indicating whether or not the
> arguments are equal.

Yes, but std::equal(a,b) is defined to return that bool by evaluating
the following expression: !(a<b)&&!(b<a). In other words, two separate
relative comparisons must be performed to execute one equality
comparison. There are ways based upon the as-if rule to avoid this
inefficiency, but not for the general case.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: remove.haberg@matematik.su.se (Hans Aberg)
Date: Wed, 20 Dec 2000 16:16:40 GMT Raw View

In article <3A3F63A6.527B4E5D@wizard.net>, James Kuyper
<kuyper@wizard.net> wrote:
>> For some reason, one has not bothered implementing +, - on containers such
>> as list and their iterators with respect to the iterator difference_type;
>
>That was quite deliberate. Operator overloads, by their very nature,
>have a tendency to get used inadvertently. It was decided to implement
>those operators only on the containers where they have inexpensive
>implementations.

Well, I think that classes such as vector and list could have the same
interface in this respect.

> If you try to use an algorithm that requires +/-
>difference_type on a non-reversible container, you'll get a diagnostic
>message that will warn you that the algorithm may be significantly more
>expensive than you thought (unless of course the algorithm already has
>different specializations for forward and bi-directional iterators, such
>as for example, std::advance<>()).

I have not seen this; please give an example when this diagnostic message
is issued.

-- In general, it is better to strive for a simple, common interface, and
allow such diagnostic messages be issued to those that want it, rather
than that one should have to revert to specialty programming when the
features are needed. -- The idea with name overloading is to cut down on
the things that the programmer need to remembers, so it must be better to
have those operators available, rather than the writers of the library
trying to police its uses.

  Hans Aberg      * Anti-spam: remove "remove." from email address.
                  * Email: Hans Aberg <remove.haberg@member.ams.org>
                  * Home Page: <http://www.matematik.su.se/~haberg/>
                  * AMS member listing: <http://www.ams.org/cml/>

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: remove.haberg@matematik.su.se (Hans Aberg)
Date: Wed, 20 Dec 2000 16:16:53 GMT Raw View

In article <t3vctc4n9j8673@news.supernews.com>, "Homer Meyer"
<homer@cqg.com> wrote:
>> On these one does not get to know if the iterator is the element sought,
>> probably great if the idea is to keep the list sorted, and one only want
>> to find the place where to insert new elements; but otherwise, one will
>> have to throw in an extra equality test with these.
>
>It's pretty easy to write it yourself:

Well, I am well aware of that those things are easy to write, as I started
off writing the algorithms myself; then, because I did not look for the
right names in the library, I did not find the binary search algorithms,
leading up to this silly "Feature Request".

But it is a bother that one should always have to flip in some extra code
when programming with a feature, especially if it is the same kind of code
all the time: The idea with library is that repetitive tasks should be
avoided.

So one variation of the lower_bound search function is when the comparison
function it uses returns value in { less, equal, greater } and lower_bound
itself instead of just a ForwardIterator returns a value
  pair<ForwardIterator, bool>
the second component being true if there is equality.

>Although, I cannot imagine how this gives you any significant advantages
>over just using a series of if-else statments.

In general one wants to avoid conditional sequences as jumps slows down
the CPU: Typically instructions are piped, and that pipe is broken by a
jump.

>>And second the STL comparisons functions which does not
>> allow one to determine equality in one comparison.
>
>I'm not sure what you mean by this.  Both the algorithm std::equal() and the
>functor std::equal_to() return a bool indicating whether or not the
>arguments are equal.

Here I simply mean that STL frequently uses a function
  compare(x, y) <=> x < y
which takes a bool value. Thus if you only have "compare" available and
want to check that x == y, this must be done by checking that both
compare(x, y) & compare(y, x) are false.

If one uses classes which has all the comparisons operators implemented,
consideration like this are not important, but the best way to implement
them is to only have _one_ primitive, if possible, and derive the others
from it. Mathematically, a relation R on a set A is just a subset of A x
A, but in practise in a computer, one often wants information about both x
R y and y R y, which gives four possibilities: { less, equal, greater,
incomparable } (or perhaps "unrelated" instead of "incomparable"). And as
we here use total orders (meaning that all pairs are comparable), this
leaves a function  compare: A x A -> { less, equal, greater }

-- Haskell has the name "Ordering" for this type, which seems right, so it
seems that C++ could as well have a type "ordering" with elements { less,
equal, greater }.

  Hans Aberg      * Anti-spam: remove "remove." from email address.
                  * Email: Hans Aberg <remove.haberg@member.ams.org>
                  * Home Page: <http://www.matematik.su.se/~haberg/>
                  * AMS member listing: <http://www.ams.org/cml/>

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: Niklas Mellin <nimel@my-deja.com>
Date: Wed, 20 Dec 2000 16:53:14 GMT Raw View

In article
<remove.haberg-2012000045090001@du129-226.ppp.su-anst.tninet.se>,
  remove.haberg@matematik.su.se (Hans Aberg) wrote:
> In article <3A3F63A6.527B4E5D@wizard.net>, James Kuyper
> <kuyper@wizard.net> wrote:
> >> For some reason, one has not bothered implementing +, - on
containers such
> >> as list and their iterators with respect to the iterator
difference_type;
> >
> >That was quite deliberate. Operator overloads, by their very nature,
> >have a tendency to get used inadvertently. It was decided to
implement
> >those operators only on the containers where they have inexpensive
> >implementations.
>
> Well, I think that classes such as vector and list could have the same
> interface in this respect.

They do. You can use advance<> and distance<> both for vectors and
lists.

>
> > If you try to use an algorithm that requires +/-
> >difference_type on a non-reversible container, you'll get a
diagnostic
> >message that will warn you that the algorithm may be significantly
more
> >expensive than you thought (unless of course the algorithm already
has
> >different specializations for forward and bi-directional iterators,
such
> >as for example, std::advance<>()).
>
> I have not seen this; please give an example when this diagnostic
message
> is issued.

int main()
{
  list<int> int_list;

  //...
  int_list.begin() + 7; // Will generate diagnostic
  int_list.end() - int_list().begin(); // Will also result in diagnostic
}





Sent via Deja.com
http://www.deja.com/

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: "Gillmer J. Derge" <spam@gillmerderge.com>
Date: Wed, 20 Dec 2000 17:17:07 GMT Raw View

"James Kuyper" <kuyper@wizard.net> wrote in message
news:3A401B00.BED87714@wizard.net...
> Homer Meyer wrote:
> >
> > "Hans Aberg" <remove.haberg@matematik.su.se> wrote in message
> > news:remove.haberg-1812002251330001@du134-226.ppp.su-anst.tninet.se...
> ...
> > >And second the STL comparisons functions which does not
> > > allow one to determine equality in one comparison.
> >
> > I'm not sure what you mean by this.  Both the algorithm std::equal()
and the
> > functor std::equal_to() return a bool indicating whether or not the
> > arguments are equal.
>
> Yes, but std::equal(a,b) is defined to return that bool by evaluating
> the following expression: !(a<b)&&!(b<a). In other words, two separate
> relative comparisons must be performed to execute one equality
> comparison. There are ways based upon the as-if rule to avoid this
> inefficiency, but not for the general case.

You seem to have mixed two things here.  The std:equal algorithm is defined
in terms of == and operates on ranges, not individual values.  Equality in
associative containers, however, is defined by comp(k1, k2) == false &&
comp(k2, k1) == false.  If the comparison object for your container is an
instance of less<T> (as is the case by default), then that is equivalent to
!(a<b)&&!(b<a), but that's a separate issue from std::equal.

-----
Gillmer J. Derge


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: jthill@telus.net (Jim Hill)
Date: Wed, 20 Dec 2000 19:04:01 GMT Raw View

Hans Aberg <remove.haberg@matematik.su.se> wrote:

> In general one wants to avoid conditional sequences as jumps slows down
> the CPU: Typically instructions are piped, and that pipe is broken by a
> jump.

This is certainly true, but I think you're missing crucial points: the
library's complexity requirements are not arbitrary, and  interfaces
missing from the standard library are (pace idiocies) missing for
precisely those non-arbitrary reasons.

You wanted +/- on lists.  You asked the impossible: look at the
complexity requirements. This may seem flippant, but it's not intended
to be so: if you can show how to do it, you will become the latest entry
on a long list of those who have achieved the impossible and made the
mortals look foolish once again.

For another take on it, consider this: I see your point on convenience.
I've got a "dts" namespace for hacking around, full of interfaces like

using std::sort;
template <class C> void              sort( C &c )
{ sort(c.begin(),c.end()); }
template <class C, class LT> void    sort( C &c, LT &lt )
{ sort(c.begin(), c.end(), lt); }

and I don't suppose I'm alone.  It'd be nice to be able to use it in
published code, because applying algorithms to a complete container is
by far the most common usage.  I can't, because it's not standard.
Grumble..grumble...

Right? I don't think so anymore. What it is, is "by far the most common
_nonstandard_ usage".  Where to stop is always going to be a compromise,
and so is always going to be less than completely satisfying.  You've
found other such compromises.  C'est la vie, n'est-ce pas?

Both of the above considerations apply to
> STL frequently uses a function
>   compare(x, y) <=> x < y
> which takes a bool value. Thus if you only have "compare" available and
> want to check that x == y, this must be done by checking that both
> compare(x, y) & compare(y, x) are false.

I've seen Stepanov's choice of strict weak ordering for the STL's core
described as an act of genius.  I don't know enough to agree or disagree
with that, but I can see their point.  Equal for what purpose?  The STL
supports arbitrary -- more than that, multiple, incompatible arbitrary
purposes.  When you don't need it, the navigation is decidedly
inconvenient.  I saw their point when I actually needed such a beast: a
container with almost entirely unrelated iterator paths and equivalence
relations.  It took me days to realize the solution I was working out
precisely matched the definitions of a couple of iterator categories.
My way: hundreds of lines of code.  With iterators, I haven't counted.
A handful of declarations and one-liner definitions of the right
operators, plus algorithms that look like textbook examples because they
might as well be just that.  Color me deeply, deeply impressed.

My current opinion, for what it's worth, is that the standard library
_shouldn't_ provide all those convenience interfaces.  The time and
thought necessary to implement them provides a tour of the structure
that I look forward to taking again at intervals.

Jim

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: brangdon@cix.compulink.co.uk (Dave Harris)
Date: Wed, 20 Dec 2000 20:42:44 GMT Raw View

remove.haberg@matematik.su.se (Hans Aberg) wrote (abridged):
> In general one wants to avoid conditional sequences as jumps slows down
> the CPU: Typically instructions are piped, and that pipe is broken by a
> jump.

Conditional jumps are usually more efficient than indirect ones, so "if"
statements can be more efficient than "switch" statements if the "switch"
is implemented that way. In practice a small "switch" like:

    switch (c) {
    case less_than: /* */ break;
    case equals: /* */ break;
    case greater_than: /* */ break;
    }

will probably be implemented by "if" statements anyway, so you don't gain
anything. Indeed, you may find that the compiler generates a default case
since the range of the enum includes a 4th value. Also you may find a
comparison against -1 is marginally more expensive than a sign-test.

Quite often it is natural for a comparison function to return < 0. For
example:

    struct S {
        int x;
        int y;
    };

    int compare( const S &a, const S &b ) {
        if (a.x != b.x)
            return a.x - b.x;
        return a.y - b.y;
    }

(You have to watch for overflow, of course.) If we return compare_t
instead we'd have to add extra conditional jumps.

  Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
      brangdon@cix.co.uk      |   And close your eyes with holy dread,
                              |  For he on honey dew hath fed
 http://www.bhresearch.co.uk/ |   And drunk the milk of Paradise."

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: remove.haberg@matematik.su.se (Hans Aberg)
Date: Wed, 20 Dec 2000 22:00:33 GMT Raw View

In article <memo.20001220203915.60357A@a.btinternet.com>,
brangdon@cix.compulink.co.uk wrote:
> In practice a small "switch" like:
>
>    switch (c) {
>    case less_than: /* */ break;
>    case equals: /* */ break;
>    case greater_than: /* */ break;
>    }
>
>will probably be implemented by "if" statements anyway, so you don't gain
>anything. Indeed, you may find that the compiler generates a default case
>since the range of the enum includes a 4th value. Also you may find a
>comparison against -1 is marginally more expensive than a sign-test.

One reason for having a type ordering = { less, equal, greater } is that
such switch staments can be fully checked, instead of having to write
    switch (c) {
    case less_than: ... break;
    case greater_than: ... break;
    default /* case equals: */: ...  break;
    }


>Quite often it is natural for a comparison function to return < 0. For
example:
...
    int compare( const S &a, const S &b ) {
        if (a.x != b.x)
            return a.x - b.x;
        return a.y - b.y;
    }

>(You have to watch for overflow, of course.) If we return compare_t
>instead we'd have to add extra conditional jumps.

There are no jumps needed, if the CPU has an instruction to extract the
sign of an int. -- I do not know if that is common though.

  Hans Aberg      * Anti-spam: remove "remove." from email address.
                  * Email: Hans Aberg <remove.haberg@member.ams.org>
                  * Home Page: <http://www.matematik.su.se/~haberg/>
                  * AMS member listing: <http://www.ams.org/cml/>

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: remove.haberg@matematik.su.se (Hans Aberg)
Date: Wed, 20 Dec 2000 16:09:51 CST Raw View

In article <91qnl7$86m$1@nnrp1.deja.com>, Niklas Mellin
<nimel@my-deja.com> wrote:
>> Well, I think that classes such as vector and list could have the same
>> interface in this respect.
>
>They do. You can use advance<> and distance<> both for vectors and
>lists.

These functions are not really part of the interface -- but anyway, what I
meant is that that operators +, -, [] should be implemented equally on
those containers.

>> I have not seen this; please give an example when this diagnostic
>message is issued.
>
>int main()
>{
>  list<int> int_list;
>
>  //...
>  int_list.begin() + 7; // Will generate diagnostic
>  int_list.end() - int_list().begin(); // Will also result in diagnostic
>}

This just generates an error on my computer (removing "()" of int_list).

  Hans Aberg      * Anti-spam: remove "remove." from email address.
                  * Email: Hans Aberg <remove.haberg@member.ams.org>
                  * Home Page: <http://www.matematik.su.se/~haberg/>
                  * AMS member listing: <http://www.ams.org/cml/>

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: remove.haberg@matematik.su.se (Hans Aberg)
Date: Wed, 20 Dec 2000 22:10:10 GMT Raw View

In article <1elxn3k.l6pl5i10qjnnkN%jthill@telus.net>, jthill@telus.net
(Jim Hill) wrote:
>> In general one wants to avoid conditional sequences as jumps slows down
>> the CPU: Typically instructions are piped, and that pipe is broken by a
>> jump.
>
>This is certainly true, but I think you're missing crucial points: the
>library's complexity requirements are not arbitrary, and  interfaces
>missing from the standard library are (pace idiocies) missing for
>precisely those non-arbitrary reasons.
>
>You wanted +/- on lists.  You asked the impossible: look at the
>complexity requirements.

This is not impossible, because the complexity requirements will be
different for different containers: The difference in complexity is the
reason for using different types of containers.

The complexity is not part of the interface but the implmentation, though.

So lists, vectors, ... simply use the same operator +, -, [], ... but with
differnet complexity requirements.

>For another take on it, consider this: I see your point on convenience.
>I've got a "dts" namespace for hacking around, full of interfaces like
>
>using std::sort;
>template <class C> void              sort( C &c )
>{ sort(c.begin(),c.end()); }
>template <class C, class LT> void    sort( C &c, LT &lt )
>{ sort(c.begin(), c.end(), lt); }
>
>and I don't suppose I'm alone.  It'd be nice to be able to use it in
>published code, because applying algorithms to a complete container is
>by far the most common usage.  I can't, because it's not standard.
>Grumble..grumble...
>
>Right? I don't think so anymore. What it is, is "by far the most common
>_nonstandard_ usage".  Where to stop is always going to be a compromise,
>and so is always going to be less than completely satisfying.  You've
>found other such compromises.  C'est la vie, n'est-ce pas?
>
>Both of the above considerations apply to
>> STL frequently uses a function
>>   compare(x, y) <=> x < y
>> which takes a bool value. Thus if you only have "compare" available and
>> want to check that x == y, this must be done by checking that both
>> compare(x, y) & compare(y, x) are false.
>
>I've seen Stepanov's choice of strict weak ordering for the STL's core
>described as an act of genius.  I don't know enough to agree or disagree
>with that, but I can see their point.

Actually, there are two parallel systems: One which is based on a relation
<, for which it is appropriate to define x == y := !(x < y) && ! (y < x),
and another which is based on a comparison function compare with values in
ordering := { less, equal, greater }. The reason the latter system is
often convenient in a computer context is that the when testing x R y for
relation, often the information y R y is at hand, and one take advantage
of that.

In general though, one should always be able to make the question specific
for a test so that no time is wasted. For example, on a list or a string,
it is faster to determine x < y or x != y than x == y, because in the
latter case one always has to check that all elements agree up to the end.

A complete system should make sure that the request for a relation test
can be made efficiently.

>My current opinion, for what it's worth, is that the standard library
>_shouldn't_ provide all those convenience interfaces.  The time and
>thought necessary to implement them provides a tour of the structure
>that I look forward to taking again at intervals.

Well, STL needs some improvements though.

One other topic that comes to my mind is the asymmetrical iterator &
reverse_iterator structure, in order to be compatible with an old C array
requirement (that if an array has size s, that offset is a valid pointer
value),
where rbegin() points at end() and not at the last element. On the
computers we see today it should be no problem letting rbegin() be the
last element and rend() an element symbolically before the first.

  Hans Aberg      * Anti-spam: remove "remove." from email address.
                  * Email: Hans Aberg <remove.haberg@member.ams.org>
                  * Home Page: <http://www.matematik.su.se/~haberg/>
                  * AMS member listing: <http://www.ams.org/cml/>

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]

Author: brangdon@cix.compulink.co.uk (Dave Harris)
Date: Thu, 21 Dec 2000 15:14:04 GMT Raw View

remove.haberg@matematik.su.se (Hans Aberg) wrote (abridged):
> > Indeed, you may find that the compiler generates a default case
> > since the range of the enum includes a 4th value.
>
> One reason for having a type ordering = { less, equal, greater } is
> that such switch staments can be fully checked [...]

No. That is what I just said doesn't work. The reason is that enums have
a range which includes unnamed values. From Stroustrup's C++PL:

    The range of an enumeration holds all the enumeration values
    rounded up to the nearest larger binary power minus 1. The
    range goes down to 0 if the smallest enumerator is non-negative
    and to the nearest lesser negative binary power if the smallest
    enumerator is negative. This defines the smallest bit-field
    capable of holding the enumerator values.

This means the range is always a power of 2; we cannot have a range of
just 3 values. Thus:

    enum order {
        less_than = -1, equal_to = 0, greater_than = 1
    };

    order x = order(-2);
    assert( x == -2 ); // OK.

This is portable and well-defined. -2 belongs to the range even though it
isn't a named value. Switch statements cannot assume that order
expressions will never be -2.


> There are no jumps needed, if the CPU has an instruction to extract the
> sign of an int. -- I do not know if that is common though.

I would write it as:

     order compare( const S &a, const S &b ) {
         if (a.x != b.x)
             return (a.x < b.x) ? less_than : greater_than;
         if (a.y != b.y)
             return (a.y < b.y) ? less_than : greater_than;
         return equal_to;
     }

I should think the conditional operators would become jumps on most
platforms. I would be surprised if any compiler got rid of the second
"if".

Although the full compare() would be useful in some algorithms, much of
the time it would be used like:

    if (compare( a, b ) == less_than) ...

in which case the straight-forward:

     bool operator<( const S &a, const S &b ) {
         return (a.x == b.x) ? (a.y < b.y) : (a.x < b.x);
     }

will be maximally efficient, especially if it can be inlined. This is
what the STL uses. I wouldn't be surprised if a good optimiser could
reduce the common subexpressions even in the 3-way case.

  Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
      brangdon@cix.co.uk      |   And close your eyes with holy dread,
                              |  For he on honey dew hath fed
 http://www.bhresearch.co.uk/ |   And drunk the milk of Paradise."

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]