Topic: sorting and hashing iterators
Author: Stephen Vavasis <vavasis@CS.Cornell.EDU>
Date: 1998/05/13 Raw View
I've been very pleased using STL so far, but I've run into the following
limitation: There is no simple, efficient, and standard-conforming way
to sort iterators of maps and multimaps.
You may ask: why would anyone want to sort map iterators? Because I
might want use an iterator as a key (or partial key) to another STL data
structure! This has come up in my program; I want to use K as a heap
key:
struct K {
double dist;
multimap<A,B>::const_iterator it;
};
The idea is that K's with the smallest dist will come out on top of the
heap, but I also want to make sure that if k1, k2 are two K's such that
k1==k2 (note that == is well defined for iterators), then they come out
adjacent on the heap. There seems to be no efficient,
standard-conforming way to get the desired effect.
Note that every other primitive data object in C++ admits a total
order. I think that even pointers admit a total order, as long as they
point to the same array. So I would like to propose for the next round
of standardization that there be some standard way to efficiently impose
a total order on iterators, or at least, iterators that point to the
same container. For my application, the order of iterators does not
have to be correlated to the logical order of the container. (But the
order has to be logically consistent with operator== which is defined
for iterators).
Along the same lines, I think it would be helpful if there were a
standard way to hash iterators (cast them to int's, for instance).
Actually, it would be helpful if there were a standardized way to
compute integer hash codes for all primitive data types in C++. (See
the use of "reinterpret_cast" on p. 503 of Stroustrup 3d edition.)
-- Steve Vavasis (vavasis@cs.cornell.edu)
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html ]
Author: jbuck@best.com (Joe Buck)
Date: 1998/05/13 Raw View
In article <3558B0DB.6698@cs.cornell.edu>,
Stephen Vavasis <vavasis@CS.Cornell.EDU> wrote:
>The idea is that K's with the smallest dist will come out on top of the
>heap, but I also want to make sure that if k1, k2 are two K's such that
>k1==k2 (note that == is well defined for iterators), then they come out
>adjacent on the heap. There seems to be no efficient,
>standard-conforming way to get the desired effect.
You want a way to define operator< for iterators. I'm afraid that it
can only work for random access iterators; for bidirectional and forward
iterators, there is no cheap way to define it.
However, for iterators that point to sets or maps, there's an out: you
can exploit the sorting of the set or map: if neither iterator points
to the end of the map, iter1 < iter2 iff *iter1 < *iter2. In your
application I believe that you have that property.
>Note that every other primitive data object in C++ admits a total
>order. I think that even pointers admit a total order, as long as they
>point to the same array. So I would like to propose for the next round
>of standardization that there be some standard way to efficiently impose
>a total order on iterators, or at least, iterators that point to the
>same container.
You think that pointers are totally ordered, but as soon as you interpret
those pointers as nodes in a list, you'll see that their physical addresses
have nothing to do with the logical position of the nodes in the list.
I can't think of any way to do that for list<T> that wouldn't be wildly
expensive, therefore I would vote no. It's even worse for STL conformant
classes like SGI's slist<T>. You just can't do this for iterators that
aren't random-access, except in special cases. Sorry.
> For my application, the order of iterators does not
>have to be correlated to the logical order of the container. (But the
>order has to be logically consistent with operator== which is defined
>for iterators).
Ah: you are only using the order because sets and maps require one.
It seems that you want hashed objects. While hash_set, hash_map and
friends didn't make it into this round of standardization, they are
available and usable. It seems that your needs would be satisfied by
inclusion of such classes in the standard -- you don't *really* need
operator< on iterators.
>Along the same lines, I think it would be helpful if there were a
>standard way to hash iterators (cast them to int's, for instance).
You wouldn't need this; rather, you would only need definitions of
hash<T> where T is an iterator (that is, you don't need to care about
the details).
--
-- Joe Buck
work: jbuck@synopsys.com, otherwise jbuck@welsh-buck.org or jbuck@best.net
http://www.welsh-buck.org/
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html ]
Author: James Kuyper <kuyper@wizard.net>
Date: 1998/05/14 Raw View
Stephen Vavasis wrote:
>
> I've been very pleased using STL so far, but I've run into the following
> limitation: There is no simple, efficient, and standard-conforming way
> to sort iterators of maps and multimaps.
>
> You may ask: why would anyone want to sort map iterators? Because I
> might want use an iterator as a key (or partial key) to another STL data
> structure! This has come up in my program; I want to use K as a heap
> key:
> struct K {
> double dist;
> multimap<A,B>::const_iterator it;
> };
> The idea is that K's with the smallest dist will come out on top of the
> heap, but I also want to make sure that if k1, k2 are two K's such that
> k1==k2 (note that == is well defined for iterators), then they come out
> adjacent on the heap. There seems to be no efficient,
> standard-conforming way to get the desired effect.
For map<>s, define a compare function object that compares the keys of
the object pointed to by the two iterators. It will need to know the
compare object used to construct the map<>, you can't get that
information from the iterators themselves. For multimaps<>, the case
where the keys are equal presents problems that I haven't figured out
how to solve.
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html ]
Author: jkanze@otelo.ibmmail.com
Date: 1998/05/14 Raw View
In article <3558B0DB.6698@cs.cornell.edu>,
vavasis@CS.Cornell.EDU wrote:
>
> Note that every other primitive data object in C++ admits a total
> order. I think that even pointers admit a total order, as long as they
> point to the same array. So I would like to propose for the next round
> of standardization that there be some standard way to efficiently impose
> a total order on iterators, or at least, iterators that point to the
> same container. For my application, the order of iterators does not
> have to be correlated to the logical order of the container. (But the
> order has to be logically consistent with operator== which is defined
> for iterators).
This is almost trivial to implement yourself: the expression &*i should
give the address of the designated object, so the problem is reduced to
that of a pointer, and the standard requires less to handle this.
I say almost, because the expression &*i is NOT defined for the end
pointer, and typically, there is no way of telling whether an isolated
iterator is an end pointer. (In your case, I believe that you can
guarantee that you will never have an end pointer.)
> Along the same lines, I think it would be helpful if there were a
> standard way to hash iterators (cast them to int's, for instance).
> Actually, it would be helpful if there were a standardized way to
> compute integer hash codes for all primitive data types in C++. (See
> the use of "reinterpret_cast" on p. 503 of Stroustrup 3d edition.)
The portable hash is a real problem: the example in Stroustrup does
NOT work. It fails for pointers on older Intel machines, for floating
point values for almost all architectures, and even for integer types
on other than 2's complement (not to mention that it isn't a very good
hash function to begin with). An important constraint on the hash
function is that two values which compare equal must have the same
hash value. On an Intel, there are different combinations of segment:
offset which point to the same object, and thus must compare equal.
And all of the floating point formats I'm aware of have both a positive
and a negative 0 (as do 1's complement and signed magnitude integers),
the two 0's have different bit patterns, but compare equal.
Requiring a hash function for all basic types and pointers would thus
be a very useful addition to the standard. To be useful, of course,
it would have to be a good hash function, however, and I'm not sure
how you can specify this. (Of course, the same can be said about
rand(). But the standard rand() is all but unusable in portable
software precisely because of this lack of specification.)
--
James Kanze +33 (0)1 39 23 84 71 mailto: kanze@gabi-soft.fr
+49 (0)69 66 45 33 10 mailto: jkanze@otelo.ibmmail.com
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orient e objet --
-- Beratung in objektorientierter Datenverarbeitung
-----== Posted via Deja News, The Leader in Internet Discussion ==-----
http://www.dejanews.com/ Now offering spam-free web-based newsreading
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html ]