Topic: Trees, Boost, GSoC, and the Standard [longish]


Author: avakar@volny.cz (=?ISO-8859-1?Q?Martin_Vejn=E1r?=)
Date: Sat, 9 Dec 2006 18:57:04 GMT
Raw View
Bernhard Reiter wrote:
> Martin Vejn=E1r wrote:
>> Regarding the R-trees, unfortunately, after rereading the proposal, I
>> realized that B+tree can not be supported (and neither can R-tree, whi=
ch
>> is a derivation of B+tree), because it is a heterogenous structure.
>=20
> Heterogenous in that data is only stored in leaf nodes, no? That could
> be tweakable...

Yes. We can't have the operator* return a pair (key,value) for all=20
nodes, because value is defined only for leaves. It is possible for the=20
iterator to return a proxy, which could in turn be queried for the=20
actual data, as in it->key() and it->value() -- the latter would result=20
in undefined behavior for internal nodes.

Such solution is in nearly all aspects equivalent to using property maps=20
(get(key, it), get(value, it)). Below, you mention Dave Abrahams'=20
new-style iterators proposal. If you compare the types of these=20
iterators with regards to data access (readable, writable, readwrite,=20
lvalue) to the types of property maps, you'll see, they are the same.=20
They have the same semantics and are thus equivalent in power.

I see two reasons for preferring property maps, though. For one, BGL=20
sets the precedent for using property maps, when it comes to=20
heterogeneous data. For two, I'd like to see traversal and data access=20
separate -- because of secondary storage access. You mention the problem=20
in n2101. I argue, that current interface of iterators (and property=20
maps) is insufficient for this kind of data manipulation and will=20
eventually have to change. And I don't want cursors to change with them.

Btw, I started to implement the cursor-without-operator* thing, but I=20
realized, that the whole standard library, along with Boost.range, are=20
working against me. For a normal STL implementation, one would expect=20
for example std::advance to be able to advance a cursor, even though it=20
doesn't support operator*. Unfortunately, the standard doesn't=20
guaranteee this, and I'm pretty sure at least on gcc with concept=20
checking, such attempt would fail to compile.

For the same reason, putting the traversal and data access concepts=20
together will make the task of changing the data access concepts much=20
more difficult, because even simple algorithms would have to be rewritten.

>> I guess, I will have to take a different approach after all. I'll
>> probably reuse your concept of a cursor, but remove operator *.
>=20
> Sounds good so far. Given that this kind of heterogenity might be
> common in other kinds of hierarchies as well, a cursor concept without
> operator* might be something to consider... Here, Boost's iterator
> concepts might work quite well in untangling these aspects...

I don't think so. Boost.iterator does a very good job in separating=20
traversal and data access, but doesn't go far enough. (It can't.=20
Whatever would result from total separation of traversal and data access=20
semantics would no longer be an iterator.)

>> The data
>> might then be accessed via some sort of a property map. Strangely
>> enough, I could easily transform this cursor into an iterator by
>> implementing its operator * by means of the given property map. I'm
>> going to have to explore this option more, but I think this might be a
>> good solution to all heterogenous containers, no?
>=20
> I have to admit that I haven't looked into property maps very deeply
> yet, as I felt implementing trees to use them would again require some
> associative container the property map is based upon --- which would
> make trees for efficient data storage rather absurd...

I highly recommend looking over property maps. They do not require=20
associative container --- they are simply an abstraction of data access=20
and can be viewed as a separation of operator* from iterators.

>> A Wikipedia article on "Iterator" states, that "An iterator is sometim=
es
>> called a cursor, especially within the context of a database." E.g.
>> there are two names for the same thing.
>=20
> Well, I'd say C++ refines that a little and stick to the Standard's
> definition.
[snip]
>> I can think of an oriented graph with unweighted edges, where verteces
>> could be traversed "downwards" along the edges --- very much like in a
>> tree --- and "upwards" against the direction of edges. Such vertex
>> iterators would be natural extension of your descending cursors. They
>> would merely have to provide yet another set of begin/end. Would you
>> call them iterators or cursors?
>=20
> Cursors rather than iterators, as I'd associate iterators really very
> strongly with linear data structures, and as they'd probably model
> Descending (but not Ascending) Cursor, while addding, say, up_begin and
> up_end. This means they might deserve a name like oriented_cursor,
> indicating that they're both (descending) cursors plus some more.

I wouldn't say C++ refines the meaning of the word "iterator" --- it is=20
connected to linear structures simply because C++ lacks any other kinds=20
of data structures. Just like the word "light" expanded its meaning and=20
now we qualify it as "visible light" to distinguish it from other forms=20
of EM radiation, the word "iterator" is entitled to have its meaning=20
expanded, especially because it already has wider application outside=20
the scope of today's C++.

>> In the previous post, I just wanted to point out, that iterators and
>> cursors are the same concept, and there's no need to choose different
>> names to distinguish between ranges and hierarchies. Whatever naming
>> will eventually be chosen, however, it will have no impact on actual
>> development.
>=20
> But you're still striking an interesting chord here: within the context
> of your above example, maybe ascending, being the opposite of
> descending, should actually denote "has up_begin() and up_end()"
> instead of "has parent()". So nomenclature might still require some
> reworking, in order to retain the compatibility of cursors to other
> graph (even hypergraph?) structures...

It is ironic, that my first reply to the original post was mainly about=20
naming. Now, I have to turn around: I changed my opinion. Seeing how=20
often the concepts in my mind change their meanings, I think the proper=20
naming should be chosen after the concepts are fixed and perhaps a=20
reference implementation is finished. For now, whatever naming will do.

>> Maybe the real solution is not to require iterators to
>> have constant complexity. I don't really know anymore.
>=20
> Difficult. Given the usefulness of iterators in connection with
> algorithms, that might be true, but then again, if iterator operations
> had worse complexity (esp. in/decrement), that might indicate that
> (linear!) algorithms aren't really suitable.

It wouldn't effect the correctness, though.

I've actually seen more than one of my coursemates pass large structures=20
like std::map by value. I guess, one has to know what one is doing to=20
write effective programs. Pass by value/by reference is one place, where=20
complexity can differ by more than factor of O(n), yet we're not trying=20
to change the language to prevent passing collections by value.

So maybe it would be better to allow algorithms to have substandard=20
effectivity in some cases, just for the sake of convenience. After all,=20
in production environment, either programmers won't use such algorithm=20
in the first place, or the profiler will tell them, that a different=20
solution is asked for.

>> The standard puts everything in std:: namespace. I don't know, what
>> current practice in Boost is, but from what I've gathered by peeking a=
t
>> the iterators library, everyting's in boost:: namespace.
>=20
> boost::detail is used for implementation, but apart from that, maybe
> true. Still not enough to convince me... :-)

I bet the committee will cut your wings in this regard ;-)

Anyway, I think I'll try to sum up my ideas on the whole thing more=20
coherently, in form of an article perhaps.

All the best and keep up the good work :-)
--=20
Martin Vejnar

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]





Author: "Bernhard Reiter" <ockham@gmx.net>
Date: Tue, 5 Dec 2006 13:08:49 CST
Raw View
Martin Vejn   r wrote:

> Bernhard Reiter wrote:
> > Thanks a lot for your friendly and motivating feedback! R-trees are
> > admittedly something I'm pretty ignorant about, but if you think they'd
> > fit in nicely with N2101, I'd be of course curious to hear more about
> > that :-)
>
> Here I am again with more feedback. :-)
>
> Regarding the R-trees, unfortunately, after rereading the proposal, I
> realized that B+tree can not be supported (and neither can R-tree, which
> is a derivation of B+tree), because it is a heterogenous structure.

Heterogenous in that data is only stored in leaf nodes, no? That could
be tweakable...

> I guess, I will have to take a different approach after all. I'll
> probably reuse your concept of a cursor, but remove operator *.

Sounds good so far. Given that this kind of heterogenity might be
common in other kinds of hierarchies as well, a cursor concept without
operator* might be something to consider... Here, Boost's iterator
concepts might work quite well in untangling these aspects...

> The data
> might then be accessed via some sort of a property map. Strangely
> enough, I could easily transform this cursor into an iterator by
> implementing its operator * by means of the given property map. I'm
> going to have to explore this option more, but I think this might be a
> good solution to all heterogenous containers, no?

I have to admit that I haven't looked into property maps very deeply
yet, as I felt implementing trees to use them would again require some
associative container the property map is based upon --- which would
make trees for efficient data storage rather absurd...

> A Wikipedia article on "Iterator" states, that "An iterator is sometimes
> called a cursor, especially within the context of a database." E.g.
> there are two names for the same thing.

Well, I'd say C++ refines that a little and stick to the Standard's
definition.

> You decided to call a
> range-traversing structures "iterators" and hierarchy-traversing
> structures "cursors". The distinction is understandable, as long as
> there are only ranges and hierarchies. What if another structure emerges
> that would require traversing. What would the traversal objects be
> called then?

(see below)

> I can think of an oriented graph with unweighted edges, where verteces
> could be traversed "downwards" along the edges --- very much like in a
> tree --- and "upwards" against the direction of edges. Such vertex
> iterators would be natural extension of your descending cursors. They
> would merely have to provide yet another set of begin/end. Would you
> call them iterators or cursors?

Cursors rather than iterators, as I'd associate iterators really very
strongly with linear data structures, and as they'd probably model
Descending (but not Ascending) Cursor, while addding, say, up_begin and
up_end. This means they might deserve a name like oriented_cursor,
indicating that they're both (descending) cursors plus some more.

> In the previous post, I just wanted to point out, that iterators and
> cursors are the same concept, and there's no need to choose different
> names to distinguish between ranges and hierarchies. Whatever naming
> will eventually be chosen, however, it will have no impact on actual
> development.

But you're still striking an interesting chord here: within the context
of your above example, maybe ascending, being the opposite of
descending, should actually denote "has up_begin() and up_end()"
instead of "has parent()". So nomenclature might still require some
reworking, in order to retain the compatibility of cursors to other
graph (even hypergraph?) structures...

> I think, that what we really differ in is an opinion on what the word
> iterator means. You seem to have the iterator connected with _linear_
> traversal. I view it as a generic means of traversal (not necessarily
> linear). Not being a native English speaker though, my view could be wrong.

I'm not, either, so the same holds for me...

> > Note that I've actually added these structures at a point when I was
> > thinking pre- and postorder increment and decrement weren't amortized
> > O(1), but was convinced by Ren    that iterators, even when breaking one
> > or two requirements, suggest themselves for that kind of linear
> > traversal/view of a hierarchy (so even if it turned out they really
> > aren't, I'd probably opt for keeping them as iterators, just with a
> > statement added that they can't provide amortized O(1) increment and
> > decrement.)
>
> The truth is, it would be unfortunate, if one couldn't use std::find to
> locate an element in a tree, just because inorder::iterator wasn't an
> actual iterator.

That and many other standard algorithms!

> Maybe the real solution is not to require iterators to
> have constant complexity. I don't really know anymore.

Difficult. Given the usefulness of iterators in connection with
algorithms, that might be true, but then again, if iterator operations
had worse complexity (esp. in/decrement), that might indicate that
(linear!) algorithms aren't really suitable.

> The standard puts everything in std:: namespace. I don't know, what
> current practice in Boost is, but from what I've gathered by peeking at
> the iterators library, everyting's in boost:: namespace.

boost::detail is used for implementation, but apart from that, maybe
true. Still not enough to convince me... :-)

> Once you decide not to require constant complexity for cursor's member
> functions, adding free standing functions doesn't help much. But
> guaranteing constant complexity for members while providing free
> standing functions without those guarantees might be a good idea ---
> cursors that are missing a.parity() member might still provide parity(a)
> instead, which would be linear. Algorithms that do not depend on
> constant comlexities can then use free standing functions; other
> algorithms can use members. Like free standing swap(), which may cause
> expensive copying, compared to member swap() which should be cheap.

.. and would normally induce a free standing swap specialization
invoking the member one. Yep, that's the way I'd go for.

> The name parent-knowing was really ill-chosen. I meant a concept, which
> would apply to all cursors, for which parent() had constant complexity.
> Not all ascending cursors would be parent-knowing then.

Currently, I'm still against another concept for that, but more for the
swap-inspired way.

> >> The last point (a defect maybe) is that technically, descending
> >> bidirectional cursor is not an input iterator. That's because objects of
> >> descending_bidirectional_cursor_tag are not convertible to
> >> input_iterator_tag --- the conversion is ambiguous.
> >
> > Okay, I'll need to look into this and find out how to work around that.
> > Thanks for notifying.
>
> This problem will disappear once the concepts become available and tags
> are deprecated. Let's just hope it's soon :-)

Concepts are still a big TODO item for my reference implementation, but
I need to fix some other bugs first...

Bernhard Reiter


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]





Author: avakar@volny.cz (=?ISO-8859-1?Q?Martin_Vejn=E1r?=)
Date: Sat, 25 Nov 2006 18:07:47 GMT
Raw View
Bernhard Reiter wrote:
> Thanks a lot for your friendly and motivating feedback! R-trees are
> admittedly something I'm pretty ignorant about, but if you think they'd
> fit in nicely with N2101, I'd be of course curious to hear more about
> that :-)

Here I am again with more feedback. :-)

Regarding the R-trees, unfortunately, after rereading the proposal, I=20
realized that B+tree can not be supported (and neither can R-tree, which=20
is a derivation of B+tree), because it is a heterogenous structure.

I guess, I will have to take a different approach after all. I'll=20
probably reuse your concept of a cursor, but remove operator *. The data=20
might then be accessed via some sort of a property map. Strangely=20
enough, I could easily transform this cursor into an iterator by=20
implementing its operator * by means of the given property map. I'm=20
going to have to explore this option more, but I think this might be a=20
good solution to all heterogenous containers, no?

> Martin Vejn=E1r wrote:
>> In the proposal, you have introduced a concept of a cursor, which is i=
n
>> fact an iterator. The question which arises is why you have chosen to
>> name an iterator with a new name "cursor". If I read the proposal
>> correctly, a (non-descending) cursor is equivalent to a forward
>> iterator, save some minor differences like cursor's requirement for a
>> member "cursor" typedef.
>=20
> That's about correct --- the main reason is that the "pure" cursor
> definition (ie, as of 24.X.1) is more or less academic as it's the
> descending cursor that represents the "least" level of cursors we're
> interested in as that one models the range requirements (allowing us to
> descend in the hierarchy).
> Keep in mind that many cursor terms and requirements are also stated as
> part of the introductory section in 24.X.

A Wikipedia article on "Iterator" states, that "An iterator is sometimes=20
called a cursor, especially within the context of a database." E.g.=20
there are two names for the same thing. You decided to call a=20
range-traversing structures "iterators" and hierarchy-traversing=20
structures "cursors". The distinction is understandable, as long as=20
there are only ranges and hierarchies. What if another structure emerges=20
that would require traversing. What would the traversal objects be=20
called then?

I can think of an oriented graph with unweighted edges, where verteces=20
could be traversed "downwards" along the edges --- very much like in a=20
tree --- and "upwards" against the direction of edges. Such vertex=20
iterators would be natural extension of your descending cursors. They=20
would merely have to provide yet another set of begin/end. Would you=20
call them iterators or cursors?

In the previous post, I just wanted to point out, that iterators and=20
cursors are the same concept, and there's no need to choose different=20
names to distinguish between ranges and hierarchies. Whatever naming=20
will eventually be chosen, however, it will have no impact on actual=20
development.

I think, that what we really differ in is an opinion on what the word=20
iterator means. You seem to have the iterator connected with _linear_=20
traversal. I view it as a generic means of traversal (not necessarily=20
linear). Not being a native English speaker though, my view could be wron=
g.

>> Ironically, in section 24.4.X, a tree traversal structures are
>> mislabeled as "iterators", which they aren't, since they break
>> iterators' constraint on constant time complexity.
>=20
> Are you sure about that point? I have to admit that I haven't
> researched this very closely yet, but gathering from all three {in,
> pre, post}order algorithms taking up O(n) for one complete traversal,
> I'd say increment and decrement can be done in amortized constant time
> (as demanded by 24.1, =A78).

My bad. You're right of course. :-) They do have every right to be=20
called iterators.

> Note that I've actually added these structures at a point when I was
> thinking pre- and postorder increment and decrement weren't amortized
> O(1), but was convinced by Ren=E9 that iterators, even when breaking on=
e
> or two requirements, suggest themselves for that kind of linear
> traversal/view of a hierarchy (so even if it turned out they really
> aren't, I'd probably opt for keeping them as iterators, just with a
> statement added that they can't provide amortized O(1) increment and
> decrement.)

The truth is, it would be unfortunate, if one couldn't use std::find to=20
locate an element in a tree, just because inorder::iterator wasn't an=20
actual iterator. Maybe the real solution is not to require iterators to=20
have constant complexity. I don't really know anymore.

>> (I also don't like
>> separate namespaces for different kinds of traversals; why not
>> preorder_iterator, postorder_iterator etc., or better yet
>> preorder_somethingelse?).
>=20
> That's perhaps a matter of personal taste. The namespace approach
> looked somewhat appealing to me (e.g. in connection with a using
> namespace inorder; line), although I can't really bring forward many
> arguments in favor of this one over the other approach (but neither
> vice versa, so here I am being admittedly arbitrary :-)

The standard puts everything in std:: namespace. I don't know, what=20
current practice in Boost is, but from what I've gathered by peeking at=20
the iterators library, everyting's in boost:: namespace.

>> To summarize, I'd vote for renaming cursors to iterators. That would
>> leave us with, for example, forward iterators, forward descending
>> iterators, bidirectional ascending iterators etc.
>=20
> Frankly, I wouldn't do that, although I believe that I get your point.
> But doing that would add yet one or two further orthogonal aspects to
> the existing collection of iterator concepts --- which aren't ever
> needed when dealing with non-hierarchical structures. (Plus, given that
> the relevant addition is [descending] cursors modelling range
> requirements, it would be somewhat arbitrary calling any type of cursor
> an iterator --- why not call them ranges instead?)

See my response on the naming issue above.

>> On another subject, there are two operations on cursors that have (at
>> most) linear complexity --- thats a.parity() and a.parent(). I strongl=
y
>> believe, that cursors (and iterators) should have their operations con=
stant.
>=20
> I agree, they should; but maybe not as a conceptual requirement. I had
> the galore of possible (well, at least some more or less pathological)
> tree implementations (see e.g. Knuth 2.3.3) in mind which probably
> yield completely different time complexities for parent and parity.
> Okay, some maybe also for increment/decrement etc, but I wanted at
> least to take some of the threaded tree representations into account,
> especially the ones that have e.g. the rightmost node's right link
> point up to its parent (which, for
> a given cursor, obviously yields worst case time complexity linear in
> the number of its siblings). (This is linked to one pretty tough basic
> issue here, how to have the interface map implementations abstractly
> but minimally, ie without imposing any extra information that we
> wouldn't require if we just contented ourselves with the pure
> implementation and omitted the interface...)

Right. It makes sense to have e.g. boost::filter_iterator, which also=20
cannot guarantee constant complexities. On the other hand, iterators are=20
supposed to be an elementary lightweight objects. As I said above, I=20
don't really know anymore.

>> The first case is fine, let the cursor have parity(), but why limit
>> parity() to descending cursors? Every iterator that stores its parity
>> can have parity(), right?
>>
>> For the latter, it might be better for the cursor to reveal its first
>> sibling. We can then call "std::distance" ourselves, being aware of it=
s
>> complexity.
>=20
> My approach would be to remove the current a.parity() from the
> descending cursor requirements and add parity(a) to the ascending ones
> instead (which would be implemented as distance(a.parent().begin(), a)
> --- its complexity then is a function of the type of cursor that a
> models). In relevant places, one might require guaranteed constant --
> or just more efficient -- specializations of that algorithm (maybe even
> for descending cursors), in a "freestanding calls member function" way
> of doing things (as swap() does in the standard).

Once you decide not to require constant complexity for cursor's member=20
functions, adding free standing functions doesn't help much. But=20
guaranteing constant complexity for members while providing free=20
standing functions without those guarantees might be a good idea ---=20
cursors that are missing a.parity() member might still provide parity(a)=20
instead, which would be linear. Algorithms that do not depend on=20
constant comlexities can then use free standing functions; other=20
algorithms can use members. Like free standing swap(), which may cause=20
expensive copying, compared to member swap() which should be cheap.

>> About parent(). Well, again, it is linear. I understand, that a genera=
l
>> tree with parent pointers would only have access to the first sibling =
of
>> its parent. Why not provide the first_sibling_of_parent() member
>> function (which would have constant complexity) and give parent() only
>> to those cursors, which can get it in constant time?
>=20
> Again because of different possible implementations. What we should
> probably do is add requirements to such important things as binary or
> nary_trees for their cursors' parent() member functions to be O(1).

This is a variation of the still same issue. Should cursor members be=20
required to be O(1) or not? Once answered, everything will follow.

>> I know, that I make the cursor/iterator hierarchy quite complicated.
>> (Well, not really. After all, the parent-knowing iterator is only a
>> refinement of the ascending iterator, parity-aware iterator and
>> random-access iterator. :-) ) But this just seems little more coherent.
>=20
> Well, here I tend to disagree...
> I think parity-aware shouldn't really be a concept (for reasons pointed
> out above). And as for parent-knowing, I'm a little confused --- that's
> precisely what ascending is, no?

The name parent-knowing was really ill-chosen. I meant a concept, which=20
would apply to all cursors, for which parent() had constant complexity.
Not all ascending cursors would be parent-knowing then.

>> The last point (a defect maybe) is that technically, descending
>> bidirectional cursor is not an input iterator. That's because objects =
of
>> descending_bidirectional_cursor_tag are not convertible to
>> input_iterator_tag --- the conversion is ambiguous.
>=20
> Okay, I'll need to look into this and find out how to work around that.
> Thanks for notifying.

This problem will disappear once the concepts become available and tags=20
are deprecated. Let's just hope it's soon :-)

Best regards,
--=20
Martin Vejnar

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]





Author: "Bernhard Reiter" <ockham@gmx.net>
Date: Wed, 22 Nov 2006 12:32:42 CST
Raw View
Martin Vejn   r wrote:

> I am currently working on my bachelor's thesis on the implementation of
> spatial index structures in C++. That includes an R-tree --- and that's
> where your post became quite interesting to me. As a big fan of
> doing-things-generically I eagerly browsed through your proposal. It's
> great. I happen to have a few comments though. Before I start, I want to
> make clear, that I consider your work brilliant. I spent a lot of time
> trying to figure out some feasible concept of iterator-based tree
> traversing and N2101 seems like a gift from the heavens :-).

Thanks a lot for your friendly and motivating feedback! R-trees are
admittedly something I'm pretty ignorant about, but if you think they'd
fit in nicely with N2101, I'd be of course curious to hear more about
that :-)

> In the proposal, you have introduced a concept of a cursor, which is in
> fact an iterator. The question which arises is why you have chosen to
> name an iterator with a new name "cursor". If I read the proposal
> correctly, a (non-descending) cursor is equivalent to a forward
> iterator, save some minor differences like cursor's requirement for a
> member "cursor" typedef.

That's about correct --- the main reason is that the "pure" cursor
definition (ie, as of 24.X.1) is more or less academic as it's the
descending cursor that represents the "least" level of cursors we're
interested in as that one models the range requirements (allowing us to
descend in the hierarchy).
Keep in mind that many cursor terms and requirements are also stated as
part of the introductory section in 24.X.

[snip]
> Ironically, in section 24.4.X, a tree traversal structures are
> mislabeled as "iterators", which they aren't, since they break
> iterators' constraint on constant time complexity.

Are you sure about that point? I have to admit that I haven't
researched this very closely yet, but gathering from all three {in,
pre, post}order algorithms taking up O(n) for one complete traversal,
I'd say increment and decrement can be done in amortized constant time
(as demanded by 24.1,    8).
I'm actually pretty positive about inorder (and less sure about the
others).

Note that I've actually added these structures at a point when I was
thinking pre- and postorder increment and decrement weren't amortized
O(1), but was convinced by Ren    that iterators, even when breaking one
or two requirements, suggest themselves for that kind of linear
traversal/view of a hierarchy (so even if it turned out they really
aren't, I'd probably opt for keeping them as iterators, just with a
statement added that they can't provide amortized O(1) increment and
decrement.)

> (I also don't like
> separate namespaces for different kinds of traversals; why not
> preorder_iterator, postorder_iterator etc., or better yet
> preorder_somethingelse?).

That's perhaps a matter of personal taste. The namespace approach
looked somewhat appealing to me (e.g. in connection with a using
namespace inorder; line), although I can't really bring forward many
arguments in favor of this one over the other approach (but neither
vice versa, so here I am being admittedly arbitrary :-)

> To summarize, I'd vote for renaming cursors to iterators. That would
> leave us with, for example, forward iterators, forward descending
> iterators, bidirectional ascending iterators etc.

Frankly, I wouldn't do that, although I believe that I get your point.
But doing that would add yet one or two further orthogonal aspects to
the existing collection of iterator concepts --- which aren't ever
needed when dealing with non-hierarchical structures. (Plus, given that
the relevant addition is [descending] cursors modelling range
requirements, it would be somewhat arbitrary calling any type of cursor
an iterator --- why not call them ranges instead?)

> On another subject, there are two operations on cursors that have (at
> most) linear complexity --- thats a.parity() and a.parent(). I strongly
> believe, that cursors (and iterators) should have their operations constant.

I agree, they should; but maybe not as a conceptual requirement. I had
the galore of possible (well, at least some more or less pathological)
tree implementations (see e.g. Knuth 2.3.3) in mind which probably
yield completely different time complexities for parent and parity.
Okay, some maybe also for increment/decrement etc, but I wanted at
least to take some of the threaded tree representations into account,
especially the ones that have e.g. the rightmost node's right link
point up to its parent (which, for
a given cursor, obviously yields worst case time complexity linear in
the number of its siblings). (This is linked to one pretty tough basic
issue here, how to have the interface map implementations abstractly
but minimally, ie without imposing any extra information that we
wouldn't require if we just contented ourselves with the pure
implementation and omitted the interface...)

> This parity() beast is especially confusing to me --- it is supposed to
> return a position of the cursor amongst its siblings. From my point of
> view (applying the cursor-is-an-iterator theorem), that's like asking
> for position of an iterator within its controlling sequence. That seems
> to be more of an algorithmical issue and should be handled by a
> free-standing function. Besides, to implement parity(), the cursor
> 1. would either needed to know its position already (in which case
> parity() would be constant-time)
> 2. would have to have the ability to retrieve the beginning of a
> sequence of its siblings, either by storing the relevant cursor or by
> being an ascending cursor and traversing through its parent.

Agree on all points. This is due to current insertion semantics
requiring a cursor's parity to be stored in it, but conceptually
problematic, so thanks for pointing that out.

> The first case is fine, let the cursor have parity(), but why limit
> parity() to descending cursors? Every iterator that stores its parity
> can have parity(), right?
>
> For the latter, it might be better for the cursor to reveal its first
> sibling. We can then call "std::distance" ourselves, being aware of its
> complexity.

My approach would be to remove the current a.parity() from the
descending cursor requirements and add parity(a) to the ascending ones
instead (which would be implemented as distance(a.parent().begin(), a)
--- its complexity then is a function of the type of cursor that a
models). In relevant places, one might require guaranteed constant --
or just more efficient -- specializations of that algorithm (maybe even
for descending cursors), in a "freestanding calls member function" way
of doing things (as swap() does in the standard).

> About parent(). Well, again, it is linear. I understand, that a general
> tree with parent pointers would only have access to the first sibling of
> its parent. Why not provide the first_sibling_of_parent() member
> function (which would have constant complexity) and give parent() only
> to those cursors, which can get it in constant time?

Again because of different possible implementations. What we should
probably do is add requirements to such important things as binary or
nary_trees for their cursors' parent() member functions to be O(1).

> I know, that I make the cursor/iterator hierarchy quite complicated.
> (Well, not really. After all, the parent-knowing iterator is only a
> refinement of the ascending iterator, parity-aware iterator and
> random-access iterator. :-) ) But this just seems little more coherent.

Well, here I tend to disagree...
I think parity-aware shouldn't really be a concept (for reasons pointed
out above). And as for parent-knowing, I'm a little confused --- that's
precisely what ascending is, no?

> The last point (a defect maybe) is that technically, descending
> bidirectional cursor is not an input iterator. That's because objects of
> descending_bidirectional_cursor_tag are not convertible to
> input_iterator_tag --- the conversion is ambiguous.

Okay, I'll need to look into this and find out how to work around that.
Thanks for notifying.

Bernhard Reiter


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]





Author: =?ISO-8859-1?Q?Martin_Vejn=E1r?= <avakar@volny.cz>
Date: Fri, 17 Nov 2006 14:08:57 CST
Raw View
Bernhard Reiter wrote:
[snipped an overview of N2101 proposal]
> Find proposal WG21/N2101=J16/06-0171 at
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2101.html
> The relevant part of the Boost GSoC wiki is at
> https://boost-consulting.com:8443/trac/soc/wiki/tree
> The current (buggy!) state of the implementation can be downloaded via
> svn from
> https://www.boost-consulting.com:8443/svn/main/boost/soc/2006/tree/trunk
> and browsed online at
> https://boost-consulting.com:8443/trac/soc/browser/boost/soc/2006/tree/trunk
[...]
> Any comments, especially with regard to the issues as described above,
> are highly welcome.

I am currently working on my bachelor's thesis on the implementation of
spatial index structures in C++. That includes an R-tree --- and that's
where your post became quite interesting to me. As a big fan of
doing-things-generically I eagerly browsed through your proposal. It's
great. I happen to have a few comments though. Before I start, I want to
make clear, that I consider your work brilliant. I spent a lot of time
trying to figure out some feasible concept of iterator-based tree
traversing and N2101 seems like a gift from the heavens :-).

In the proposal, you have introduced a concept of a cursor, which is in
fact an iterator. The question which arises is why you have chosen to
name an iterator with a new name "cursor". If I read the proposal
correctly, a (non-descending) cursor is equivalent to a forward
iterator, save some minor differences like cursor's requirement for a
member "cursor" typedef.

Note, that in the current standard, bidirectional iterators extend
forward iterators by adding another direction in which the iterator can
be advanced. Cursors and bidirectional iterators are very similar when
it comes to refining forward iterators. They just happen to choose
different directions --- where bidirectional iterator goes backwards,
cursor descends downwards.

Ironically, in section 24.4.X, a tree traversal structures are
mislabeled as "iterators", which they aren't, since they break
iterators' constraint on constant time complexity. (I also don't like
separate namespaces for different kinds of traversals; why not
preorder_iterator, postorder_iterator etc., or better yet
preorder_somethingelse?).

To summarize, I'd vote for renaming cursors to iterators. That would
leave us with, for example, forward iterators, forward descending
iterators, bidirectional ascending iterators etc.

On another subject, there are two operations on cursors that have (at
most) linear complexity --- thats a.parity() and a.parent(). I strongly
believe, that cursors (and iterators) should have their operations constant.

This parity() beast is especially confusing to me --- it is supposed to
return a position of the cursor amongst its siblings. From my point of
view (applying the cursor-is-an-iterator theorem), that's like asking
for position of an iterator within its controlling sequence. That seems
to be more of an algorithmical issue and should be handled by a
free-standing function. Besides, to implement parity(), the cursor
1. would either needed to know its position already (in which case
parity() would be constant-time)
2. would have to have the ability to retrieve the beginning of a
sequence of its siblings, either by storing the relevant cursor or by
being an ascending cursor and traversing through its parent.

The first case is fine, let the cursor have parity(), but why limit
parity() to descending cursors? Every iterator that stores its parity
can have parity(), right?

For the latter, it might be better for the cursor to reveal its first
sibling. We can then call "std::distance" ourselves, being aware of its
complexity.

About parent(). Well, again, it is linear. I understand, that a general
tree with parent pointers would only have access to the first sibling of
its parent. Why not provide the first_sibling_of_parent() member
function (which would have constant complexity) and give parent() only
to those cursors, which can get it in constant time?

I know, that I make the cursor/iterator hierarchy quite complicated.
(Well, not really. After all, the parent-knowing iterator is only a
refinement of the ascending iterator, parity-aware iterator and
random-access iterator. :-) ) But this just seems little more coherent.

The last point (a defect maybe) is that technically, descending
bidirectional cursor is not an input iterator. That's because objects of
descending_bidirectional_cursor_tag are not convertible to
input_iterator_tag --- the conversion is ambiguous.

Best regards,
--
Martin Vejnar

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]