Topic: Inconsistent definition of erase() ?


Author: James Kuyper <kuyper@wizard.net>
Date: 1999/06/06
Raw View
Valentin Bonnard wrote:
>
> Ed Brey wrote:
> >
> > Valentin Bonnard <Bonnard.V@wanadoo.fr> wrote in message
> > news:37571058.4432@wanadoo.fr...
> > >
> > > I think it's a shame not to have extended AssocCont::erase the
> > > same way. I have heard some nonsensical arguments against
> > > AssocCont::erase returning an iterator but nothing serious.
> >
> > What do you think of the reasoning that ++it is often a more expensive
> > operation for an associative container than for a sequence container, and so
> > for an associative erase where the return value is not used, there would be
> > a performance penalty if it were provided?
>
> Actually this is the nonsensical argument I was talking about.
>
> > I'm thinking of the situation in a binary tree when ++it requires walking
> > all the way up or down a tree, which would make ++it a log2(n) operation.
>
> The next node in the tree is just one pointer away; every node has
> next and previous pointers. You may like it or not, but this is one
> of teh roots of the STL.

Expanding on that statement:

Associative containers are required to have bidirectional iterators.
Bidirectional iterators are required to provide constant-time ++ and
--.   Therefore, determining the next node cannot be a log2(n)
operation. Whether or not there's a per-node pointer to the next and
previous node is implementation detail, though that's the most obvious
way to meet this requirement.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]





Author: jpotter@falcon.lhup.edu (John Potter)
Date: 1999/06/06
Raw View
James Kuyper <kuyper@wizard.net> wrote:

: Valentin Bonnard wrote:
: >
: > The next node in the tree is just one pointer away; every node has
: > next and previous pointers. You may like it or not, but this is one
: > of teh roots of the STL.

: Expanding on that statement:

Which is incorrect.

: Associative containers are required to have bidirectional iterators.
: Bidirectional iterators are required to provide constant-time ++ and
: --.

Missing a key word, "amortized" constant time.

: Therefore, determining the next node cannot be a log2(n)
: operation.

It can as long as a complete traversal looks like constant time.
The same as vector::push_back which may be linear sometimes but
in the long run must be averaged to constant.  This requirement
forces grow to be multiplicative not additive.

: Whether or not there's a per-node pointer to the next and
: previous node is implementation detail, though that's the most obvious
: way to meet this requirement.

Never seen one.  The nodes already have left, right and parent
pointers.  Consider a perfectly balanced tree with seven nodes.
Begin must be constant time, so there must be a way to get to
the left most node without starting at the root and working
down the left side.  Now lets do the seven ++ operations to get
to end.  1 up to parent(1).  2 down to right(1).  3 up to parent
and up to root(2 lgN).  4 down right and down left(2 lgN).
5 up to parent(1).  6 down right(1).  7 up to parent and up to
root and up to end(3 lgN).  Total 11 / 7 is amortized constant
time of 2 pointer chases per ++.

In a complete traversal of a tree, each arc will be followed
in each direction once for a total of 2N.  Linear time traversal
gives amortized constant time increment.

Check your favorite rb-tree implementation sitting under the
associative containers.  You will not likely find any pred/succ
pointers.  Nor threads which would remove the up chases leaving
only the down chases.  These things add space and only transfer
the time to other places for maintaining them.  It is always a
space loss and usually an overall time loss.

John
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]





Author: jpotter@falcon.lhup.edu (John Potter)
Date: 1999/06/07
Raw View
Valentin Bonnard <Bonnard.V@wanadoo.fr> wrote:
: Ed Brey wrote:
: > Valentin Bonnard <Bonnard.V@wanadoo.fr> wrote in message
: > news:37571058.4432@wanadoo.fr...
: > >
: > > I think it's a shame not to have extended AssocCont::erase the
: > > same way. I have heard some nonsensical arguments against
: > > AssocCont::erase returning an iterator but nothing serious.

: > What do you think of the reasoning that ++it is often a more expensive
: > operation for an associative container than for a sequence container, and so
: > for an associative erase where the return value is not used, there would be
: > a performance penalty if it were provided?

: Actually this is the nonsensical argument I was talking about.

: > I'm thinking of the situation in a binary tree when ++it requires walking
: > all the way up or down a tree, which would make ++it a log2(n) operation.

: The next node in the tree is just one pointer away; every node has
: next and previous pointers. You may like it or not, but this is one
: of teh roots of the STL.

Not in the implementations that I have seen.  Every node has a left,
right, and parent pointer.  Moving from the rightmost node of the
left subtree to the root requires chasing parent pointers up the
tree.  That is why ++ is amortized constant time and not constant
time like the sequences.

: > Assuming the increment overhead is significant, the next question is whether
: > an optimizer would see that the return value is not used and avoid
: > performing the increment anyway.

: It wouldn't be a trivial optimisation, but since the loop would
: be something like:

: while (test)
:     assign auto variables

: and if it's inline than I think it's doable.

Let's see.  We just deleted a node.  If an internal node, it has
been replaced by its successor and we have the correct iterator.
If a leaf node, we must have its parent.  If left of parent we
have the iterator.  If right of parent we need to move up the
tree.
   do {
      c = p;
      p = c->parent;
      } while (p->right == c && p->parent != c);
   return iterator(p);

To optimize that away, it is required to know that the constructor
has no side effects.  Considering that this is just the end of the
nasty stuff needed to remove the node and rebalance the tree, it is
likely not inline.  Yes, it would be non-trivial.

John
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]





Author: hinnant@_anti-spam_metrowerks.com (Howard Hinnant)
Date: 1999/06/07
Raw View
In article <7j93h3$eiv3@interserv.etn.com>, "Ed Brey"
<brey@afd.mke.etn.com> wrote:

> What do you think of the reasoning that ++it is often a more expensive
> operation for an associative container than for a sequence container, and so
> for an associative erase where the return value is not used, there would be
> a performance penalty if it were provided?
>
> I'm thinking of the situation in a binary tree when ++it requires walking
> all the way up or down a tree, which would make ++it a log2(n) operation.
> Of course, this is worst case, and for most values of it, the walking would
> be much less.  Still, overall there may be enough overhead in incrementing
> that erase shouldn't take the time to do an increment just to have it be
> thrown away.

For a perfectly balanced, large binary tree the average cost of ++it is
twice that of going to an immediate neighbor (that is, amortized
constant).  This statement assumes that incrementing from the last node to
end() walks all the way up the tree to the root, and then up one more
level to a psuedo node.

Here is a short program that computes the average for several sizes of
tree, and the resulting output:

#include <cmath>
#include <iostream>

double
avg(int level)
{
   double a = 0;
   for (int i = 1; i <= level; ++i)
      a += i*2*std::pow(2.0, level-(1+i));
   a /= std::pow(2.0, level) - 1;
   return a;
}

int main()
{
   for (int i = 1; i <= 10; ++i)
      std::cout << "size = " << std::pow(2.0, i) - 1 << "\t\tavg = " <<
avg(i) << '\n';
}

size = 1        avg = 1
size = 3        avg = 1.33333
size = 7        avg = 1.57143
size = 15       avg = 1.73333
size = 31       avg = 1.83871
size = 63       avg = 1.90476
size = 127      avg = 1.94488
size = 255      avg = 1.96863
size = 511      avg = 1.98239
size = 1023     avg = 1.99022
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]





Author: jcoffin@taeus.com (Jerry Coffin)
Date: 1999/06/07
Raw View
In article <7j93h3$eiv3@interserv.etn.com>, brey@afd.mke.etn.com
says...

[ ... ]

> What do you think of the reasoning that ++it is often a more expensive
> operation for an associative container than for a sequence container, and so
> for an associative erase where the return value is not used, there would be
> a performance penalty if it were provided?

[ example using binary tree elided ]

If this were really a major concern, they could require that it
happen, AND that it happen in O(1) time.  One obvious implementation
that would satisfy the requirement would be a threaded tree.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]





Author: James Kuyper <kuyper@wizard.net>
Date: 1999/06/07
Raw View
John Potter wrote:
>
> James Kuyper <kuyper@wizard.net> wrote:
>
> : Valentin Bonnard wrote:
> : >
> : > The next node in the tree is just one pointer away; every node has
> : > next and previous pointers. You may like it or not, but this is one
> : > of teh roots of the STL.
>
> : Expanding on that statement:
>
> Which is incorrect.
>
> : Associative containers are required to have bidirectional iterators.
> : Bidirectional iterators are required to provide constant-time ++ and
> : --.
>
> Missing a key word, "amortized" constant time.

I deliberately left it out to avoid an unnecessary discussion of what
"amortized" means. In this context the important issue was whether the
increment was expensive.

> : Therefore, determining the next node cannot be a log2(n)
> : operation.
>
> It can as long as a complete traversal looks like constant time.
> The same as vector::push_back which may be linear sometimes but
> in the long run must be averaged to constant.  This requirement
> forces grow to be multiplicative not additive.

In this context, it's the amortized time that matters, and the amortized
time cannot be log2(n).

> : Whether or not there's a per-node pointer to the next and
> : previous node is implementation detail, though that's the most obvious
> : way to meet this requirement.
>
> Never seen one.  The nodes already have left, right and parent
> pointers.  Consider a perfectly balanced tree with seven nodes.
> Begin must be constant time, so there must be a way to get to
> the left most node without starting at the root and working
> down the left side.  Now lets do the seven ++ operations to get
> to end.  1 up to parent(1).  2 down to right(1).  3 up to parent
> and up to root(2 lgN).  4 down right and down left(2 lgN).
> 5 up to parent(1).  6 down right(1).  7 up to parent and up to
> root and up to end(3 lgN).  Total 11 / 7 is amortized constant
> time of 2 pointer chases per ++.
>
> In a complete traversal of a tree, each arc will be followed
> in each direction once for a total of 2N.  Linear time traversal
> gives amortized constant time increment.

I said it was "the most obvious way"; I emphasized that it was an
implementation detail. The important point is that erase() would have a
cheap way of returning an iterator to the next node, removing that as
reason for erase() not returning it.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]





Author: jpotter@falcon.lhup.edu (John Potter)
Date: 1999/06/08
Raw View
James Kuyper <kuyper@wizard.net> wrote:

: John Potter wrote:
: >
: > James Kuyper <kuyper@wizard.net> wrote:
: >
: > : Associative containers are required to have bidirectional iterators.
: > : Bidirectional iterators are required to provide constant-time ++ and
: > : --.
: >
: > Missing a key word, "amortized" constant time.

: I deliberately left it out to avoid an unnecessary discussion of what
: "amortized" means.

OK.  I will leave that to the other new thread.

: In this context the important issue was whether the
: increment was expensive.

: > : Therefore, determining the next node cannot be a log2(n)
: > : operation.
: >
: > It can as long as a complete traversal looks like constant time.
: > The same as vector::push_back which may be linear sometimes but
: > in the long run must be averaged to constant.  This requirement
: > forces grow to be multiplicative not additive.

: In this context, it's the amortized time that matters, and the amortized
: time cannot be log2(n).

Accepted.  Erase is only amortized constant time to start with and
could be log2(n) for a particular case.  Adding another log2(n) is
not expensive anyway.  It does not change either the worst case or
average case complexity.

The original objection to having erase return an iterator was based
on worst case behavior.  Your argument states that worst case is
not important and the decision should be made on average behavior.
I agree with your analysis which invalidates the original objection.

John
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]





Author: Philip Brabbin <pabrabbin@hotmail.com>
Date: 1999/06/03
Raw View
A while ago, I noticed that erase() is defined differently for
sequences and associative containers, in that there's a version which
returns an iterator to the next element for sequences but not for
associative containers.

I recently found out that originally both versions had void return
type, and the behaviour for sequences was extended to return the next
iterator.  Does anybody know the reason for this ?  I can't immediately
think why the behaviour can't be extended for associative containers as
well.

- Phil Brabbin


Sent via Deja.com http://www.deja.com/
Share what you know. Learn what you don't.


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: Valentin Bonnard <Bonnard.V@wanadoo.fr>
Date: 1999/06/03
Raw View
Philip Brabbin wrote:

> A while ago, I noticed that erase() is defined differently for
> sequences and associative containers, in that there's a version which
> returns an iterator to the next element for sequences but not for
> associative containers.

Correct

> I recently found out that originally both versions had void return
> type, and the behaviour for sequences was extended to return the next
> iterator.  Does anybody know the reason for this ?  I can't immediately
> think why the behaviour can't be extended for associative containers as
> well.

I think it's a shame not to have extended AssocCont::erase the
same way. I have heard some nonsensical arguments against
AssocCont::erase returning an iterator but nothing serious.

Note that for associative containers you cannot write:

    it = m.erase (it);

but you can write

    m.erase (it++);

with the same effects.

BTW, erase in the Dinkumware returns an iterator for
associative containers.

--

Valentin Bonnard
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]





Author: "Ed Brey" <brey@afd.mke.etn.com>
Date: 1999/06/04
Raw View
Valentin Bonnard <Bonnard.V@wanadoo.fr> wrote in message
news:37571058.4432@wanadoo.fr...
>
> I think it's a shame not to have extended AssocCont::erase the
> same way. I have heard some nonsensical arguments against
> AssocCont::erase returning an iterator but nothing serious.

What do you think of the reasoning that ++it is often a more expensive
operation for an associative container than for a sequence container, and so
for an associative erase where the return value is not used, there would be
a performance penalty if it were provided?

I'm thinking of the situation in a binary tree when ++it requires walking
all the way up or down a tree, which would make ++it a log2(n) operation.
Of course, this is worst case, and for most values of it, the walking would
be much less.  Still, overall there may be enough overhead in incrementing
that erase shouldn't take the time to do an increment just to have it be
thrown away.

Assuming the increment overhead is significant, the next question is whether
an optimizer would see that the return value is not used and avoid
performing the increment anyway.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]





Author: Valentin Bonnard <Bonnard.V@wanadoo.fr>
Date: 1999/06/04
Raw View
Ed Brey wrote:
>
> Valentin Bonnard <Bonnard.V@wanadoo.fr> wrote in message
> news:37571058.4432@wanadoo.fr...
> >
> > I think it's a shame not to have extended AssocCont::erase the
> > same way. I have heard some nonsensical arguments against
> > AssocCont::erase returning an iterator but nothing serious.
>
> What do you think of the reasoning that ++it is often a more expensive
> operation for an associative container than for a sequence container, and so
> for an associative erase where the return value is not used, there would be
> a performance penalty if it were provided?

Actually this is the nonsensical argument I was talking about.

> I'm thinking of the situation in a binary tree when ++it requires walking
> all the way up or down a tree, which would make ++it a log2(n) operation.

The next node in the tree is just one pointer away; every node has
next and previous pointers. You may like it or not, but this is one
of teh roots of the STL.

> Assuming the increment overhead is significant, the next question is whether
> an optimizer would see that the return value is not used and avoid
> performing the increment anyway.

It wouldn't be a trivial optimisation, but since the loop would
be something like:

while (test)
    assign auto variables

and if it's inline than I think it's doable.

--

Valentin Bonnard


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]