Thread

Topic: Why doesn't list::splice always execute in constant time

Author: phalpern@newview.org (Pablo Halpern)
Date: 1999/11/03 Raw View

Valentin Bonnard <Bonnard.V@wanadoo.fr> wrote:

>Pablo Halpern wrote:
>
>> Ah. In that case, perhaps the standard should clarify that the splice
>> operation can take linear time but does not call any copy operations on
>> T. The thing that scared me was the prospect that splice would actually
>> copy the elements (which could be very expensive). Just counting them
>> doesn't bother me as much.
>
>If the allocators aren't compatibles, then the list has
>to copy the elements anyway.

Hmmm. The two lists must have the same allocator type, accoring to the
definition of splice(), and the implementation is permitted to assume
that the allocators compare equal [section 20.1.5]. If the
implementation does not make that assumption, however, you are correct.
-------------------------------------------------------------
Pablo Halpern                            phalpern@newview.org

I am self-employed. Therefore, my opinions *do* represent
those of my employer.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Vesa A J Karvonen <vkarvone@cc.helsinki.fi>
Date: 1999/11/03 Raw View

Bill Wade <bill.wade@stoner.com> wrote:
> Vesa A J Karvonen wrote in message <7upkrm$2fg$1@oravannahka.helsinki.fi>...
> >If you use a combination of lazy and on-time evaluation, you can make it
> >so that repeated uses of list<>::size() are constant time and
> >list<>::splice() is always constant time.

> I don't see how to make M invocations of size() and splice() take O(M) time.
[snip]

Perhaps I was too unclear. I meant that repeated uses of size(), without
a splice() in between, would be constant time. Although I have no
statistics, I believe that loops with sequence splice()s between two
distinct lists and size() occure extremely rarely. Note that only the
sequence splice() operation (the one which takes three iterators) needs
to invalidate the cached size and only if the splice happens between two
distinct lists.

[snip - loop with splice and size]
> The solution you gave causes the loop to take O(M*N) time, where N is the
> combined size of the lists.

In some cases, you could get away with O(M) time (e.g. when the splice
is one element). However, the loop that you presented was rather
artificial. Can you come up with a reasonable application for such a
loop?

[snip]
> I think there are plenty of reasonable applications that would want O(1)
> list::size().  In queue applications we might want to add a processor when a
> list gets too big.  We might want to remove a processor when its queue gets
> too small (and splice its remaining elements into some other processor's
> queue).  A new customer may want to quickly find the shortest queue.

If you think about it carefully, you will notice that you need to find
both the first and last iterator in order to perform the splice. This
means that you may have to traverse O(n) elements. If you only perform
the splice infrequently, then the cached size scheme will give you
amortized constant time size().

> A stronger argument, IMO, for O(1) splice is that if splice is O(1) and size
> is O(N), it is reasonably easy for me to write a wrapper that reverses the
> O() behavior of the two functions.  On the other hand if splice is O(N) and
> size is O(1) I have to write my own list class entirely from scratch to get
> the other behavior.

That is certainly a good reason.

---
Vesa Karvonen

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: phalpern@newview.org (Pablo Halpern)
Date: 1999/10/30 Raw View

scorp@btinternet.com (Dave Harris) wrote:

>phalpern@newview.org (Pablo Halpern) wrote:
>> Making splice execute in constant time seems like a reasonable
>> requirement to put in the standard. Any idea why it is allowed to
>> execute in linear time?
>
>The problem is list::size(). Some implementations want size() to be O(1),
>which they get by storing its value explicitly and updating it whenever
>items are added and removed. For the third version of splice, the only way
>to know how many items got added is to count them. Hence splice can be
>O(N).

Ah. In that case, perhaps the standard should clarify that the splice
operation can take linear time but does not call any copy operations on
T. The thing that scared me was the prospect that splice would actually
copy the elements (which could be very expensive). Just counting them
doesn't bother me as much.

-------------------------------------------------------------
Pablo Halpern                            phalpern@newview.org

I am self-employed. Therefore, my opinions *do* represent
those of my employer.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Valentin Bonnard <Bonnard.V@wanadoo.fr>
Date: 1999/10/30 Raw View

Pablo Halpern wrote:

> Ah. In that case, perhaps the standard should clarify that the splice
> operation can take linear time but does not call any copy operations on
> T. The thing that scared me was the prospect that splice would actually
> copy the elements (which could be very expensive). Just counting them
> doesn't bother me as much.

If the allocators aren't compatibles, then the list has
to copy the elements anyway.

--

Valentin Bonnard
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Bill Wade" <bill.wade@stoner.com>
Date: 1999/10/28 Raw View

Vesa A J Karvonen wrote in message <7upkrm$2fg$1@oravannahka.helsinki.fi>...
>If you use a combination of lazy and on-time evaluation, you can make it
>so that repeated uses of list<>::size() are constant time and
>list<>::splice() is always constant time.

I don't see how to make M invocations of size() and splice() take O(M) time.
Consider the loop
   for(i = 0; i < M; ++i)
   {
      splice from list1 to list2
      list2.size();
      splice from list2 to list1
      list1.size();
   }

The solution you gave causes the loop to take O(M*N) time, where N is the
combined size of the lists.

>I would assert that list<>::splice() is more frequent than
>list<>::size() in applications that use lists properly. The only real
>use for list<>::size(), I can think of, is when you want to reserve
>space for copying the list contents into a random access container.

I think there are plenty of reasonable applications that would want O(1)
list::size().  In queue applications we might want to add a processor when a
list gets too big.  We might want to remove a processor when its queue gets
too small (and splice its remaining elements into some other processor's
queue).  A new customer may want to quickly find the shortest queue.

A stronger argument, IMO, for O(1) splice is that if splice is O(1) and size
is O(N), it is reasonably easy for me to write a wrapper that reverses the
O() behavior of the two functions.  On the other hand if splice is O(N) and
size is O(1) I have to write my own list class entirely from scratch to get
the other behavior.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Vesa A J Karvonen <vkarvone@cc.helsinki.fi>
Date: 1999/10/25 Raw View

Dave Harris <scorp@btinternet.com> wrote:
> phalpern@newview.org (Pablo Halpern) wrote:
> > Making splice execute in constant time seems like a reasonable
> > requirement to put in the standard. Any idea why it is allowed to
> > execute in linear time?

> The problem is list::size(). Some implementations want size() to be O(1),
> which they get by storing its value explicitly and updating it whenever
> items are added and removed. For the third version of splice, the only way
> to know how many items got added is to count them. Hence splice can be
> O(N).

> > Is this a defect?

> Well, I think it's a mistake, but many people disagree. They like the O(1)
> size. However, size is not actually guaranteed to be O(1), so either way
> the standard is too loose (in my opinion). There's an interview somewhere
> with Stepanov, in which he indicates he'd have preferred the O(1) splice.

> On the other hand, I very much doubt this situation was due to oversight;
> I think the committee wanted it this way.

If you use a combination of lazy and on-time evaluation, you can make it
so that repeated uses of list<>::size() are constant time and
list<>::splice() is always constant time. Basically, what you do is that
you simply mark the size 'unknown' in list<>::splice(). Then you
implement each list<> operation so that they update the size, but do not
convert an unknown size into a known size. Finally, you make it so that
list<>::size() returns either the known size or recomputes the size when
it is unknown.

A highly efficient way to do this is to use a signed type as large as
the unsigned size_type to hold the list size. When you want to mark the
size unknown, you just store half the negative maximum into the size
(i.e. numeric_limits<difference_type>::min()/2). Updating an unknown
size, i.e. negative, in any operation, as if it was known, can never
make the size known (i.e. positive or zero). Finally, in list<>::size(),
you check if the size is negative, and if so, recompute the size. The
only extra overhead in this scheme is the one extra conditional in
list<>::size() (and also in some other operations, but only in a very
few operations).

However, this probably isn't the reason why list<>::size() is not
guaranteed to be O(1). I would suspect that the reason is simply to
allow minimal size empty lists (i.e. without the cached size). This can
be very important in some applications (that create millions of lists).

I would assert that list<>::splice() is more frequent than
list<>::size() in applications that use lists properly. The only real
use for list<>::size(), I can think of, is when you want to reserve
space for copying the list contents into a random access container. When
you do that, there is little advantage to have constant time
list<>::size(), because the entire operation is linear time anyway.
Furthermore, the operation is linear time even without the reserve,
when the random access container grows exponentially, although it can
save considerable processing time.

---
Vesa Karvonen
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: phalpern@newview.org (Pablo Halpern)
Date: 1999/10/21 Raw View

In section 23.2.2.4 [lib.list.ops], the description of the third splice
function (splice(iterator position, list x, iterator first, iterator
last)) says that the complexity is constant time if &x == this and
linear time otherwise. This seems to defeat the point of a splice
function. (The same effect could be gotton in linear time with an insert
operation followed by an erase operation.) If list<T> is implemented as
a doubly-linked list, then splice should execute in constant time
regardless of whether the splice is between two lists or within a single
list. Making splice execute in constant time seems like a reasonable
requirement to put in the standard. Any idea why it is allowed to
execute in linear time? Is this a defect?

-------------------------------------------------------------
Pablo Halpern                            phalpern@newview.org

I am self-employed. Therefore, my opinions *do* represent
those of my employer.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: jpotter@falcon.lhup.edu (John Potter)
Date: 1999/10/21 Raw View

On 21 Oct 99 06:50:51 GMT, phalpern@newview.org (Pablo Halpern) wrote:

: In section 23.2.2.4 [lib.list.ops], the description of the third splice
: function (splice(iterator position, list x, iterator first, iterator
: last)) says that the complexity is constant time if &x == this and
: linear time otherwise. This seems to defeat the point of a splice
: function. (The same effect could be gotton in linear time with an insert
: operation followed by an erase operation.) If list<T> is implemented as
: a doubly-linked list, then splice should execute in constant time
: regardless of whether the splice is between two lists or within a single
: list. Making splice execute in constant time seems like a reasonable
: requirement to put in the standard. Any idea why it is allowed to
: execute in linear time? Is this a defect?

A feature :)

See also the discussion about list.size().  Making size execute in
constant time seems like a reasonable requirement to put in the
standard.  It's not there.  If size is constant time, then splice is
linear to adjust the two sizes.  If size is linear, the splice is
constant because there is no size_ member.  If they are both linear,
it's QoI.  If they are both constant, it's ???

John
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: scorp@btinternet.com (Dave Harris)
Date: 1999/10/21 Raw View

phalpern@newview.org (Pablo Halpern) wrote:
> Making splice execute in constant time seems like a reasonable
> requirement to put in the standard. Any idea why it is allowed to
> execute in linear time?

The problem is list::size(). Some implementations want size() to be O(1),
which they get by storing its value explicitly and updating it whenever
items are added and removed. For the third version of splice, the only way
to know how many items got added is to count them. Hence splice can be
O(N).

> Is this a defect?

Well, I think it's a mistake, but many people disagree. They like the O(1)
size. However, size is not actually guaranteed to be O(1), so either way
the standard is too loose (in my opinion). There's an interview somewhere
with Stepanov, in which he indicates he'd have preferred the O(1) splice.

On the other hand, I very much doubt this situation was due to oversight;
I think the committee wanted it this way.

  Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
      brangdon@cix.co.uk      |   And close your eyes with holy dread,
                              |  For he on honey dew hath fed
 http://www.bhresearch.co.uk/ |   And drunk the milk of Paradise."
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]