Topic: Complexity of std::list<T>::size()


Author: pavel_vozenilek@yahoo.co.uk (Pavel Vozenilek)
Date: Thu, 29 May 2003 04:48:17 +0000 (UTC)
Raw View
jgottman@carolina.rr.com ("Joe Gottman") wrote in message news:<nEXwa.55040
[snip]
> > The choice is actually between having size() run in constant time and
> > having this last case of splice() run in constant time.
>
>    There's another possible choice.  Keep an private mutable Boolean member
> variable size_is_known_ .  When the problem splice() is called, set
> size_is_known_ to false.  When size() is called, if size_is_known_ is false
> recalculate and cache the size of the list, and set size_is_known_ to true.
> This way we have constant-time splice(), and we also have constant time
> size() as long as the problem splice() was not called.
>
The mutable member is not needed: after splice() the 'm_size' value
can be set to value -1024 * 1024 * 1024 (-1G). If 'm_size' is
negative, size() must calculate real size (and set 'm_size').
Functions like push_back() will preserve 'm_size' negative-ness.

/Pavel

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: AlbertoBarbati@libero.it (Alberto Barbati)
Date: Thu, 29 May 2003 04:49:34 +0000 (UTC)
Raw View
Bo Persson wrote:
> Unfortunately, if you move a segment from one list to another,  you loose
> count of the number of elements in each list. That is the conflict - when
> are you going to count the elements, during the splice or whenever you need
> to know the size? What if you never need the size? What if you never do a
> splice?

What I am going to say may be obvious to many of you, but there is a
simple solution to this problem. Suppose the list keeps its size in a
data member. Such a counter can be kept up-to-date in constant time for
all list operations except:

1) range ctor/assign()/insert()
2) range erase()
3) four-argument splice() when &x!=this

Cases #1 and #2 already take linear time, so counting the element won't
add complexity to those operations (btw, in case the range is described
by random iterators, it is not even necessary to do the count).

In case #3 the counter is set to a special value meaning "size unknown"
and is no longer kept up-to-date.

Method size() will return the value of the counter if it's valid or
count the element otherwise. In the latter case, the counter is reset to
the computed value.

Summarizing, with such an implementation we have:

1) splice() always takes linear time
2) size() always takes linear time unless a "difficult" splice is
performed. In this case the first call to size() *only* takes linear time.

How does it sound? Are there STL implementations using this approach? Of
course, given an implementation that doesn't cache the size it's very
easy to provide a size-caching wrapper, while you can't do the other way
round. So I don't blame implementors that don't provide such feature.

Alberto Barbati

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: bop2@telia.com ("Bo Persson")
Date: Thu, 29 May 2003 15:54:27 +0000 (UTC)
Raw View
"Alberto Barbati" <AlbertoBarbati@libero.it> skrev i meddelandet
news:XrbBa.41987$lK4.1260046@twister1.libero.it...
> Bo Persson wrote:
> > Unfortunately, if you move a segment from one list to another,  you
loose
> > count of the number of elements in each list. That is the conflict -
when
> > are you going to count the elements, during the splice or whenever you
need
> > to know the size? What if you never need the size? What if you never do
a
> > splice?
>
> What I am going to say may be obvious to many of you, but there is a
> simple solution to this problem. Suppose the list keeps its size in a
> data member. Such a counter can be kept up-to-date in constant time for
> all list operations except:
>
> 1) range ctor/assign()/insert()
> 2) range erase()
> 3) four-argument splice() when &x!=this
>
> Cases #1 and #2 already take linear time, so counting the element won't
> add complexity to those operations (btw, in case the range is described
> by random iterators, it is not even necessary to do the count).
>
> In case #3 the counter is set to a special value meaning "size unknown"
> and is no longer kept up-to-date.
>
> Method size() will return the value of the counter if it's valid or
> count the element otherwise. In the latter case, the counter is reset to
> the computed value.
>
> Summarizing, with such an implementation we have:
>
> 1) splice() always takes linear time
> 2) size() always takes linear time unless a "difficult" splice is
> performed. In this case the first call to size() *only* takes linear time.
>
> How does it sound? Are there STL implementations using this approach? Of
> course, given an implementation that doesn't cache the size it's very
> easy to provide a size-caching wrapper, while you can't do the other way
> round. So I don't blame implementors that don't provide such feature.

It's one possible compromise, giving up some extremes for the price of a
slighly worse average. I haven't seen anyone doing it this way.

One problem might be that you sort of have to add some additional logic to
all other operations, to check if the current size is known, so they can
increment or decrement it. Or you can choose INT_MIN/2 for the invalid-size
value, and hope that it will stay negative even if you add or remove a large
number of nodes. Will it? How large can a std::list<char> really be anyway??
:-)



Bo Persson
bop2@telia.com

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: AlbertoBarbati@libero.it (Alberto Barbati)
Date: Fri, 30 May 2003 21:19:41 +0000 (UTC)
Raw View
Bo Persson wrote:
> [snip] How large can a std::list<char> really be anyway??
> :-)

Exactly the value returned by std::list<char>::max_size(), a value which
is implementation-dependent.

So the question is: will the user be upset if its favorite
implementation has a max_size() that returns INT_MAX/2 - 1?

:-))

Alberto Barbati

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: sksjava@hotmail.com ("sks_cpp")
Date: Tue, 27 May 2003 17:17:46 +0000 (UTC)
Raw View
> > Out of curiosity i checked the C++ Standard just to see what it says,
and
> > all i can see relating to this issue is:
> >     *    23.1 (table 65) - "[size()] should have constant complexity"
> >     *    23.2.2.4 - splice() must take constant time

How can you implement splicing to have constant time? Aren't you possibly
copying the list nodes from one list to another? What am I missing here?

Thanks.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: bop2@telia.com ("Bo Persson")
Date: Tue, 27 May 2003 23:36:05 +0000 (UTC)
Raw View
""sks_cpp"" <sksjava@hotmail.com> skrev i meddelandet
news:64nAa.149955$3n5.141897@news2.central.cox.net...
> > > Out of curiosity i checked the C++ Standard just to see what it says,
> and
> > > all i can see relating to this issue is:
> > >     *    23.1 (table 65) - "[size()] should have constant complexity"
> > >     *    23.2.2.4 - splice() must take constant time
>
> How can you implement splicing to have constant time? Aren't you possibly
> copying the list nodes from one list to another? What am I missing here?

Splice isn't copying the nodes, it is *moving* them from one position to
another, possibly even into another list. This can be done by just
rearranging some pointers at the end of the sublist moved.

Unfortunately, if you move a segment from one list to another,  you loose
count of the number of elements in each list. That is the conflict - when
are you going to count the elements, during the splice or whenever you need
to know the size? What if you never need the size? What if you never do a
splice?


>
> Thanks.
>


Bo Persson
bop2@telia.com

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: markDOTvandijk@perform-solDOTcom.ucar.edu ("Mark van Dijk")
Date: Thu, 15 May 2003 16:45:21 +0000 (UTC)
Raw View
 Hi there,  I posted this question to comp.lang.c++, but i didn't receive a
response there, so i'll try here...

According to Scott Meyers' Effective STL (Item 4), either
std::list<T>.size() or std::list<T>.splice() can run in constant time - but
not both.  He states that different library implimentations may choose
between these two options.

Out of curiosity i checked the C++ Standard just to see what it says, and
all i can see relating to this issue is:
    *    23.1 (table 65) - "[size()] should have constant complexity"
    *    23.2.2.4 - splice() must take constant time

So it seems to me that the only valid interpretation is that
std::list<>.size() takes liner time, and std::list<>.splice() takes constant
time.

Have i missed something?  Are library implementors free to choose between
these options or not?

Cheers
Mark van Dijk


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: petebecker@acm.org (Pete Becker)
Date: Thu, 15 May 2003 21:02:31 +0000 (UTC)
Raw View
Mark van Dijk wrote:
>
>  Hi there,  I posted this question to comp.lang.c++, but i didn't receive a
> response there, so i'll try here...
>
> According to Scott Meyers' Effective STL (Item 4), either
> std::list<T>.size() or std::list<T>.splice() can run in constant time - but
> not both.  He states that different library implimentations may choose
> between these two options.
>
> Out of curiosity i checked the C++ Standard just to see what it says, and
> all i can see relating to this issue is:
>     *    23.1 (table 65) - "[size()] should have constant complexity"
>     *    23.2.2.4 - splice() must take constant time
>
> So it seems to me that the only valid interpretation is that
> std::list<>.size() takes liner time, and std::list<>.splice() takes constant
> time.
>
> Have i missed something?  Are library implementors free to choose between
> these options or not?
>

Not free to, but forced to. You can't have both. Think about how you'd
update size when you do a splice...

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: do-not-spam-ben.hutchings@businesswebsoftware.com (Ben Hutchings)
Date: Thu, 15 May 2003 21:02:40 +0000 (UTC)
Raw View
In article <3ec3259e$1@news.iconz.co.nz>, "Mark van Dijk" wrote:
>  Hi there,  I posted this question to comp.lang.c++, but i didn't receive a
> response there, so i'll try here...
>
> According to Scott Meyers' Effective STL (Item 4), either
> std::list<T>.size() or std::list<T>.splice() can run in constant time - but
> not both.  He states that different library implimentations may choose
> between these two options.
>
> Out of curiosity i checked the C++ Standard just to see what it says, and
> all i can see relating to this issue is:
>     *    23.1 (table 65) - "[size()] should have constant complexity"
>     *    23.2.2.4 - splice() must take constant time
>
> So it seems to me that the only valid interpretation is that
> std::list<>.size() takes liner time, and std::list<>.splice() takes constant
> time.
>
> Have i missed something?  Are library implementors free to choose between
> these options or not?

I don't have that book, so I don't know if you've read it correctly.  As
you stated the choice, it's not quite correct.

There are four different cases to consider:

1. Splicing an entire list into another list: size increases by the size
   of the other list and the size of the other list becomes 0.
2. Splicing a single element into a list: size is unchanged if it's in
   the same list; otherwise the size increases by 1 and the size of the
   other list decreases by 1.
3a. Splicing an arbitrary range of elements to another position in the
    same list: size is unchanged.
3b. Splicing an arbitrary range of elements into another list: size
    increases by the length of the range.

If the list maintains its size as a member variable to allow size() to
run in constant time, then the first three cases can be handled in
constant time.  Only case 3b will require counting elements.  The
standard requires splice() to have constant time complexity except in
case 3b where it is allowed to have linear time complexity.

The choice is actually between having size() run in constant time and
having this last case of splice() run in constant time.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: jpotter@falcon.lhup.edu (John Potter)
Date: Thu, 15 May 2003 22:29:42 +0000 (UTC)
Raw View
On Thu, 15 May 2003 16:45:21 +0000 (UTC),
markDOTvandijk@perform-solDOTcom.ucar.edu ("Mark van Dijk") wrote:

> According to Scott Meyers' Effective STL (Item 4), either
> std::list<T>.size() or std::list<T>.splice() can run in constant time - but
> not both.  He states that different library implimentations may choose
> between these two options.

> Out of curiosity i checked the C++ Standard just to see what it says, and
> all i can see relating to this issue is:
>     *    23.1 (table 65) - "[size()] should have constant complexity"
>     *    23.2.2.4 - splice() must take constant time

There are three versions of splice.  The third, splice(pos, lis, first,
last) may be linear when &lis != this.

> Have i missed something?  Are library implementors free to choose between
> these options or not?

Yes they are.

John

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: jgottman@carolina.rr.com ("Joe Gottman")
Date: Fri, 16 May 2003 18:46:58 +0000 (UTC)
Raw View
"Ben Hutchings" <do-not-spam-ben.hutchings@businesswebsoftware.com> wrote in
message news:slrnbc7nif.16k.do-not-spam-ben.hutchings@tin.bwsint.com...
> There are four different cases to consider:
>
> 1. Splicing an entire list into another list: size increases by the size
>    of the other list and the size of the other list becomes 0.
> 2. Splicing a single element into a list: size is unchanged if it's in
>    the same list; otherwise the size increases by 1 and the size of the
>    other list decreases by 1.
> 3a. Splicing an arbitrary range of elements to another position in the
>     same list: size is unchanged.
> 3b. Splicing an arbitrary range of elements into another list: size
>     increases by the length of the range.
>
> If the list maintains its size as a member variable to allow size() to
> run in constant time, then the first three cases can be handled in
> constant time.  Only case 3b will require counting elements.  The
> standard requires splice() to have constant time complexity except in
> case 3b where it is allowed to have linear time complexity.
>
> The choice is actually between having size() run in constant time and
> having this last case of splice() run in constant time.

   There's another possible choice.  Keep an private mutable Boolean member
variable size_is_known_ .  When the problem splice() is called, set
size_is_known_ to false.  When size() is called, if size_is_known_ is false
recalculate and cache the size of the list, and set size_is_known_ to true.
This way we have constant-time splice(), and we also have constant time
size() as long as the problem splice() was not called.

Joe Gottman

Joe Gottman

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]