Topic: computational complexity in STL
Author: jcoffin@taeus.com (Jerry Coffin)
Date: 1998/07/03
In article <359AD2BC.6190@noSPAM.central.beasys.com>,
david.tribble@noSPAM.central.beasys.com says...
[ ... ]
> Jerry Coffin wrote:
> > You're missing the point: the standard doesn't USE big-O notation.
> > Instead, it specifies things like (quoting from CD2 in the description
> > of sort): "Complexity: Approximately NlogN (where N==last-first)
> > comparisons on average."
[ ... ]
> The standard uses the term "complexity", as in "algorithmic
> complexity". You can't change the measure of the complexity of an
> algorithm simply by changing the logarithm bases. An algorithm
> exhibits logarithmic (or linear, or quadratic, or exponential, or
> whatever) complexity no matter how you measure it, and no matter
> how big N is.
It uses "Complexity" simply as a label. The only part that looks to
me like it can be enforced is what follows: the bound on the number
of comparisons that may be made.
> Or think of it this way: divide log(time(N2)) by log(time(N1)) and
> you'll see a linear relationship if it has logarithmic complexity.
> The division will cancel out the time units used, and the base of
> the logarithm used will drop out as well. Complexity is really
> a characterization of ratios, and as such has no units or number
> base.
Yes, I'm well aware of what they intended, and how algorithmic
complexity is normally computed. The question is whether the wording
in the standard actually enforces anything of the sort. I'm
reasonably certain that the answer is NO, it does not.
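
For what it's worth, the comparison count is also the one thing a test
harness can measure directly. Here is a small sketch (mine, not from
the thread; the sizes and the counting functor are invented for
illustration) that counts the comparisons std::sort actually makes and
divides by N*log2(N):

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Comparator that counts how often it is invoked.
struct CountingLess {
    std::size_t* count;
    explicit CountingLess(std::size_t* c) : count(c) {}
    bool operator()(int a, int b) const { ++*count; return a < b; }
};

int main() {
    const std::size_t sizes[] = {1000, 10000, 100000};
    for (int i = 0; i < 3; ++i) {
        std::size_t n = sizes[i];
        std::vector<int> v(n);
        for (std::size_t j = 0; j < n; ++j) v[j] = std::rand();

        std::size_t count = 0;
        std::sort(v.begin(), v.end(), CountingLess(&count));

        // Comparisons per N*log2(N); a different base would only
        // rescale this constant, which is exactly the point at issue.
        double nlog2n = n * (std::log((double)n) / std::log(2.0));
        std::printf("N=%6lu comparisons=%8lu ratio=%.3f\n",
                    (unsigned long)n, (unsigned long)count,
                    count / nlog2n);
    }
    return 0;
}

On typical implementations the ratio hovers near a small constant,
which is presumably what "approximately N log N comparisons" is after
- but nothing in the wording pins that constant down.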
--
Later,
Jerry.
The Universe is a figment of its own imagination.
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html ]
Author: Christopher Eltschka <celtschk@physik.tu-muenchen.de>
Date: 1998/07/03
David R Tribble wrote:
[...]
> The standard uses the term "complexity", as in "algorithmic
> complexity". You can't change the measure of the complexity of an
> algorithm simply by changing the logarithm bases. An algorithm
> exhibits logarithmic (or linear, or quadratic, or exponential, or
> whatever) complexity no matter how you measure it, and no matter
> how big N is.
>
> Besides, algorithmic complexity really only means something when
> you COMPARE the time/space requirements for different values of N.
> If log(time(N1)) = K*log(time(N2)) for all N1, N2 above a minimum
> arbitrary value, then the routine exhibits logarithmic complexity.
>
> Or think of it this way: divide log(time(N2)) by log(time(N1)) and
> you'll see a linear relationship if it has logarithmic complexity.
> The division will cancel out the time units used, and the base of
> the logarithm used will drop out as well. Complexity is really
> a characterization of ratios, and as such has no units or number
> base.
>
> So it doesn't matter what the logarithm base is.
Since an arbitrary constant may be added (O(log N) is the same as
O(log N) + const.), the quotient alone would not be sufficient.
Instead, you'd have to fit a curve C1 + C2*log(N) to your container's
timings and see how well it fits (and even that may not be sufficient
in all cases: in the linear case, for example, an O(N) routine could
take C1*N + C2*log(N) + C3*sqrt(N) + C4*N^(1/10) + C5, and the
dominant term would still be N).
However, viewed over a small enough range, the logarithm curve is
nearly indistinguishable from a linear curve.
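
For illustration, here is one way to do that fit (a sketch with
made-up timing numbers, not measurements of any real container): an
ordinary least-squares fit of t(N) = C1 + C2*log(N), where a small
residual is evidence of logarithmic growth and a change of log base
only rescales C2:

#include <cmath>
#include <cstdio>

int main() {
    const double N[] = {1e3, 1e4, 1e5, 1e6};
    const double t[] = {3.1, 4.0, 5.2, 6.1};   // invented timings
    const int n = 4;

    // Simple linear regression of t against x = log(N).
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (int i = 0; i < n; ++i) {
        double x = std::log(N[i]);
        sx += x; sy += t[i]; sxx += x * x; sxy += x * t[i];
    }
    double c2 = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    double c1 = (sy - c2 * sx) / n;

    // Sum of squared residuals: how well C1 + C2*log(N) explains t.
    double resid = 0;
    for (int i = 0; i < n; ++i) {
        double e = t[i] - (c1 + c2 * std::log(N[i]));
        resid += e * e;
    }
    std::printf("C1=%.3f C2=%.3f residual=%.6f\n", c1, c2, resid);
    return 0;
}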
What if a vector decided to expand by a factor of 1.001, rounding the
result up? This would satisfy the constant-time condition (since for
large N, the extra allocated memory is proportional to the memory
already allocated), but for size() < 1000 it would expand the
container by one single element. This can be optimized by using
if (size < 1000) allocate_1_extra();
else allocate_1point001_times_size();
and would therefore still be conforming, since the behaviour has not
changed from the previous version.
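
In code, the policy just described might look like this (a sketch;
the allocate_* names are the ones invented above, not real library
calls, and real implementations grow by much larger factors):

#include <cstddef>

// Grow by one element below 1000; at or above 1000, grow by a factor
// of 1.001 rounded up, so the extra space is proportional to the
// current size and the amortized-constant-time requirement is
// formally met.
std::size_t next_capacity(std::size_t n) {
    if (n < 1000)
        return n + 1;                 // allocate_1_extra()
    // allocate_1point001_times_size(): n + ceil(n / 1000)
    return n + (n + 999) / 1000;
}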
Now, if due to resource limits the container cannot be larger than
1000, the test can well be omitted, and allocate_1_extra() can be
used directly - still not changing the behaviour of the container.
Therefore, this container should be conforming, since its behaviour
cannot be distinguished from the very first, standard-conforming
container by any conforming program (inside the resource limits, both
allocate only the one extra element they need; the first one uses an
algorithm as demanded by the standard, but its O(1) amortized
behaviour for adding an element becomes visible only far beyond the
resource limits). Since no conforming program can tell the difference
from the last container, the last container is conforming under the
as-if rule.
I'm sure this is not the intention of the standard, though.
Author: David R Tribble <david.tribble@noSPAM.central.beasys.com>
Date: 1998/07/02
Jerry Coffin wrote ...
>>>Well, technically, you don't even need to get into defining
>>>"approximately" -- the standard says things like "N Log N", but
>>>doesn't define the base of the logarithm. By choosing the
>>>appropriate
>>>base for the logarithm, you can more or less pick arbitrary numbers
>>>and make them fit...
ross.s@ihug.co.nz wrote:
>> I'm afraid not. Logs to different bases differ only by a constant
>> factor. In terms of big-O notation, which ignores constant factors,
>> any expression involving "log N" represents the same complexity
>> regardless of what the base of logs is.
Jerry Coffin wrote:
> You're missing the point: the standard doesn't USE big-O notation.
> Instead, it specifies things like (quoting from CD2 in the description
> of sort): "Complexity: Approximately NlogN (where N==last-first)
> comparisons on average."
>
> Note that neither Big-O notation, NOR a base of the logarithm is
> specified. For any finite sized set, you can make virtually any
> algorithm meet the specification simply by selecting the correct base
> of the logarithm.
>
> In short, Big-O notation refers to a purely theoretical situation
> where you're dealing with sets of arbitrary size. The standard, of
> necessity, restricts itself to dealing with sets of finite size. As
> long as the sizes remain finite, and the base of the logarithm remains
> unspecified, the result is that virtually any algorithm (technically)
> fits the standard's specification for most of the sorting algorithms.
The standard uses the term "complexity", as in "algorithmic
complexity". You can't change the measure of the complexity of an
algorithm simply by changing the logarithm bases. An algorithm
exhibits logarithmic (or linear, or quadratic, or exponential, or
whatever) complexity no matter how you measure it, and no matter
how big N is.
Besides, algorithmic complexity really only means something when
you COMPARE the time/space requirements for different values of N.
If log(time(N1)) = K*log(time(N2)) for all N1, N2 above a minimum
arbitrary value, then the routine exhibits logarithmic complexity.
Or think of it this way: divide log(time(N2)) by log(time(N1)) and
you'll see a linear relationship if it has logarithmic complexity.
The division will cancel out the time units used, and the base of
the logarithm used will drop out as well. Complexity is really
a characterization of ratios, and as such has no units or number
base.
So it doesn't matter what the logarithm base is.
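
A quick numeric illustration of that base-independence (a standalone
sketch, not from the original post): the ratio log2(N)/log10(N) is the
same constant for every N, so switching bases rescales a log-shaped
curve but never changes its shape.

#include <cmath>
#include <cstdio>

int main() {
    for (double n = 10; n <= 1e6; n *= 10) {
        double log2n  = std::log(n) / std::log(2.0);
        double log10n = std::log10(n);
        std::printf("N=%8.0f  log2(N)/log10(N) = %.6f\n",
                    n, log2n / log10n);
    }
    return 0;   // every line prints the same ratio, ~3.321928
}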
-- David R. Tribble, dtribble@technologist.com --
-- C++, the PL/1 of the 90s.
Author: jcoffin@taeus.com (Jerry Coffin)
Date: 1998/07/01
In article <6n99vu$7un$1@newsource.ihug.co.nz>, ross.s@ihug.co.nz
says...
> Jerry Coffin wrote ...
> >
> >Well, technically, you don't even need to get into defining
> >"approximately" -- the standard says things like "N Log N", but
> >doesn't define the base of the logarithm. By choosing the appropriate
> >base for the logarithm, you can more or less pick arbitrary numbers
> >and make them fit...
>
> I'm afraid not. Logs to different bases differ only by a constant
> factor. In terms of big-O notation, which ignores constant factors, any
> expression involving "log N" represents the same complexity regardless
> of what the base of logs is.
You're missing the point: the standard doesn't USE big-O notation.
Instead, it specifies things like (quoting from CD2 in the description
of sort): "Complexity: Approximately NlogN (where N==last-first)
comparisons on average."
Note that neither Big-O notation, NOR a base of the logarithm is
specified. For any finite sized set, you can make virtually any
algorithm meet the specification simply by selecting the correct base
of the logarithm.
In short, Big-O notation refers to a purely theoretical situation
where you're dealing with sets of arbitrary size. The standard, of
necessity, restricts itself to dealing with sets of finite size. As
long as the sizes remain finite, and the base of the logarithm remains
unspecified, the result is that virtually any algorithm (technically)
fits the standard's specification for most of the sorting algorithms.
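
To make the loophole concrete, here is a small sketch (mine; the
bound of 10^6 is arbitrary) of how a base can be chosen so that even
N^2 comparisons come out as "N log N" up to that bound: pick
b = N^(1/N), so that log_b(N) = N and hence N*log_b(N) = N*N.

#include <cmath>
#include <cstdio>

int main() {
    double n = 1e6;                   // finite bound on the set size
    double b = std::pow(n, 1.0 / n);  // base chosen to fit, ~1.0000138
    double logb_n = std::log(n) / std::log(b);
    std::printf("b = %.10f, log_b(N) = %.0f, N*log_b(N) = %.0f\n",
                b, logb_n, n * logb_n);   // prints N and N*N
    return 0;
}

On paper, then, a quadratic sort performs "approximately N log N"
comparisons for every N up to the bound, as long as the base is left
unspecified.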
--
Later,
Jerry.
The Universe is a figment of its own imagination.