Topic: should std::vector<> exponential growth rate be followed strictly in times of low available memory?


Author: tom_usenet@hotmail.com (Tom Widmer)
Date: Thu, 4 Nov 2004 22:29:30 GMT
On Wed, 27 Oct 2004 03:43:46 GMT, dhruvbird@gmx.net ("Dhruv Matani")
wrote:

>Yes, but it would obviously be a nonsense implementation as you have
>mentioned. I'm looking more at making it allowable by the standard in
>such low memory situations to deviate from the exponential growth
>policy.

And the standard already allows that, since it doesn't specify
"exponential growth", but rather amortized constant time push_back.

However, I don't think a 1GB+ std::vector is a very good idea in any
case. If you use malloc, then you get access to realloc, or you should
perhaps be using a custom allocator, or redesigning your algorithm,
perhaps to allow a segmented container of some kind, or to use less
RAM.
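
For a trivially copyable element type such as char, the realloc route
might look roughly like this (only a sketch of the idea, not a drop-in
replacement for std::vector; the buffer struct and growth constants are
made up):

#include <cstddef>   // std::size_t
#include <cstdlib>   // std::realloc, std::free
#include <new>       // std::bad_alloc

// realloc can often extend the block in place, so no copy is needed.
// This is only valid because char is trivially copyable.
// (Not copy-safe; sketch only.)
struct char_buffer {
    char* data;
    std::size_t size, cap;
    char_buffer() : data(0), size(0), cap(0) {}
    ~char_buffer() { std::free(data); }
};

void push_back(char_buffer& b, char c)
{
    if (b.size == b.cap) {
        std::size_t want = b.cap ? b.cap * 2 : 64;    // usual doubling
        void* p = std::realloc(b.data, want);
        if (!p) {                                     // low-memory fallback:
            want = b.cap + 1;                         // grow by just one
            p = std::realloc(b.data, want);
            if (!p) throw std::bad_alloc();
        }
        b.data = static_cast<char*>(p);
        b.cap = want;
    }
    b.data[b.size++] = c;
}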

Tom


Author: dhruvbird@gmx.net ("Dhruv Matani")
Date: Fri, 5 Nov 2004 06:18:04 GMT
On Thu, 04 Nov 2004 22:29:30 +0000, Tom Widmer wrote:

> On Wed, 27 Oct 2004 03:43:46 GMT, dhruvbird@gmx.net ("Dhruv Matani")
> wrote:
>
>>Yes, but it would obviously be a nonsense implementation as you have
>>mentioned. I'm looking more at making it allowable by the standard in
>>such low memory situations to deviate from the exponential growth
>>policy.
>
> And the standard already allows that, since it doesn't specify
> "exponential growth", but rather amortized constant time push_back.
>
> However, I don't think a 1GB+ std::vector is a very good idea in any
> case. If you use malloc, then you get access to realloc, or you should
> perhaps be using a custom allocator, or redesigning your algorithm,
> perhaps to allow a segmented container of some kind, or to use less
> RAM.

Yes, but in certain cases it becomes necessary to use such big vectors.
Like in my case, I want to use it as a replacement for
std::multimap<>, because:
1. The locality of the nodes is destroyed by the re-balancing algorithm,
and
2. Every insertion involves touching a lot of data, so in places where
data is coming in at a very fast rate, this is unacceptable.

So, I would rather use a vector and then sort it after all the
insertions are over (because the application happens to be amenable to
that change).
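
Roughly, the pattern I have in mind is something like this (just a
sketch; the entry type, key type and comparison are made up for
illustration):

#include <algorithm>
#include <utility>
#include <vector>

typedef std::pair<int, int> entry;   // (key, value) -- made-up types

struct by_key {
    bool operator()(const entry& a, const entry& b) const
    { return a.first < b.first; }
};

void example(const entry* first, const entry* last, int some_key)
{
    std::vector<entry> v;

    // Phase 1: bulk insertion -- contiguous storage, no rebalancing.
    v.insert(v.end(), first, last);             // or push_back in a loop

    // Phase 2: one sort after all the insertions are over.
    // (std::stable_sort if the relative order of equal keys matters.)
    std::sort(v.begin(), v.end(), by_key());

    // Phase 3: multimap-style lookup via equal_range on the sorted data.
    std::pair<std::vector<entry>::iterator,
              std::vector<entry>::iterator> r =
        std::equal_range(v.begin(), v.end(), entry(some_key, 0), by_key());
    // [r.first, r.second) spans every entry whose key == some_key.
    (void)r;
}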

Here, I know in advance the maximum size of the vector, but in other
cases I might not. Now, given the fact that I have, say, 2GB of physical
memory, I should not be constrained by the exponential growth policy.

Also, as others have mentioned, the underlying implementation could use
1.5 as the factor, but then again that would cause a lot of unnecessary
copies, especially for an object that is expensive to copy.

But then again, looking at Dave Harris' reply, I'm not quite convinced
about the way I'm using vector. Maybe it isn't the right approach?

Regards,
-Dhruv.


Author: Michael.Karcher@writeme.com (Michael Karcher)
Date: Fri, 5 Nov 2004 15:52:14 GMT
"Dhruv Matani" <dhruvbird@gmx.net> wrote:
>> case. If you use malloc, then you get access to realloc, or you should
>> perhaps be using a custom allocator, or redesigning your algorithm,
>> perhaps to allow a segmented container of some kind, or to use less
>> RAM.
[I don't use multimap because...]
> 2. Every insertion involves touching a lot of data, so in places where
> data is coming in at a very fast rate, this is unacceptable.

If data is coming in at a very fast rate, why don't you use a deque (a
segmented container, as you were told)?

Michael Karcher


Author: dhruvbird@gmx.net ("Dhruv Matani")
Date: Sat, 6 Nov 2004 18:39:21 GMT
On Fri, 05 Nov 2004 15:52:42 +0000, johnchx wrote:

> dhruvbird@gmx.net ("Dhruv Matani") wrote
>
>> 2. Every insertion involves touching a lot of data, so in places where
>> data is coming in at a very fast rate, this is unacceptable.
>
> If you need *consistently* fast push_back() times -- to keep up with a
> fast input source, for instance -- then vector's amortized-constant
> time probably won't be good enough.  Amortized-constant time is fine
> if you don't mind the application grinding to a halt from time to time
> while reallocation is performed, but it sounds like you may not be
> able to afford it.

You're probably right, but before going one way or the other I would
like to profile and see for myself what the actual situation is like in
practice. I would tend to agree with your observation as of now, though.


>
>> Here, I know in advance the maximum size of the vector, but in other cases
>> I might not.
>
> Your best bet is probably to use vector only if you know the maximum
> size in advance.  If you can reserve() sufficient space up-front, you
> don't have to worry about re-allocation.  If you can't, and you can't
> afford long copy times while receiving input, you'll probably be
> better off with a deque.

Yes, Michael Karcher has mentioned that as well, and it seems like a
nice prospect. Then again, iteration would suffer here; believe me,
deque iteration is quite slow compared to vector's. But I will first
have to actually try both containers and time them before making any
non-naive observations.
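
Just to make the two alternatives concrete, they would look something
like this (illustrative sketch only; the fill functions are made up):

#include <cstddef>
#include <deque>
#include <vector>

// Option 1: the maximum size is known up front -- reserve once, and no
// subsequent push_back will reallocate or copy.
void fill_vector(std::vector<char>& v, std::size_t max_size)
{
    v.reserve(max_size);
    // ... v.push_back(c) as the data arrives ...
}

// Option 2: the maximum size is unknown -- a deque grows one block at a
// time, so push_back never moves the existing elements, at the price of
// segmented (non-contiguous) storage and somewhat slower iteration.
void fill_deque(std::deque<char>& d)
{
    // ... d.push_back(c) as the data arrives ...
}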


>
>> Now, given the fact that I have, say, 2GB of physical memory, I
>> should not be constrained by the exponential growth policy.
>
> Physical memory is probably irrelevant.  The real constraint is more
> likely to be the amount of unfragmented space available in the portion
> of the virtual address space of the process which is earmarked for
> heap allocation.  If you don't absolutely need the entire data
> structure to be in a single contiguous chunk of address space, you'll
> get better results (fewer bad_alloc exceptions, no copies, no
> reallocation delays) from a deque.

Yes.

>
> Of course, it's not hard to implement your "reallocation fallback"
> policy with a wrapper function like this:
>
> template < class T, class A >
> void modified_push_back ( std::vector<T,A>& v, const T& t )
> {
>   try {
>     v.push_back(t);
>   }
>   catch (std::bad_alloc&) {
>     std::vector<T,A> v2;
>     v2.reserve(v.capacity() + 1 );  // or whatever
>     std::copy( v.begin(), v.end(), std::back_inserter(v2) );
>     v.swap(v2);
>     v.push_back(t);  // retry; the freshly reserved capacity has room
>   }
> }

I don't think this would work, for the simple reason that a vector
implementation is not required to allocate ONLY n bytes (memory for n
objects) after a reserve(n) call. The guarantee is that at least n
objects can be inserted (assuming the vector is empty initially) without
reallocation. The vector can do this internally:

template <class T>
void vector<T>::reserve(size_type n)
{
  size_type new_cap = std::max(this->capacity() * 2, n);
  // Normal reallocation stuff, using new_cap.
}

And we would be back to square one.

Regards,
-Dhruv.


Author: dhruvbird@gmx.net ("Dhruv Matani")
Date: Sun, 7 Nov 2004 17:30:55 GMT
On Sat, 06 Nov 2004 18:39:21 +0000, Dhruv Matani wrote:

[snip]...

>> Of course, it's not hard to implement your "reallocation fallback"
>> policy with a wrapper function like this:
>>
>> template < class T, class A >
>> void modified_push_back ( std::vector<T,A>& v, const T& t )
>> {
>>   try {
>>     v.push_back(t);
>>   }
>>   catch (std::bad_alloc&) {

>>     std::vector<T,A> v2; _________[0]

>>     v2.reserve(v.capacity() + 1 );  // or whatever
>>     std::copy( v.begin(), v.end(), std::back_inserter(v2) );
>>     v.swap(v2);
>>   }
>
> I don't think this would work, for the simple reason that a vector
> implementation is not required to allocate ONLY n bytes (memory for n
> objects) after a reserve(n) call. The guarantee is that at least n
> objects can be inserted (assuming the vector is empty initially) without
> reallocation. The vector can do this internally:
>
> template <class T>
> void vector<T>::reserve(size_type n)
> {
>   size_type new_cap = std::max(this->capacity() * 2, n);
>   // Normal reallocation stuff, using new_cap.
> }
>
> And we would be back to square one.

Ah! Sorry, I did not see [0]. I was under the impression that you were
reserving space in the original vector.

Yes, this would work!
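
For completeness, the wrapper would then be used in place of the member
push_back, something like this (sketch; the input functions are made up,
and modified_push_back is the template quoted above):

#include <vector>

bool more_input();   // hypothetical input source, purely for illustration
char next_char();

void read_all(std::vector<char>& cv)
{
    while (more_input())
        modified_push_back(cv, next_char());   // falls back on bad_alloc
}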

Regards,
-Dhruv.


Author: dhruvbird@gmx.net ("Dhruv Matani")
Date: Wed, 27 Oct 2004 03:43:46 GMT
On Tue, 26 Oct 2004 21:18:18 +0000, chris wrote:

> Dhruv Matani wrote:
>> Hello,
>>  I recently had a discussion about whether the exponential growth policy
>> for std::vector<> should be followed strictly no matter what.
>>
>> The case I'm looking at is something like this:
>>
>> const int init_sz = 1.2 * 1024 * 1024 * 1024; // 1.2GB.
>> std::vector<char> cv(init_sz);
>> cv.push_back('a');
>>
> ..
>> Assume that the growth factor is 2.
> ..
>> My argument is that an implementation should be allowed to deviate
>> slightly from the standard to prevent useless exceptions like the one
>> above from being thrown. It should be clear that from the user's point
>> of view, what he sees is that he already has 1.2GB of data, and he is
>> trying to add 1 byte to that! Why, for any reason, should that fail?
>> Assume that the user is in no position to use reserve. Obviously if
>> that is considered, then the whole post is useless ;-)
> If your only aim is to remain standards compliant, then one option would
> be to reduce what you claim the minimum growth factor will be in tight
> memory situations, say to 1.05. If you can't extend the vector by at
> least 5% then is it really worth doing it? In this way you could perform
> better in low memory situations but still be standard compliant.
>
> If you wanted to be really sick, you could claim that your minimum
> growth factor is 1+5*10^(-20), which would mean that on a system with a
> 64-bit address space you would still never have to expand by more than
> 1, but I believe you could still claim to be standards compliant ;)

Yes, but it would obviously be a nonsense implementation as you have
mentioned. I'm looking more at making it allowable by the standard in
such low memory situations to deviate from the exponential growth
policy.

Regards,
-Dhruv.
