Topic: growing a large vector iteratively


Author: "jams.lee" <jams.lee.1@gmail.com>
Date: Sun, 26 Aug 2007 00:27:04 CST
Raw View
===================================== MODERATOR'S COMMENT:
 Please trim moderation footers when replying to posts.


===================================== END OF MODERATOR'S COMMENT
er       :
> On Aug 15, 1:21 pm, Martin Bonner <martinfro...@yahoo.co.uk> wrote:
>> On Aug 12, 8:35 pm, er <erwann.rog...@gmail.com> wrote:> hi,
>>
>>> in the code below, i compare two options i) and ii). i have a feeling
>>> we should use i) if A's size is a smaller than some threshold ii)
>>> otherwise
>>> 1-is this correct?
>> In a very general sense, yes (usually)> 2-what would that threshold be?
>>
>> You need to profile.
>>
>>> 3-more precisely, in defining "A's size" i'm not sure if both member
>>> functions and member variables affect the memory needed to store/copy
>>> an object or only the member variables).
>> Member functions are irrelevent.  The real question is how much work
>> is involved in performing the copy construction?  If A has a member
>> which is a large vector, then copying that could get very expensive
>> (even though sizeof(vector<>) is typically three pointers.)
>>
>> You NEED to profile!
>>
>>> i)
>>> vector<A> v;
>>> while(has_not_converged()){
>>>  //some code that generates a temporary A a;
>>>  v.push_back(a);
>> Note that this ALWAYS invokes one copy constructor for A.  On average,
>> it will invoke one more copy constructor for the resize (assuming
>> doubling.  If the vector capacity grows by 50%, I think that rises to
>> about 2.25).
>>
>> Your optimization is worthwhile once the cost of a few copies of A
>> becomes large compared to the cost of generating the temporary.
>>
>>> };
>>> //typically converges after up to a large, but unknown, number of
>>> loops, such as say 100, 1000
>>> ii)
>>> vector<A*> v;
>>> while(has_not_converged()){
>>>  //some code that generates A* a = new A();
>>>  v.push_back(a);
>>> };
>> ---
>> [ comp.std.c++ is moderated.  To submit articles, try just posting with ]
>> [ your news-reader.  If that fails, use mailto:std-...@ncar.ucar.edu    ]
>> [              --- Please see the FAQ before posting. ---               ]
>> [ FAQ:http://www.comeaucomputing.com/csc/faq.html                     ]
>
> thank you both for these very clear answers.
>
>
>
> ---
> [ comp.std.c++ is moderated.  To submit articles, try just posting with ]
> [ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
> [              --- Please see the FAQ before posting. ---               ]
> [ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]
>
I find this is very userful to me, thanks!

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]





Author: er <erwann.rogard@gmail.com>
Date: Wed, 15 Aug 2007 18:22:29 CST
Raw View
On Aug 15, 1:21 pm, Martin Bonner <martinfro...@yahoo.co.uk> wrote:
> On Aug 12, 8:35 pm, er <erwann.rog...@gmail.com> wrote:> hi,
>
> > in the code below, i compare two options i) and ii). i have a feeling
> > we should use i) if A's size is a smaller than some threshold ii)
> > otherwise
> > 1-is this correct?
>
> In a very general sense, yes (usually)> 2-what would that threshold be?
>
> You need to profile.
>
> > 3-more precisely, in defining "A's size" i'm not sure if both member
> > functions and member variables affect the memory needed to store/copy
> > an object or only the member variables).
>
> Member functions are irrelevent.  The real question is how much work
> is involved in performing the copy construction?  If A has a member
> which is a large vector, then copying that could get very expensive
> (even though sizeof(vector<>) is typically three pointers.)
>
> You NEED to profile!
>
> > i)
> > vector<A> v;
> > while(has_not_converged()){
> >  //some code that generates a temporary A a;
> >  v.push_back(a);
>
> Note that this ALWAYS invokes one copy constructor for A.  On average,
> it will invoke one more copy constructor for the resize (assuming
> doubling.  If the vector capacity grows by 50%, I think that rises to
> about 2.25).
>
> Your optimization is worthwhile once the cost of a few copies of A
> becomes large compared to the cost of generating the temporary.
>
> > };
> > //typically converges after up to a large, but unknown, number of
> > loops, such as say 100, 1000
>
> > ii)
> > vector<A*> v;
> > while(has_not_converged()){
> >  //some code that generates A* a = new A();
> >  v.push_back(a);
> > };
>
> ---
> [ comp.std.c++ is moderated.  To submit articles, try just posting with ]
> [ your news-reader.  If that fails, use mailto:std-...@ncar.ucar.edu    ]
> [              --- Please see the FAQ before posting. ---               ]
> [ FAQ:http://www.comeaucomputing.com/csc/faq.html                     ]

thank you both for these very clear answers.



---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]





Author: Alberto Ganesh Barbati <AlbertoBarbati@libero.it>
Date: Sat, 18 Aug 2007 09:17:11 CST
Raw View
er ha scritto:
> hi,
>
> in the code below, i compare two options i) and ii). i have a feeling
> we should use i) if A's size is a smaller than some threshold ii)
> otherwise
> 1-is this correct?
> 2-what would that threshold be?
> 3-more precisely, in defining "A's size" i'm not sure if both member
> functions and member variables affect the memory needed to store/copy
> an object or only the member variables).
>

In addition to the answers you've already got, you can also consider
using a deque instead of a vector. A deque provides reasonably fast
random access but never requires reallocation.

Anyway: you need to profile ;-)

Ganesh

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]





Author: er <erwann.rogard@gmail.com>
Date: Sun, 12 Aug 2007 13:35:13 CST
Raw View
hi,

in the code below, i compare two options i) and ii). i have a feeling
we should use i) if A's size is a smaller than some threshold ii)
otherwise
1-is this correct?
2-what would that threshold be?
3-more precisely, in defining "A's size" i'm not sure if both member
functions and member variables affect the memory needed to store/copy
an object or only the member variables).

i)
vector<A> v;
while(has_not_converged()){
 //some code that generates a temporary A a;
 v.push_back(a); //a has to be copied//every--time v.capacity() is
reached, reallocation is necessary.
};//typically converges after up to a large, but unknown, number of
loops, such as say 100, 1000

ii)
vector<A*> v;
while(has_not_converged()){
 //some code that generates A* a = new A();
 vector<A*> v.push_back(a);
};//typically converges after up to a large number, but unknown number
of loops, for example 100, 1000

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]





Author: tazmaster@rocketmail.com ("Jim Langston")
Date: Mon, 13 Aug 2007 14:03:07 GMT
Raw View
"er" <erwann.rogard@gmail.com> wrote in message
news:1186940634.122682.226290@b79g2000hse.googlegroups.com...
> hi,
>
> in the code below, i compare two options i) and ii). i have a feeling
> we should use i) if A's size is a smaller than some threshold ii)
> otherwise
> 1-is this correct?
> 2-what would that threshold be?
> 3-more precisely, in defining "A's size" i'm not sure if both member
> functions and member variables affect the memory needed to store/copy
> an object or only the member variables).
>
> i)
> vector<A> v;
> while(has_not_converged()){
> //some code that generates a temporary A a;
> v.push_back(a); //a has to be copied//every--time v.capacity() is
> reached, reallocation is necessary.
> };//typically converges after up to a large, but unknown, number of
> loops, such as say 100, 1000
>
> ii)
> vector<A*> v;
> while(has_not_converged()){
> //some code that generates A* a = new A();
> vector<A*> v.push_back(a);
> };//typically converges after up to a large number, but unknown number
> of loops, for example 100, 1000

If you pre-estimate the size of the vector and .reserve() that many elements
or more, then there wouldn't be an issue with having to recopy the elements
each time.

Since typically when a container has to grab more memory it will double the
amount of memory needed, even if your estimate is off, there should only be
one copy necessary as the memory amount is effectively doubled.

If you are going to go with ii it is better to use some form of smart
pointer.  I personally use ii in some maps std::map<int, someclass*> only
becuase someclass is not copyable.  I should really use a smart pointer,
however.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]





Author: Martin Bonner <martinfrompi@yahoo.co.uk>
Date: Wed, 15 Aug 2007 11:21:02 CST
Raw View
On Aug 12, 8:35 pm, er <erwann.rog...@gmail.com> wrote:
> hi,
>
> in the code below, i compare two options i) and ii). i have a feeling
> we should use i) if A's size is a smaller than some threshold ii)
> otherwise
> 1-is this correct?
In a very general sense, yes (usually)
> 2-what would that threshold be?
You need to profile.

> 3-more precisely, in defining "A's size" i'm not sure if both member
> functions and member variables affect the memory needed to store/copy
> an object or only the member variables).

Member functions are irrelevent.  The real question is how much work
is involved in performing the copy construction?  If A has a member
which is a large vector, then copying that could get very expensive
(even though sizeof(vector<>) is typically three pointers.)

You NEED to profile!


> i)
> vector<A> v;
> while(has_not_converged()){
>  //some code that generates a temporary A a;
>  v.push_back(a);

Note that this ALWAYS invokes one copy constructor for A.  On average,
it will invoke one more copy constructor for the resize (assuming
doubling.  If the vector capacity grows by 50%, I think that rises to
about 2.25).

Your optimization is worthwhile once the cost of a few copies of A
becomes large compared to the cost of generating the temporary.


> };
> //typically converges after up to a large, but unknown, number of
> loops, such as say 100, 1000
>
> ii)
> vector<A*> v;
> while(has_not_converged()){
>  //some code that generates A* a = new A();
>  v.push_back(a);
> };

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]