Thread

Topic: Proposal on extending STL containers using rvalue allocator extensions

Author: "=?iso-8859-1?q?Ion_Gazta=F1aga?=" <ion_g_m@terra.es>
Date: 30 Jul 2005 17:30:01 GMT Raw View

Hi to all,

 Re-reading the paper N1771 "Impact of the rvalue reference on the
Standard Library" and after reading the impact on the allocator
interface, I've remembered my old idea, discussed with Howard Hinnant
in an old mail regarding allocator swapping and allocators in shared
memory, that we should be able to insert values in STL containers from
convertible types without temporaries. According to N1771:

20.1.5 - Allocator requirements

Add rows to the Descriptive variable definitions:

-> V: A type convertible to T

This extension allows two major optimizations:

1. The number of needed allocator instances in node containers can be
reduced

With current STL allocator interface, a std::list implementation that
does not suppose allocator::pointer to be a raw pointer needs up to 3
allocators:

a) If pointer is not supposed to be a POD: one to allocate the node,
one to construct the next/previous pointers and one to construct the
value. Dinkum STL uses this approach

b) If we suppose pointer is POD: one to allocate the node and one to
construct the value. Metrowerks uses this aproach.

Even with empty allocators, like std::allocator, we have a space
overhead in the container, because in some systems, multiple
inheritance from empty bases does not lead to EBO with all base
classes. With allocators with members, this is even worse. Even if we
have an allocator called my_allocator that has only a pointer as a
member, std::list <int, my_allocator<int> > is significantly bigger
that std::list <int, std::allocator<int> >.

However, with the new interface, we need only one allocator just to
allocate and construct nodes, since we could construct the node from
the value as argument if we define a node constructor taking a rvalue
of value_type as argument or taking a structure of references to value
and pointers. All this thanks to the template parameter U:

template <class U> void construct(pointer p, U&& val);

This is an internal optimization that does not affect users, but will
be very useful to container designers. So we don't need to change any
interface to implement this optimization.


2. We can construct container values from convertible types

This new allocator interface allows the interface of STL containers to
be extended to eliminate temporaries and change what I consider a major
interface annoyance.

Consider the vector:

 std::vector<std::string> myvector

If I want to insert a new value as the last element, I can write:

 //mystring is std::string
 myvector.push_back(mystring);

Perfect. But the problem is that many times the string is not
std::string, but a const char *, so with current STL I must write:

 //my_const_char_ptr is const char *
 myvector.push_back(std::string(my_const_char_ptr));

So I have to create a temporary which is very inneficient: an extra
memory allocation and deallocation to insert a value. With the rvalue
reference changes, the overhead dissappears since:

 void push_back(T&& x);

moves the temporary to the container, which is fine. But if the
value_type constructor is more complicated the explicit temporary
creation is ugly.

Now consider that we have a vector of const char * values called
my_ptr_vector. To avoid  temporaries with the proposed vector interface
I have two possibilities:

a) Use push_back

 //my_ptr_vector is of type std::vector<const char *>
 auto it = my_ptr_vector.begin(), itend = my_ptr_vector.end();

 for(; it != itend; ++it){
    myvector.push_back(std::string(*it));
 }

 This avoids temporaries but the insertion is not optimized. The vector
can reallocate internal memory several times during the for loop.

b) Construct a vector of std::string and move it to the target:

    template <class InputIterator>
        void insert(iterator position,
                    InputIterator first,
                    InputIterator last);
        // CC and CA only if InputIterator returns lvalue

 //my_ptr_vector is of type std::vector<const char *>
 std::vector<std::string> aux_vector;
 auto it = my_ptr_vector.begin(), itend = my_ptr_vector.end();

 //Reserve memory in vector to avoid reallocations
 aux_vector.reserve(my_ptr_vector.size());

 for(; it != itend; ++it){
    aux_vector.push_back(std::string(*it));
 }

 std::move_iterator<std::vector<std::string>::iterator>
    move_beg (aux_vector.begin()), move_end(aux_vector.end());

 my_vector.insert(my_vector.end(), move_beg, move_end);

There is no extra std::string construction, but we have to allocate an
auxiliary vector. So it is clear we can't efficiently construct a
container of std::string from a container of const char *.

We could consider changing the interface of STL containers so that, for
example:

 void push_back(T&& x);
 iterator insert(iterator position, T&& x);

become:

 template <class Convertible>
    void push_back(Convertible&& x);
 template <class Convertible>
    iterator insert(iterator position, Convertible &&x);

and make these functions' InputIterator

 template <class InputIterator>
 void insert(iterator position,
                InputIterator first,
                InputIterator last);
    // CC and CA only if InputIterator returns lvalue

 template <class InputIterator>
    vector(InputIterator first, InputIterator last,
           const Allocator& = Allocator());
    // CC only if InputIterator returns lvalue

 template <class InputIterator>
    void assign(InputIterator first,
             InputIterator last);
    // CC and CA only if InputIterator returns lvalue


to have a typename InputIterator::value_type to be a convertible type
to container::value_type. The examples would be simply:

a) Single value

 //my_const_char_ptr is const char *
 //const char *object is mantained until allocator::construct
 //function constructs std::string from const_char_ptr
 myvector.push_back(my_const_char_ptr);


b) Multiple values

 //my_ptr_vector is of type std::vector<const char *>
 auto it = my_ptr_vector.begin(), itend = my_ptr_vector.end();

 //const char *objects are mantained until allocator::construct
 //function constructs std::string from const_char_ptr
 my_vector.insert(my_vector.end(), it, itend);

with no explicit temporaries or named temporary with std::moves cast.

However, with associative containers this is more complicated, since
the key_type is used to find the position. If we change the the
function in std::set:

 pair<iterator,bool> insert(value_type&& x);

to this one:

 template <class Convertible>
    pair<iterator,bool> insert(Convertible&& x);

We have to create a temporary container::value_type inside the function
from const Convertible &x since we need a container::value_type to
compare it with other values in the std::set. However, we could "move"
this temporary value to the target node.

In a std::map we can convert the mapped_type, so this:

 pair<iterator, bool> insert(pair<key_type,mapped_type>&& x);

can be converted to:

 template <class ConvertibleKey, class ConvertibleMapped>
    pair<iterator, bool>
insert(pair<ConvertibleKey,ConvertibleMapped>&& x);

We can create a temporary key_type and move it, while the mapped type
can be directly converted to the target type. We can even propose a
better syntax, separating key and mapped:

 template <class ConvertibleKey, class ConvertibleMapped>
    pair<iterator, bool> insert_from(ConvertibleKey &&key,
ConvertibleMapped
&& mapped);

so this way we can move both key and mapped if both are movable to
target or convert both types (the key would be first converted to a
temporary key_type to find its place and after that, moved to the
target, the mapped type can be directly converted).

What to you think about this? I think this is an interesting feature
that can simplify a lot user interface and obtain a performance boost
at the same time. I think it fits best to non-associative containers.
Do you think a more formal proposal should be written to address all
the changes it would require from the standard containers? I'm ready to
work a bit on this and even to write some implementation-tester
containers (help is welcome, because I would need access to a rvalue
capable compiler, at least until gcc implements this, or begin
implementing without rvalues but changing STL allocator construct
function to a template function).

I think that move semantics + reallocation protocol + convertible types
combo would provide simply an impressive performance boost to standard
containers, eliminating virtually all possible temporaries, and
simplifying use interface.

Regards,

Ion

___________________________________

Shmem library, a Boost pre-candidate library about shared memory,
process-shared synchronization objects, STL compatible
containers/allocators in shared memory and more:

Library download:

http://boost-sandbox.sourceforge.net/vault/index.php?direction=0&order=&directory=Shmem

Online documentation:

http://ice.prohosting.com/newfunk/boost/libs/shmem/index.html

___________________________________

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]

Author: hinnant@metrowerks.com (Howard Hinnant)
Date: Sun, 31 Jul 2005 22:28:47 GMT Raw View

In article <1122714356.850005.220100@f14g2000cwb.googlegroups.com>,
 "Ion Gazta   aga" <ion_g_m@terra.es> wrote:

> 20.1.5 - Allocator requirements
>
> Add rows to the Descriptive variable definitions:
>
> -> V: A type convertible to T
>
> This extension allows two major optimizations:
>
> 1. The number of needed allocator instances in node containers can be
> reduced

<snip>

> This is an internal optimization that does not affect users, but will
> be very useful to container designers. So we don't need to change any
> interface to implement this optimization.

This is a very astute observation, and one I had missed until Ion
pointed it out to me several months ago.  This will benefit all of the
Freescale node-based containers when dealing with stateful allocators.

> 2. We can construct container values from convertible types

<snip>

>  template <class Convertible>
>     void push_back(Convertible&& x);
>  template <class Convertible>
>     iterator insert(iterator position, Convertible &&x);

I believe this will be allowed already just due to the new
allocator::construct overload plus the existing 17.4.4.4p2 which
explicitly allows member overloads.  Since these overloads will carry
the same requirements and semantics as the non-templated variety their
presence will only be detectable by counting value_type copy or move
constructions.

It isn't definite that these functions would benefit an implementation.
While they might eliminate a move construction, they also might increase
code size in doing so.  Perhaps an implementation could include them
when optimized for speed, and exclude them when optimized for size.

> and make these functions' InputIterator
>
>  template <class InputIterator>
>  void insert(iterator position,
>                 InputIterator first,
>                 InputIterator last);
>     // CC and CA only if InputIterator returns lvalue
>
>  template <class InputIterator>
>     vector(InputIterator first, InputIterator last,
>            const Allocator& = Allocator());
>     // CC only if InputIterator returns lvalue
>
>  template <class InputIterator>
>     void assign(InputIterator first,
>              InputIterator last);
>     // CC and CA only if InputIterator returns lvalue
>
>
> to have a typename InputIterator::value_type to be a convertible type
> to container::value_type. The examples would be simply:

This is already true (has been since C++98).

> What to you think about this?

Except for insert_from, I'm sold. :-)  But I think we already have
permission to add the member template overloads and might should stop
short of requiring them.  The elimination of an extra move construction
may or may not end up being sufficiently compelling.

-Howard

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]

Author: "=?iso-8859-1?q?Ion_Gazta=F1aga?=" <ion_g_m@terra.es>
Date: Mon, 1 Aug 2005 12:23:49 CST Raw View

> It isn't definite that these functions would benefit an implementation.
> While they might eliminate a move construction, they also might increase
> code size in doing so.  Perhaps an implementation could include them
> when optimized for speed, and exclude them when optimized for size.

It's true that we should be careful and try to share code between
instances of templated functions, so that the only difference when
constructing from a related type is just when constructing values.
Memory allocation, moves, copies... should be written in a common
functions so that we don't suffer a template bloat. I will try to do
some experiments with vector. Thanks for pointing it out.

> > to have a typename InputIterator::value_type to be a convertible type
> > to container::value_type. The examples would be simply:
>
> This is already true (has been since C++98).

Sorry for my ignorance.

> Except for insert_from, I'm sold. :-)

This is, in my opinion, a very handful function because you almost
never have the key and the mapped in a pair to insert it in the map.
And constructing one for each insertion is very ugly.

> But I think we already have
> permission to add the member template overloads and might should stop
> short of requiring them.  The elimination of an extra move construction
> may or may not end up being sufficiently compelling.

Yes, move constructors eliminate a lot of overhead so insertions in
vectors will be very fast and node containers really can use insert()
without overhead, since each value to be inserted does not lead to
reallocations+moves. Only vectors and deques (and assoc_vector, which
obviously will benefit a lot from move semantics) can optimize
insertions checking the distance between source begin and end iterator,
to make just a reallocation and a move (or memcpy). In this case, if
iterator::value_type can be a convertible type we can save (source_end
- source_begin) moves when inserting in the middle of the vector and an
auxiliary vector allocation.

Anyway, I will try to implement some ideas to see if we can have any
performance benefit, since obviously we won't win as much as with move
semantics

Thanks for the reply,

Ion

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]

Author: hinnant@metrowerks.com (Howard Hinnant)
Date: Mon, 1 Aug 2005 22:08:05 GMT Raw View

In article <1122887573.925456.168370@g43g2000cwa.googlegroups.com>,
 "Ion Gazta   aga" <ion_g_m@terra.es> wrote:

> Yes, move constructors eliminate a lot of overhead so insertions in
> vectors will be very fast and node containers really can use insert()
> without overhead, since each value to be inserted does not lead to
> reallocations+moves. Only vectors and deques (and assoc_vector, which
> obviously will benefit a lot from move semantics) can optimize
> insertions checking the distance between source begin and end iterator,
> to make just a reallocation and a move (or memcpy). In this case, if
> iterator::value_type can be a convertible type we can save (source_end
> - source_begin) moves when inserting in the middle of the vector and an
> auxiliary vector allocation.

Since your last post, I've found out that the way map and multimap are
listed in N1771 is incorrect.  I'm correcting it for the proposed
wording paper I'm working on for this Fall.  Specifically, this:

pair<iterator, bool> insert(const value_type& x);
pair<iterator, bool> insert(const pair<key_type,mapped_type>& x);
pair<iterator, bool> insert(pair<key_type,mapped_type>&& x);
iterator insert(iterator position, const value_type& x);
iterator insert(iterator position, const pair<key_type,mapped_type>& x);
iterator insert(iterator position, pair<key_type,mapped_type>&& x);

Needs to change to:

pair<iterator, bool> insert(const value_type& x);
template <class P> pair<iterator, bool> insert(P&& x);
iterator             insert(iterator position, const value_type& x);
template <class P> iterator insert(iterator position, P&& x);

Similar to as you suggested! :-)  (and similarly for multimap)

The problem with the way I had it in N1771 is that this:

typedef std::pair<convertible_to_key_type, convertible_to_map_type> P;
std::map<key_type, mapped_type> m;
m.insert(P());

is ambiguous!  Should it convert to pair<key_type, mapped_type> or to
pair<const key_type, mapped_type>?

So the member template must catch inserts such as this and forward
appropriately.

The other possibility (as you suggested) is instead adding:

template <class ConvertibleKey, class ConvertibleMapped>
   pair<iterator, bool>
insert(pair<ConvertibleKey,ConvertibleMapped>&& x);

I haven't prototyped this way yet, but I'm worried that it might also
cause an ambiguity with the existing:

pair<iterator, bool> insert(const value_type& x);

when the argument is a tuple (which is convertible to pair<key_type,
mapped_type>).

set and the sequences don't have this issue only because of the lack of
a const value_type.

On insert_from:

m.insert_from(x, y);

vs:

m.insert(std::make_pair(x, y));

?

<shrug>  Would you also suggest an insert_from overload taking a hint
iterator?  I'm concerned about unnecessarily bloating the interface
(like string).

> Anyway, I will try to implement some ideas to see if we can have any
> performance benefit, since obviously we won't win as much as with move
> semantics

I'm willing to run demoes for you if that would help.

-Howard

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]

Author: "=?iso-8859-1?q?Ion_Gazta=F1aga?=" <ion_g_m@terra.es>
Date: 2 Aug 2005 17:00:05 GMT Raw View

> pair<iterator, bool> insert(const value_type& x);
> template <class P> pair<iterator, bool> insert(P&& x);
> iterator             insert(iterator position, const value_type& x);
> template <class P> iterator insert(iterator position, P&& x);
>
> Similar to as you suggested! :-)  (and similarly for multimap)

Looks good.

> The other possibility (as you suggested) is instead adding:
>
> template <class ConvertibleKey, class ConvertibleMapped>
>    pair<iterator, bool>
> insert(pair<ConvertibleKey,ConvertibleMapped>&& x);

template <class P> pair<iterator, bool> insert(P&& x);

seems more general, since P can be any type, but which are the
requirements for P? I suppose it should be a type with "first" and
"second" members and "first" should be convertible to "key_type" and
"second" to "mapped_type"? Or "first" must be some of the following:
(const) key_type / (const) key_type &/ key_type && and ditto for
"second"?

> On insert_from:
>
> m.insert_from(x, y);
> ...
> <shrug>  Would you also suggest an insert_from overload taking a hint
> iterator?  I'm concerned about unnecessarily bloating the interface
> (like string).

Ummm. Obviously it would be interesting to have that form too, since I
wouldn't want to spoil a "move" I save with log(N) string comparisons
if I know where it should go! Off topic: And I don't like when
insert(it, value) checks if it is the correct position, I know this is
safe, but the extra 2 comparisons hurt my optimization-paranoia.

> I'm willing to run demoes for you if that would help.

First I will try to add the insert_from() call and after that, try to
build from convertible types modifying allocator construct to a
template until I get access to a move-capable compiler. I plan to use
the shared memory containers I use for Shmem, so I have to think how to
mantain compatibility with all this stuff. After that we can try to add
move semantics to see how this works. Maybe it's time in Boost to have
BOOST_HAS_RVALUE_REFERENCE workaround?

Regards,

Ion

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]

Author: hinnant@metrowerks.com (Howard Hinnant)
Date: Tue, 2 Aug 2005 17:46:46 GMT Raw View

In article <1122971427.528218.22130@f14g2000cwb.googlegroups.com>,
 "Ion Gazta   aga" <ion_g_m@terra.es> wrote:

> template <class P> pair<iterator, bool> insert(P&& x);
>
> seems more general, since P can be any type, but which are the
> requirements for P? I suppose it should be a type with "first" and
> "second" members and "first" should be convertible to "key_type" and
> "second" to "mapped_type"? Or "first" must be some of the following:
> (const) key_type / (const) key_type &/ key_type && and ditto for
> "second"?

I believe this wording is sufficient:

> The signatures taking a P&& parameter require P to be convertible to
> value_type. If P is instantiated as a reference type, then the argument x is
> copied from. Otherwise x is considered to be an rvalue as it is converted to
> value_type and inserted into the map. Specifically, in such cases
> CopyConstructible is not required of key_type or mapped_type unless the
> conversion from P specifically requires it (e.g. if P is a tuple<const
> key_type, mapped_type>, then key_type must be CopyConstructible).

However I struggled with this paragraph and others are bound to improve
it.

-Howard

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]