Thread

Topic: std::string and const char[N]

Author: gdr@cs.tamu.edu (Gabriel Dos Reis)
Date: Fri, 14 May 2004 15:02:47 +0000 (UTC)

daniel.frey@aixigo.de (Daniel Frey) writes:

[...]

| When the user passes a const char[N], it is promoted to const char*
| and everything works. Indeed, when I write
|
| std::string s( "Hello, world!" );
|
| I'm converting the type information passed from a const char[14] to
| const char*. That means that I'm losing the information that the
| buffer pointed to has a known length. This is not necessary, I have a
| sample program which uses the length for several purposes. The
| information on the buffer's length allows me to do these things:
|
| - Detecting the empty string "" for optimization purposes (especially
| when comparing against "" this is now equally efficient as .empty() or
| when constructing an empty string).

[...]

| What currently isn't standard conformant and what I would like to see
| changed is: When constructing a string from a const char* which
| actually is a const char[N], it is not guaranteed that the size of the
| constructed string is < N. In other words: The standard doesn't
| require the \0 which marks the string's end to be within the bounds of
| the buffer.

There is a core language proposal to add a "sequence constructor" to the
language.   If accepted, it would remove the embarrassment you're
talking about.

--
                                                        Gabriel Dos Reis
                                                         gdr@cs.tamu.edu
  Texas A&M University -- Computer Science Department
 301, Bright Building -- College Station, TX 77843-3112

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]

Author: daniel.frey@aixigo.de (Daniel Frey)
Date: Fri, 7 May 2004 16:25:08 +0000 (UTC)

Ron Natalie wrote:
> "Daniel Frey" <daniel.frey@aixigo.de> wrote in message news:c7cvik$o5t$1@swifty.westend.com...
>
>>I'm converting the type information passed from a const char[14] to
>>const char*.
>
> You can pass the length.   There is a constructor that takes both the character
> pointer and the length.
>     std::string s(array, sizeof array / sizeof array[0]);

<irony>
I guess that must be the reason why people when given this function:

   void f( const std::string& s );

never write

   f( "Hello, world!" );

and instead always use the convenient and short replacement

   f( std::string( "Hello, world!", 13 ) );

to call it (including the minimal time to count the characters), right?
</irony>

>>- Detecting the empty string "" for optimization purposes (especially
>>when comparing against "" this is now equally efficient as .empty() or
>>when constructing an empty string).
>
>
> I'm not even sure what this has to do with the issue.   It's trivially easy to check
> for the empty string in a char* passed in, just check the first byte for nullness.

In fact, you need to access the memory pointed to once to check for the
\0. My solution doesn't require that, only the string object itself is
involved. And memory bandwidth is a problem on today's architectures, and
a useless memory access is IMHO to be avoided. In fact my boss insists
on writing std::string() instead of "" everywhere. IMHO this reads
horribly and bloats the code. (Bloat here means it literally bloats the
source that I have to read).

>>- Faster construction even for non-empty strings: No scanning for the
>>length first, we know beforehand the size of the buffer we need to
>>reserve and can copy the content in a single run, no double loop like today.
>
>
> And right you are, that's one of the reasons they allow you to provide the length to
> the constructor if you know it.

But people are too lazy. And I even like that, I don't see why you
shouldn't allow them to be lazy given that you can still make the result
efficient and safe by some transparent library extensions.

>>- Detecting buffer overflows: Consider this small example:
>>
>>std::string asString( const int value )
>>{
>>  char buffer[ 16 ];
>>  sprintf( buffer, "%d", value );
>>  return buffer;
>>}
>>
>>whether or not this is a good implementation, it is a common code
>>fragment I've often seen (with various lengths for the buffer :). In the
>>above example I can detect in the string ctor when the resulting string
>>is 16 characters or longer.
>
>
> Say what?   How are you going to determine that?   If the sprintf writes off the
> end of the buffer, undefined behavior has already occurred.   No test you can do
> further will help.

In theory, you are right. In practice, OTOH, you can catch some of the
problems. Of course you need to fix the code anyway and there are cases
which I can't detect but a "safe" implementation of string's ctor like I
proposed will improve the current situation, where things go unnoticed
today.

>>What currently isn't standard conformant and what I would like to see
>>changed is: When constructing a string from a const char* which actually
>>is a const char[N], it is not guaranteed that the size of the
>>constructed string is < N. In other words: The standard doesn't require
>>the \0 which marks the string's end to be within the bounds of the buffer.
>
> Std::strings know nothing about \0 in general.   They can have embedded \0 and
> \0 is NOT necessary to terminate std::strings.   The only time it cares about null
> characters is in the few cases when it is converting back and forth to charT*,
> when it needs the \0 to detect the length.

Maybe bad wording on my side above. The \0 obviously doesn't mark
std::string's end, but the "string"'s end. By this, I meant the end of
the buffer pointed to by the const char*. And this is the only change
that I would like to make: Where a const char* is passed today, I want
the standard to guarantee that when the real type is const char[N], that
the size of a std::string constructed from such a buffer/pointer must be <N.

Regards, Daniel

--
Daniel Frey

aixigo AG - financial solutions & technology
Schloß-Rahe-Straße 15, 52072 Aachen, Germany
fon: +49 (0)241 936737-42, fax: +49 (0)241 936737-99
eMail: daniel.frey@aixigo.de, web: http://www.aixigo.de


Author: daniel.frey@aixigo.de (Daniel Frey)
Date: Fri, 7 May 2004 16:50:35 +0000 (UTC)

Bo Persson wrote:
> But this is more of a QoI issue than a library problem. If the
> constructor is inlined, and uses an intrinsic for strlen(), the info can
> be retained and even propagated to another string.

> Should we change the library because some compilers have problems
> optimizing the code?

Good point. I have to admit that I tend to see things from the
language's point of view. And sometimes compilers are able to do
optimizations that rely on information that is only kept in the
background. The N of const char[N] is a good example, as the language
itself doesn't pass it to std::string::string( const char* ), but the
compiler does. Thanks for pointing this out.

Regards, Daniel

--
Daniel Frey

aixigo AG - financial solutions & technology
Schloß-Rahe-Straße 15, 52072 Aachen, Germany
fon: +49 (0)241 936737-42, fax: +49 (0)241 936737-99
eMail: daniel.frey@aixigo.de, web: http://www.aixigo.de

Author: nevin@spies.com ("Nevin \":-]\" Liber")
Date: Fri, 7 May 2004 17:54:36 +0000 (UTC)

In article <c7cvik$o5t$1@swifty.westend.com>,
 daniel.frey@aixigo.de (Daniel Frey) wrote:

> When the user passes a const char[N], it is promoted to const char* and
> everything works. Indeed, when I write
>
> std::string s( "Hello, world!" );
>
> I'm converting the type information passed from a const char[14] to
> const char*. That means that I'm losing the information that the buffer
> pointed to has a known length.

The buffer has a known length, but that has always been strictly greater
than the strlen() of the string.

> This is not necessary, I have a sample
> program which uses the length for several purposes. The information on
> the buffer's length allows me to do these things:
>
> - Faster construction even for non-empty strings: No scanning for the
> length first, we know beforehand the size of the buffer we need to
> reserve and can copy the content in a single run, no double loop like today.

It won't do what you think it should do.

When converting a string literal to a std::string, there are two
choices:  either you want to consider it as a string with strlen()
number of characters, or you want to consider it as an array of (at
least, in case the literal has embedded '\0's) strlen()+1 characters.

The former, which is what std::string does now, is the least number of
surprises.
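
The two readings can be compared directly. A minimal sketch, in which the
helper names as_c_string and as_char_array are made up for illustration; the
readings only differ once a literal contains an embedded '\0':

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Reading 1: the literal as a C string, ending at the first '\0'.
template <std::size_t N>
std::string as_c_string(const char (&lit)[N])
{
    return std::string(lit);
}

// Reading 2: the literal as an array of N characters, minus the final '\0'.
template <std::size_t N>
std::string as_char_array(const char (&lit)[N])
{
    return std::string(lit, N - 1);
}

// Small self-check of the two readings.
inline void check_two_views()
{
    // "ab\0cd" has type const char[6]: a, b, '\0', c, d, '\0'.
    assert(as_c_string("ab\0cd").size() == 2);
    assert(as_char_array("ab\0cd").size() == 5);

    // Without an embedded '\0' both readings give the same string.
    assert(as_c_string("abcd") == as_char_array("abcd"));

    // With one, they are different strings.
    assert(as_c_string("ab\0cd") != as_char_array("ab\0cd"));
}
```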

> The biggest drawback is that we might reserve too much space, but that
> is standard conformant - capacity can be larger than size without a
> problem.

Not true.  The size() of the string, as it does for any container,
matters.  Unless all your code is just using std::string as a convenient
container while all your interfaces use C style strings, you really do
need size() to be well defined.

> Code which would suffer from this could easily be fixed by
> explicitly providing a const char* instead of const char[N], to force
> the "old" behaviour.

It would be hard to get the rules right for determining the "best"
constructor to call in the presence of both pointers and (templated)
array references.  Take the following example:

#include <cstring>
#include <iostream>

struct StringSize
{
    unsigned    size_;

    // Pointer overload: length up to the first '\0'.
    StringSize(const char * s) : size_(std::strlen(s)) {}
    // Array overload: N counts every element, including the terminator.
    template<unsigned N> StringSize(const char (&)[N]) : size_(N) {}

    operator unsigned() const { return size_; }
};

int main()
{
   // const_cast stands in for the deprecated literal-to-char* conversion.
   char* CharStar = const_cast<char*>("");
   char const* CharConstStar = "";
   char CharArray[] = "";
   char const CharConstArray[] = "";

   std::cout << "StringSize(CharStar)==" << StringSize(CharStar) << std::endl;
   std::cout << "StringSize(CharConstStar)==" << StringSize(CharConstStar) << std::endl;
   std::cout << "StringSize(CharArray)==" << StringSize(CharArray) << std::endl;
   std::cout << "StringSize(CharConstArray)==" << StringSize(CharConstArray) << std::endl;
   std::cout << "StringSize(\"\")==" << StringSize("") << std::endl;

   return 0;
}

Produces the following output with Metrowerks 9:

StringSize(CharStar)==0
StringSize(CharConstStar)==0
StringSize(CharArray)==1
StringSize(CharConstArray)==0
StringSize("")==0

Adding the following two constructors to StringSize:

   StringSize(char * s) : size_(strlen(s)) {}
   template<unsigned N> StringSize(char (&)[N]) : size_(N) {}

And even the CharArray returns 0, which is pretty much the same behavior
I would get if I didn't have all those templated array reference
constructors.

How would you declare the constructors for a string class to have it do
what you expect?
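
One possible answer, as a sketch only and using C++11 <type_traits>, which
was not available when this thread was written: deduce the pointer overload
through a reference so that arrays do not decay, and disable it for anything
that is not a pointer. LiteralSize and its members are hypothetical names,
not a proposed basic_string interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

struct LiteralSize
{
    std::size_t size_;
    bool        from_array_;

    // Chosen for every char array, const or not; N includes the terminator.
    template <std::size_t N>
    LiteralSize(const char (&)[N]) : size_(N - 1), from_array_(true) {}

    // Chosen only when the argument's own type is a pointer. P is deduced
    // through a reference, so an array argument keeps its array type here
    // and enable_if removes this overload for it.
    template <typename P,
              typename = typename std::enable_if<
                  std::is_pointer<P>::value &&
                  std::is_convertible<P, const char*>::value>::type>
    LiteralSize(const P& s) : size_(std::strlen(s)), from_array_(false) {}
};

// Small self-check of which overload is selected.
inline void check_overloads()
{
    const char* pointer = "abc";
    char        array[] = "xy";

    assert(LiteralSize("abc").from_array_);      // literal: array overload
    assert(LiteralSize("abc").size_ == 3);
    assert(!LiteralSize(pointer).from_array_);   // pointer: strlen() overload
    assert(LiteralSize(pointer).size_ == 3);
    assert(LiteralSize(array).from_array_);      // non-const array: array overload
    assert(LiteralSize(array).size_ == 2);
}
```

With these declarations a literal and a char array both pick the array
overload, while a variable of pointer type picks the strlen() overload. A
partly filled buffer such as char buffer[16] would also be measured as 15
characters, which is the surprise described above, so this only shows that
the overload set can be declared, not that the resulting behavior is
desirable.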

> What currently isn't standard conformant and what I would like to see
> changed is: When constructing a string from a const char* which actually
> is a const char[N], it is not guaranteed that the size of the
> constructed string is < N. In other words: The standard doesn't require
>   the \0 which marks the string's end to be within the bounds of the buffer.

Which section of the standard are you referring to?  Also, all the
implementations that I know of don't include anything from a '\0' on.

> IMHO this is unfortunate and most probably wasn't intended (the
> standard's authors may correct me if I'm wrong :). We should therefore
> consider extending the standard to make this guarantee.

The guarantee, if it isn't there, would be not to include anything from
a '\0' onwards for the sake of compatibility.

Author: daniel.frey@aixigo.de (Daniel Frey)
Date: Fri, 7 May 2004 17:55:22 +0000 (UTC)

Stefan Heinzmann wrote:
> Daniel Frey wrote:
>
> [exposé snipped with proposal to add string construction from C-style
> array]
>
> So you propose to add a constructor template to std::basic_string<T>
> that behaves like this:
>
> template<unsigned n>
> basic_string(const T(&array)[n])
> {
>     reserve(n);
>     append(array);
> }
>
> Is this correct?

Basically, yes. But the actual implementation can be more complicated
but is IMHO irrelevant to understand and discuss the general idea and
its benefits and problems.

Regards, Daniel

--
Daniel Frey

aixigo AG - financial solutions & technology
Schloß-Rahe-Straße 15, 52072 Aachen, Germany
fon: +49 (0)241 936737-42, fax: +49 (0)241 936737-99
eMail: daniel.frey@aixigo.de, web: http://www.aixigo.de


Author: magfr@comp.std.cpp.magfr.user.lysator.liu.se (Magnus Fromreide)
Date: Sun, 9 May 2004 03:28:19 +0000 (UTC)

> > So you propose to add a constructor template to std::basic_string<T>
> > that behaves like this:
> > template<unsigned n>
> > basic_string(const T(&array)[n])
> > {
> >     reserve(n);
> >     append(array);

I'd guess you meant

        append(array, n - 1);

here since the point was to use the knowledge about n.

> > }
> > Is this correct?
>
> Basically, yes. But the actual implementation can be more complicated
> but is IMHO irrelevant to understand and discuss the general idea and
> its benefits and problems.

I'd really like to see that one as well but in order to get there you
have to change the language since

basic_string(const T*)

is a better match to the call to string("foo").

Author: stefan_heinzmann@yahoo.com (Stefan Heinzmann)
Date: Mon, 10 May 2004 18:14:32 +0000 (UTC)

Magnus Fromreide wrote:

>>>So you propose to add a constructor template to std::basic_string<T>
>>>that behaves like this:
>>>
>>>template<unsigned n>
>>>basic_string(const T(&array)[n])
>>>{
>>>    reserve(n);
>>>    append(array);
>
>
> I'd guess you meant
>
>         append(array, n - 1);
>
> here since the point was to use the knowledge about n.

No. I used the single argument version on purpose, because I wanted to
copy the string only up to the first null character, wherever it appears
in the array. But I'm not sure which version the OP would have
preferred, hence my question.

>>>}
>>>Is this correct?
>>
>>Basically, yes. But the actual implementation can be more complicated
>>but is IMHO irrelevant to understand and discuss the general idea and
>>its benefits and problems.
>
>
> I'd really like to see that one as well but in order to get there you
> have to change the language since
>
> basic_string(const T*)
>
> is a better match to the call to string("foo").

Is it? Why?

--
Cheers
Stefan

Author: d.frey@gmx.de (Daniel Frey)
Date: Mon, 10 May 2004 21:39:58 +0000 (UTC)

Stefan Heinzmann wrote:
> Magnus Fromreide wrote:
>
>>>> template<unsigned n>
>>>> basic_string(const T(&array)[n])
>>>> {
>>>>    reserve(n);
>>>>    append(array);
>>
>>
>>
>> I'd guess you meant
>>
>>         append(array, n - 1);
>>
>> here since the point was to use the knowledge about n.
>
>
> No. I used the single argument version on purpose, because I wanted to
> copy the string only up to the first null character, wherever it appears
> in the array. But I'm not sure which version the OP would have
> preferred, hence my question.

Your implementation is 100% transparent, unlike what I was suggesting,
namely that in your implementation we add

assert( size() < n );

after the append() as anything else is most likely a big problem and
undefined behaviour has already occurred - while useless in theory, it still
seems reasonable in the real world to me. A real STL implementation
will still look different and probably use an initializer list and some
of its internal helpers - but that's not really important here in csc++. :)
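
A bounded variant of the same idea, as a sketch: instead of calling append()
and asserting afterwards, the search for the terminator is limited to the N
elements, so a missing '\0' is reported without reading past the array. The
free function from_array and the choice of std::length_error are assumptions
made for this example, not part of the proposal.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical free-function version of the array constructor discussed
// above. The scan for '\0' never leaves the N elements of the array.
template <std::size_t N>
std::string from_array(const char (&array)[N])
{
    const char* const end = array + N;
    const char* const nul = std::find(array, end, '\0');
    if (nul == end)
        throw std::length_error("from_array: no '\\0' within the array bounds");

    // The length is known now, so the content is copied in a single pass.
    std::string result(array, static_cast<std::size_t>(nul - array));
    assert(result.size() < N);   // the guarantee asked for in this thread
    return result;
}
```

This still cannot detect an overflow that has already clobbered memory
beyond the buffer; it only avoids adding a second out-of-bounds read on top
of it.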

>>> Basically, yes. But the actual implementation can be more complicated
>>> but is IMHO irrelevant to understand and discuss the general idea and
>>> its benefits and problems.
>>
>> I'd really like to see that one as well but in order to get there you
>> have to change the language since
>>
>> basic_string(const T*)
>>
>> is a better match to the call to string("foo").
>
> Is it? Why?

It is, see 4.2 and the referred sections. But you can use:

template< typename T > class basic_string
{
   ...
   basic_string( const T (&)[ 1 ] );
   template< size_t N > basic_string( const T (&rhs)[ N ] );
   template< typename U > basic_string( const U* rhs );
   ...
};

Three different overloads, thus three different implementations: The
first for the empty string (""), the second for known-length-char-arrays
and the third for const char*. You can add some BOOST_STATIC_ASSERT's to
verify that U is actually what you want.

But given the comments from Bo Persson, I think one has to verify on the
assembler level whether or not this makes a difference and really leads
to faster implementations.

And looking at the comments in general, it seems to me that there is
little (no) support for the guarantee I asked for, so the assert() I
mentioned above will remain a contradiction to the standard.

Regards, Daniel

Author: do-not-spam-benh@bwsint.com (Ben Hutchings)
Date: Thu, 13 May 2004 05:44:15 +0000 (UTC)

Daniel Frey wrote:
> Ron Natalie wrote:
>> "Daniel Frey" <daniel.frey@aixigo.de> wrote in message
>> news:c7cvik$o5t$1@swifty.westend.com...
>>
>>>I'm converting the type information passed from a const char[14] to
>>>const char*.
>>
>> You can pass the length.   There is a constructor that takes both the
>> character pointer and the length.
>>     std::string s(array, sizeof array / sizeof array[0]);
>
><irony>

This is not irony but sarcasm.  That might perhaps be meta-ironic.

> I guess that must be the reason why people when given this function:
>
>    void f( const std::string& s );
>
> never write
>
>    f( "Hello, world!" );
>
> and instead always use the convenient and short replacement
>
>    f( std::string( "Hello, world!", 13 ) );
>
> to call it (including the minimal time to count the characters), right?
></irony>
<snip>

There's no need to count characters when sizeof can do it for you:

    #define SL(lit) std::string(lit, sizeof(lit)-1)

Of course it would be better still if the language provided a means of
creating std::string "literals" without the need for separate string
literals in memory or some complex optimisations by implementors to
eliminate them.
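
A function template can play the role of the macro, and C++14, published
well after this thread, added the s literal suffix in namespace
std::string_literals, which receives the literal's length from the compiler.
A minimal sketch; string_from_literal is a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical function-template counterpart of the SL macro: N is deduced
// from the array type, so no characters are counted at run time.
template <std::size_t N>
std::string string_from_literal(const char (&lit)[N])
{
    return std::string(lit, N - 1);   // drop the terminating '\0'
}

// Small self-check comparing the helper with the C++14 literal suffix.
inline void check_literals()
{
    using namespace std::string_literals;   // enables the "..."s suffix

    assert(string_from_literal("Hello, world!").size() == 13);
    assert("Hello, world!"s.size() == 13);
    assert("Hello, world!"s == string_from_literal("Hello, world!"));

    // Both keep embedded '\0' characters, because the length is known.
    assert("ab\0cd"s.size() == 5);
    assert(string_from_literal("ab\0cd").size() == 5);
}
```

Unlike the macro, the function template rejects a plain pointer at compile
time, although it still accepts any char array and not only literals.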

Author: daniel.frey@aixigo.de (Daniel Frey)
Date: Thu, 6 May 2004 17:30:48 +0000 (UTC)

Hello, world!

I have a question on the ctors, assignment and comparison of
std::strings. Currently, the standard allows me to construct a
std::string from a const char*. (Yeah, a basic_string<T> from a const T*
- but I'll refer to string/char for brevity here).

When the user passes a const char[N], it is promoted to const char* and
everything works. Indeed, when I write

std::string s( "Hello, world!" );

I'm converting the type information passed from a const char[14] to
const char*. That means that I'm losing the information that the buffer
pointed to has a known length. This is not necessary, I have a sample
program which uses the length for several purposes. The information on
the buffer's length allows me to do these things:

- Detecting the empty string "" for optimization purposes (especially
when comparing against "" this is now equally efficient as .empty() or
when constructing an empty string).

- Faster construction even for non-empty strings: No scanning for the
length first, we know beforehand the size of the buffer we need to
reserve and can copy the content in a single run, no double loop like today.

- Detecting buffer overflows: Consider this small example:

std::string asString( const int value )
{
   char buffer[ 16 ];
   sprintf( buffer, "%d", value );
   return buffer;
}

whether or not this is a good implementation, it is a common code
fragment I've often seen (with various lengths for the buffer :). In the
above example I can detect in the string ctor when the resulting string
is 16 characters or longer. Although this might cost some time, it can
be used in an STL's safe mode as a quality of implementation issue.
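
For comparison, a sketch of how the overflow itself can be avoided rather
than detected afterwards, assuming a C++11 library where std::snprintf is
available. The function name as_string_checked and the use of
std::length_error are illustrative choices, not something proposed in this
post:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical variant of asString: snprintf never writes past the buffer
// and returns the length it needed, so truncation is detected before any
// overflow can happen.
inline std::string as_string_checked(const int value)
{
    char buffer[16];
    const int needed = std::snprintf(buffer, sizeof buffer, "%d", value);
    if (needed < 0 || static_cast<std::size_t>(needed) >= sizeof buffer)
        throw std::length_error("as_string_checked: buffer too small");

    // On success the terminator sits right after the formatted digits.
    assert(buffer[needed] == '\0');

    // The length is already known, so no scan for '\0' is required.
    return std::string(buffer, static_cast<std::size_t>(needed));
}
```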

The biggest drawback is that we might reserve too much space, but that
is standard conformant - capacity can be larger than size without a
problem. Code which would suffer from this could easily be fixed by
explicitly providing a const char* instead of const char[N], to force
the "old" behaviour.

What currently isn't standard conformant and what I would like to see
changed is: When constructing a string from a const char* which actually
is a const char[N], it is not guaranteed that the size of the
constructed string is < N. In other words: The standard doesn't require
the \0 which marks the string's end to be within the bounds of the buffer.

IMHO this is unfortunate and most probably wasn't intended (the
standard's authors may correct me if I'm wrong :). We should therefore
consider extending the standard to make this guarantee. STLs could then
benefit from this guarantee by generating more efficient and safer code
than today as outlined above.

I'd like to hear your comments and thoughts on this before taking this
issue further. Do you agree that the standard should make that
guarantee? Is it reasonable? Or have I missed some drawbacks?

Regards, Daniel

--
Daniel Frey

aixigo AG - financial solutions & technology
Schloß-Rahe-Straße 15, 52072 Aachen, Germany
fon: +49 (0)241 936737-42, fax: +49 (0)241 936737-99
eMail: daniel.frey@aixigo.de, web: http://www.aixigo.de


Author: ron@sensor.com ("Ron Natalie")
Date: Thu, 6 May 2004 18:35:50 +0000 (UTC)

"Daniel Frey" <daniel.frey@aixigo.de> wrote in message news:c7cvik$o5t$1@swifty.westend.com...

> I'm converting the type information passed from a const char[14] to
> const char*.

You can pass the length.   There is a constructor that takes both the character
pointer and the length.
    std::string s(array, sizeof array / sizeof array[0]);
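
Note that for a string literal the element count includes the terminating
'\0', so this call stores the terminator as a character of the string. A
minimal sketch of the difference; the function names are made up for this
example:

```cpp
#include <cassert>
#include <string>

// Builds the string with the full element count, as in the call above.
inline std::string full_element_count()
{
    const char array[] = "Hello, world!";   // 13 characters + '\0' = 14 elements
    return std::string(array, sizeof array / sizeof array[0]);
}

// Builds the string with the element count minus the terminator.
inline std::string element_count_minus_one()
{
    const char array[] = "Hello, world!";
    return std::string(array, sizeof array / sizeof array[0] - 1);
}

// Small self-check of the two results.
inline void check_element_count()
{
    assert(full_element_count().size() == 14);
    assert(element_count_minus_one().size() == 13);
    assert(element_count_minus_one() == "Hello, world!");
    assert(full_element_count() != "Hello, world!");   // trailing '\0' is stored
}
```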

> - Detecting the empty string "" for optimization purposes (especially
> when comparing against "" this is now equally efficient as .empty() or
> when constructing an empty string).

I'm not even sure what this has to do with the issue.   It's trivially easy to check
for the empty string in a char* passed in, just check the first byte for nullness.

> - Faster construction even for non-empty strings: No scanning for the
> length first, we know beforehand the size of the buffer we need to
> reserve and can copy the content in a single run, no double loop like today.

And right you are, that's one of the reasons they allow you to provide the length to
the constructor if you know it.

> - Detecting buffer overflows: Consider this small example:
>
>std::string asString( const int value )
>{
>   char buffer[ 16 ];
>   sprintf( buffer, "%d", value );
>   return buffer;
>}
>
>whether or not this is a good implementation, it is a common code
>fragment I've often seen (with various lengths for the buffer :). In the
>above example I can detect in the string ctor when the resulting string
>is 16 characters or longer.

Say what?   How are you going to determine that?   If the sprintf writes off the
end of the buffer, undefined behavior has already occurred.   No test you can do
further will help.

> What currently isn't standard conformant and what I would like to see
> changed is: When constructing a string from a const char* which actually
> is a const char[N], it is not guaranteed that the size of the
> constructed string is < N. In other words: The standard doesn't require
 > the \0 which marks the string's end to be within the bounds of the buffer.

Std::strings know nothing about \0 in general.   They can have embedded \0 and
\0 is NOT necessary to terminate std::strings.   The only time it cares about null
characters is in the few cases when it is converting back and forth to charT*,
when it needs the \0 to detect the length.

It could be possible to add a templated constructor to allow the automatic length
determination, but it's not clear what the gain is.

Author: stefan_heinzmann@yahoo.com (Stefan Heinzmann)
Date: Thu, 6 May 2004 18:36:06 +0000 (UTC)

Daniel Frey wrote:

[exposé snipped with proposal to add string construction from C-style array]

So you propose to add a constructor template to std::basic_string<T>
that behaves like this:

template<unsigned n>
basic_string(const T(&array)[n])
{
     reserve(n);
     append(array);
}

Is this correct?

--
Cheers
Stefan

Author: bop@gmb.dk ("Bo Persson")
Date: Thu, 6 May 2004 22:12:52 +0000 (UTC)

"Daniel Frey" <daniel.frey@aixigo.de> wrote in message
news:c7cvik$o5t$1@swifty.westend.com...
> Hello, world!
>
> I have a question on the ctors, assignment and comparison of
> std::strings. Currently, the standard allows me to construct a
> std::string from a const char*. (Yeah, a basic_string<T> from a const T*
> - but I'll refer to string/char for brevity here).
>
> When the user passes a const char[N], it is promoted to const char* and
> everything works. Indeed, when I write
>
> std::string s( "Hello, world!" );
>
> I'm converting the type information passed from a const char[14] to
> const char*. That means that I'm losing the information that the buffer
> pointed to has a known length.

But this is more of a QoI issue than a library problem. If the
constructor is inlined, and uses an intrinsic for strlen(), the info can
be retained and even propagated to another string.

On a good day, and with a tuned library, MSVC 7.1 can produce code like
this:

; 381  :
; 382  :    std::string whatever = "abcd";

  005ef a1 00 00 00 00   mov     eax, DWORD PTR ??_C@_04EHKALCEN@abcd?$AA@
  005f4 b9 17 00 00 00   mov     ecx, 23                        ; 00000017H
  005f9 89 84 24 d8 01
        00 00            mov     DWORD PTR _whatever$[esp+1496], eax
  00600 89 8c 24 f4 01
        00 00            mov     DWORD PTR _whatever$[esp+1524], ecx
  00607 89 b4 24 f0 01
        00 00            mov     DWORD PTR _whatever$[esp+1520], esi
  0060e 88 9c 24 dc 01
        00 00            mov     BYTE PTR _whatever$[esp+1500], bl

; 383  :
; 384  :    std::string whatever2 = whatever;

  00615 89 84 24 98 02
        00 00            mov     DWORD PTR _whatever2$[esp+1496], eax
  0061c 89 8c 24 b4 02
        00 00            mov     DWORD PTR _whatever2$[esp+1524], ecx
  00623 89 b4 24 b0 02
        00 00            mov     DWORD PTR _whatever2$[esp+1520], esi
  0062a 88 9c 24 9c 02
        00 00            mov     BYTE PTR _whatever2$[esp+1500], bl


That's two constructors in 10 machine instructions, using the fluke that
ESI and BL happen to contain useful values from preceding code, but also
that the values are propagated from one string to another.


Should we change the library because some compilers have problems
optimizing the code?


Bo Persson

