Topic: NUL character in basic_string


Author: Curtis Smith <curtis@vger.vsin.com>
Date: Thu, 12 Oct 2000 00:28:23 GMT
My reading of the C++ standard leads me to believe that the basic_string
template (string, wstring, &c.) may contain NUL characters (that is, 0
widened to the appropriate type), but some library implementors don't seem
to support such strings.  Is that one of those things that is ludicrous for
me to expect, given the traditional handling of strings in C, or what?

Curtis Smith

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]





Author: Ron Natalie <ron@sensor.com>
Date: Thu, 12 Oct 2000 04:10:05 GMT

Curtis Smith wrote:
>
> My reading of the C++ standard leads me to believe that the basic_string
> template (string, wstring, &c.) may contain NUL characters (that is, 0
> widened to the appropriate type), but some library implementors don't seem
> to support such strings.  Is that one of those things that is ludicrous for
> me to expect, given the traditional handling of strings in C, or what?
>

VC++, the Spar Compiler, and G++ all seem to support it fine:

#include <iostream>
#include <string>
using namespace std;
int main() {
   const char* s = "abcd\0efth";  // string literals are const; 9 characters before the terminator
   string str(s, 9);
   cout << str <<endl;
   cout << (int) str[4] << endl;
}

You do have to be careful, because there are a few places where strings are
converted to or from arrays terminated by a default-initialized element
(i.e., null-terminated), such as some of the constructor overloads and c_str().
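
To make the caution above concrete, here is a minimal sketch contrasting the
operations that take an explicit length with those that scan for a terminator.
The helper names (with_length, without_length, check_conversions) are invented
for this illustration and are not from the post.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: builds a string from a buffer and an explicit length,
// so an embedded NUL is kept.
std::string with_length(const char* p, std::size_t n) {
    return std::string(p, n);
}

// Hypothetical helper: builds a string from a pointer only; the constructor
// scans for the first NUL and stops there.
std::string without_length(const char* p) {
    return std::string(p);
}

void check_conversions() {
    const char raw[] = "abcd\0efth";          // 9 characters plus the terminator

    std::string full = with_length(raw, 9);
    assert(full.size() == 9);
    assert(full[4] == '\0');

    std::string cut = without_length(raw);
    assert(cut.size() == 4);                  // stops at the embedded NUL

    // Appending from a bare pointer also stops at the first NUL...
    std::string a = "x";
    a += "\0y";
    assert(a.size() == 1);

    // ...while the (pointer, count) overload keeps it.
    std::string b = "x";
    b.append("\0y", 2);
    assert(b.size() == 3);
}
```

The rule of thumb this suggests: whenever the data may contain a NUL, prefer
the overload that takes a count.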






Author: Stephen Clamage <stephen.clamage@sun.com>
Date: 2000/10/12
On Thu, 12 Oct 2000 00:28:23 GMT, Curtis Smith <curtis@vger.vsin.com>
wrote:

>My reading of the C++ standard leads me to believe that the basic_string
>template (string, wstring, &c.) may contain NUL characters (that is, 0
>widened to the appropriate type),

Yes.

> but some library implementors don't seem
>to support such strings.

Are you sure the limitation is in the library, and not a consequence
of how you are using the string? For example:
 string s("X\000Y");
 char buf[10];
 strcpy(buf, s.c_str());
The strcpy stops on the embedded null. That is not a limitation of
class string, but a consequence of the semantics of strcpy.
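
A small sketch of the same point, assuming the string is built with an explicit
length so that it really does contain the null. It compares strcpy with a copy
driven by size(). The helper name copy_all is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copies s, including embedded NULs and the final
// terminator, into buf. Returns false if buf is too small.
bool copy_all(const std::string& s, char* buf, std::size_t buf_size) {
    if (s.size() + 1 > buf_size) return false;
    std::memcpy(buf, s.c_str(), s.size() + 1);  // length comes from size(), not from a scan
    return true;
}

void check_copy() {
    std::string s("X\000Y", 3);        // explicit length keeps the NUL
    char a[10];
    char b[10];
    std::memset(a, '#', sizeof a);
    std::memset(b, '#', sizeof b);

    std::strcpy(a, s.c_str());         // copies 'X' and a terminator, then stops
    assert(a[0] == 'X' && a[1] == '\0' && a[2] == '#');

    const bool ok = copy_all(s, b, sizeof b);  // copies 'X', NUL, 'Y' and the terminator
    assert(ok);
    assert(b[0] == 'X' && b[1] == '\0' && b[2] == 'Y' && b[3] == '\0');
}
```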

---
Steve Clamage, stephen.clamage@sun.com







Author: "Richard Peters" <R.A.Peters@Student.tue.nl>
Date: 2000/10/12
"Stephen Clamage" <stephen.clamage@sun.com> wrote in message
news:1k8ausskls36o3spt506d5veul47cdob6d@4ax.com...
> On Thu, 12 Oct 2000 00:28:23 GMT, Curtis Smith <curtis@vger.vsin.com>
> wrote:
>
> Are you sure the limitation is in the library, and not a consequence
> of how you are using the string? For example:
> string s("X\000Y");
> char buf[10];
> strcpy(buf, s.c_str());
> The strcpy stops on the embedded null. That is not a limitation of
> class string, but a consequence of the semantics of strcpy.
>
> ---
The copying already stops at string s("X\0, because the \0 has virtually the
same effect as the closing ". The closing " generates a \0, and a constructor
call such as s("blah") assumes it has reached the end of the string as soon as
it encounters a \0.
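
One way to see this, and to work around it for literals, is to take the length
from the literal's array size instead of scanning for a terminator. This is
only a sketch; from_literal is a hypothetical helper, not a standard library
function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: N is the array size of the literal, which includes the
// terminator the compiler appends, so N - 1 is the number of characters written.
template <std::size_t N>
std::string from_literal(const char (&lit)[N]) {
    return std::string(lit, N - 1);
}

void check_literal() {
    std::string cut("X\000Y");                     // pointer only: stops at the first NUL
    assert(cut.size() == 1);

    std::string full = from_literal("X\000Y");     // keeps 'X', NUL, 'Y'
    assert(full.size() == 3);
    assert(full[1] == '\0' && full[2] == 'Y');
}
```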

Richard Peters









Author: Ron Natalie <ron@sensor.com>
Date: 2000/10/12

Stephen Clamage wrote:
>

>         strcpy(buf, s.c_str());
> The strcpy stops on the embedded null. That is not a limitation of
> class string, but a consequence of the semantics of strcpy.
>
Even before strcpy is called, c_str() is one of the two places in std::string
that depend on char_traits<>::length() (the other being the constructor that
is passed only a charT* and no explicit length). length() marches down the
array looking for a default-initialized element (i.e., null).
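
For reference, a minimal sketch of what char_traits<>::length() reports for a
buffer with an embedded null, and of the pointer-only constructor that relies
on it. It covers only those two; it makes no claim about how c_str() is
implemented. The name scanned_length is made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: number of elements char_traits<char>::length() counts
// before it reaches the first element equal to char().
std::size_t scanned_length(const char* p) {
    return std::char_traits<char>::length(p);
}

void check_length() {
    const char raw[] = "abc\0def";
    assert(sizeof raw == 8);                       // 7 characters plus the terminator
    assert(scanned_length(raw) == 3);              // the scan stops at the embedded NUL
    assert(std::string(raw).size() == scanned_length(raw));
    assert(std::string(raw, 7).size() == 7);       // explicit length: no scan
}
```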







Author: sirwillard@my-deja.com
Date: 2000/10/12
In article <39E5CC07.3BDF4ADB@sensor.com>,
  Ron Natalie <ron@sensor.com> wrote:
>
>
> Stephen Clamage wrote:
> >
>
> >         strcpy(buf, s.c_str());
> > The strcpy stops on the embedded null. That is not a limitation of
> > class string, but a consequence of the semantics of strcpy.
> >
> Even before strcpy is called c_str() is one of the two places in std::string
> that depend on char_traits<>::length() (the other being the constructor which
> is passed only a charT* and no explicit length), which marches down the array
> looking for a default initialized element (i.e., null).

If I'm reading what you wrote correctly, you're wrong.  c_str() does
not (at least it doesn't have to) call char_traits<>::length() and so
is not subject to the "translation error" to which you're referring.
To quote the standard:

"Returns: A pointer to the initial element of an array of length size()
+ 1 whose first size() elements equal the corresponding elements of the
string controlled by *this and whose last element is a null character
specified by charT()."

Any library that fails the following test is non-compliant:

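#include <iostream>
#include <string>

int main()
{
   std::string test("abc\0def", 7);  // Here we *must* specify the size
   const char* pstr = test.c_str();
   // size_type avoids a signed/unsigned comparison with size()
   for (std::string::size_type i = 0; i < test.size(); ++i)
   {
      if (*pstr == 0)
         std::cout << "\\0" << std::endl;   // print the embedded null visibly
      else
         std::cout << *pstr << std::endl;
      ++pstr;
   }
}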

The output should be:
a
b
c
\0
d
e
f
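
The same requirement applies to wstring and any other basic_string
specialization, with charT() as the terminator. A sketch of a check generic
over the character type follows; c_str_matches is a hypothetical name, and the
check compares elements with != for simplicity.

```cpp
#include <cassert>
#include <string>

// Hypothetical check: true if the array returned by c_str() matches the
// controlled sequence element by element and ends with charT().
template <class charT>
bool c_str_matches(const std::basic_string<charT>& s) {
    const charT* p = s.c_str();
    for (typename std::basic_string<charT>::size_type i = 0; i < s.size(); ++i) {
        if (p[i] != s[i]) return false;
    }
    return p[s.size()] == charT();
}

void check_c_str() {
    std::string narrow("abc\0def", 7);
    std::wstring wide(L"abc\0def", 7);
    assert(c_str_matches(narrow));
    assert(c_str_matches(wide));
    assert(narrow.c_str()[3] == '\0' && narrow.c_str()[4] == 'd');
    assert(wide.c_str()[3] == L'\0' && wide.c_str()[4] == L'd');
}
```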

--
William E. Kempf
Software Engineer, MS Windows Programmer


