Topic: basic_string conversions


Author: ncm@netcom.com (Nathan Myers)
Date: 1995/08/21
In article <ncmDDJ83M.Dvu@netcom.com>, Nathan Myers <ncm@netcom.com> wrote:
>In article <MATT.95Aug17235138@physics2.berkeley.edu>,
>Steve F. Cipolli <dscope!sfc@uu.psi.com> wrote:
>> What is the recommended method of converting basic_string's of one
>> parameterized type to another, and from another parameterized type itself?

>Encodings for conversion between wide and narrow strings differ,
>and you should say which encoding you mean.
>
>Unfortunately the C functions stop when they hit a NUL character.
>
>The correct mechanism to use is the locale facilities:

I wrote a bad code example here.  Compare it to the much
better code that follows.

>  std::wstring
>  widen(const std::string& s, const std::locale& loc = std::locale())
>  {
>    using namespace std;
>    const char* from = s.data();
>    const char* from_end = from + s.length();
>    char* to = new wchar_t[s.length()];  // should be big enough.
>    char* to_end = to + s.length();
>    try {
>      typedef codecvt<char,wchar_t,mbstate_t> Cvt;
>      if (use_facet<Cvt>(loc).convert(
>            from, from_end, from,
>            to,   to_end,   to_end, mbstate_t) != codecvt_base::ok)
>        throw runtime_error();
>    } catch (...) {
>      delete [] to;
>      throw;
>    }
>    wstring w(to, to_end);
>    delete [] to;
>    return w;
>  }

I'm kind of embarrassed... To use the language and the standard
library effectively, the code above should have been:

std::wstring
widen(const std::string& s, const std::locale& loc = std::locale())
{
using namespace std;
vector<wchar_t> to(s.length());
const char* from_end;
const wchar_t* to_end;
if (use_facet< codecvt<char,wchar_t,mbstate_t> >(loc).convert(
s.data(), s.data()+s.length(), from_end,
to.begin(), to.end(), to_end, mbstate_t()) != codecvt_base::ok)
throw runtime_error();
return wstring(to, to_end);
}

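This is much shorter and cleaner, and the difference illustrates
a very general principle in designing with exceptions: let an object
(here the vector) own the resource, so that its destructor releases
the buffer on every exit path and no try/catch cleanup is needed.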
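
A minimal sketch of that principle, assuming it refers to letting a
destructor release a resource during stack unwinding instead of
cleaning up by hand in a catch block. The names Buffer, live_buffers
and convert_or_throw are hypothetical and exist only to make the
effect observable; they are not from the posting.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical counter: number of Buffer objects currently alive.
static int live_buffers = 0;

// Hypothetical owner type: the vector member frees its storage in its
// own destructor, so Buffer needs no explicit delete anywhere.
struct Buffer {
  std::vector<wchar_t> data;
  explicit Buffer(std::size_t n) : data(n) { ++live_buffers; }
  ~Buffer() { --live_buffers; }
  Buffer(const Buffer&) = delete;
  Buffer& operator=(const Buffer&) = delete;
};

// Acquires a buffer and then optionally fails.  There is no try/catch:
// if the throw happens, stack unwinding runs ~Buffer() automatically.
void convert_or_throw(std::size_t n, bool fail)
{
  Buffer to(n);
  if (fail)
    throw std::runtime_error("conversion failed");
}
```

Whether the function returns normally or throws, the buffer is
released by the time control leaves convert_or_throw, which is what
the new[]/delete[] version had to arrange by hand.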

>[In case anybody didn't notice, the above code is written to
> work when Standard implementations are available.  Until then,
> you'll need ifdefs, at least.]

Nathan Myers
myersn@roguewave.com
---
[ comp.std.c++ is moderated.  Submission address: std-c++@ncar.ucar.edu.
Contact address: std-c++-request@ncar.ucar.edu.  The moderation policy
is summarized in http://dogbert.lbl.gov/~matt/std-c++/policy.html. ]





Author: ncm@netcom.com (Nathan Myers)
Date: 1995/08/23
In article <ncmDDnGr2.K0E@netcom.com>, Nathan Myers <ncm@netcom.com> wrote:
>> In article <MATT.95Aug17235138@physics2.berkeley.edu>,
>> Steve F. Cipolli <dscope!sfc@uu.psi.com> wrote:
>>> What is the recommended method of converting basic_string's of one
>>> parameterized type to another, and from another parameterized type itself?

In my previous posting, somebody chopped out all the indentation.
Here is the code as it should have appeared.

>std::wstring
>widen(const std::string& s, const std::locale& loc = std::locale())
>{
>  using namespace std;
>  vector<wchar_t> to(s.length());
>  const char* from_end;
>  const wchar_t* to_end;
   mbstate_t st = mbstate_t();  // start in the initial conversion state
>  if (use_facet< codecvt<char,wchar_t,mbstate_t> >(loc).convert(
>              s.data(), s.data()+s.length(), from_end,
>              to.begin(), to.end(), to_end, st) != codecvt_base::ok)
>    throw runtime_error();
>  return wstring(to, to_end);
>}

I hope this is the last try.
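
The code above follows the draft library of the time.  In the
standard as finally published, the facet is declared as
codecvt<internT, externT, stateT>, so the wide type comes first, and
the narrow-to-wide member is named in() and takes the state as its
first argument.  A sketch of the same function against that
interface might look as follows.  It assumes that one wchar_t per
input byte is always enough room, and the error message text is made
up for illustration.

```cpp
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

std::wstring
widen(const std::string& s, const std::locale& loc = std::locale())
{
  typedef std::codecvt<wchar_t, char, std::mbstate_t> Cvt;
  const Cvt& cvt = std::use_facet<Cvt>(loc);

  if (s.empty())
    return std::wstring();

  // Assumption: the output never needs more elements than input bytes.
  std::vector<wchar_t> to(s.length());
  std::mbstate_t st = std::mbstate_t();  // initial conversion state
  const char* from_next = 0;
  wchar_t* to_next = 0;

  const std::codecvt_base::result r = cvt.in(
      st, s.data(), s.data() + s.length(), from_next,
      &to[0], &to[0] + to.size(), to_next);

  if (r != std::codecvt_base::ok)
    throw std::runtime_error("widen: conversion failed");

  // to_next points one past the last wide character written.
  return std::wstring(&to[0], to_next);
}
```

The vector still owns the buffer, so the throw needs no cleanup code.
The result depends on the encoding that loc associates with narrow
strings, which is the point made earlier about saying which encoding
you mean.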







Author: dscope!sfc@uu.psi.com (Steve F. Cipolli)
Date: 1995/08/18
What is the recommended method of converting basic_string's of one
parameterized type to another, and from another parameterized type itself?
To clarify:

char *ntbs = "abc";
wchar_t *ntcws = L"abc";
string cs(ntbs); // basic_string<char>(char *) - OK
wstring ws1(ntcws); // basic_string<wchar_t>(wchar_t *) - OK
wstring ws2(ntbs); // basic_string<wchar_t>(char *) - ????
wstring ws3(cs); // basic_string<wchar_t>(basic_string<char>) - ????

Will the last two constructors work?  If so, how?
If not, is the <cwchar> conversion function btowc from the C library the answer?
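
For reference, in the standard as published basic_string<wchar_t>
has no constructor taking a char* or a basic_string<char>, so the
last two lines above would not compile.  One way to get the
per-character behaviour that btowc offers, but over a whole range
and with an explicit locale, is ctype<wchar_t>::widen.  The sketch
below assumes single-byte characters only: it does not decode
multibyte sequences, for which codecvt is the appropriate facet.
The name widen_chars is hypothetical.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: widens each char independently using
// ctype<wchar_t>::widen.  It works on a [begin, end) range, so an
// embedded NUL does not cut the conversion short.
std::wstring
widen_chars(const std::string& s, const std::locale& loc = std::locale())
{
  std::wstring w(s.length(), L'\0');
  if (!s.empty())
    std::use_facet< std::ctype<wchar_t> >(loc).widen(
        s.data(), s.data() + s.length(), &w[0]);
  return w;
}
```

With this helper the intent of the last constructor could be written
as wstring ws3(widen_chars(cs));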

Stephen Cipolli
Datascope Corp.
sfc@datascope.com

