Topic: String to T conversions - getting it right this time


Author: Matt Fioravante <fmatthew5876@gmail.com>
Date: Sun, 26 Jan 2014 08:25:02 -0800 (PST)

String to T (int, float, etc.) conversions seem like a rather easy task
(aside from floating point round trip issues), and yet for the life of C
and C++ the standard library has consistently failed to provide a decent
interface.

Let's review:

int atoi(const char* s); // and atol, atoll, atof, etc.

What's wrong with this?

   - Returns 0 on parsing failure, so a failed parse cannot be told apart
   from successfully parsing the string "0". This already renders the
   function effectively useless, and we could skip the rest of the bullet
   points right here.
   - It discards leading whitespace, which has several problems of its own:
      - If we want to check whether the string is strictly a numeric
      string, we have to add our own check that the first character is a digit.
      This makes the interface clumsy to use and easy to screw up.
      - std::isspace() is locale dependent and requires an indirect
      function call (try it on gcc.godbolt.org). This makes what could be a very
      simple and inlinable conversion potentially expensive. It also prevents
      constexpr.
      - From a design standpoint, this whitespace handling serves a very
      narrow use case. The function does too many things, which in my opinion
      is bad design. I often do not have whitespace delimited input in my
      projects.
   - Inconsistent floating point coverage: atof() returns a double despite
   its name, and there is no float or long double variant.
   - No support for unsigned types, although this may not actually be a
   problem.
   - Uses a horrible C interface (type suffixes in names) with no overloading
   or template arguments. What function do we use if we want to parse an
   int32_t?
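
A minimal sketch of the ambiguity described in the list above. The helper name atoi_cannot_distinguish is hypothetical and only wraps std::atoi; it shows that the return value alone cannot separate "0" from garbage, nor strict input from input with leading whitespace or trailing junk.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: true when atoi() yields the same value for both
// inputs, i.e. the caller cannot tell them apart from the result alone.
bool atoi_cannot_distinguish(const char* a, const char* b) {
    return std::atoi(a) == std::atoi(b);
}

// Each pair below is indistinguishable through atoi().
void atoi_ambiguity_examples() {
    assert(atoi_cannot_distinguish("0", "abc"));    // failure looks like zero
    assert(atoi_cannot_distinguish("  42", "42"));  // leading whitespace is skipped
    assert(atoi_cannot_distinguish("12abc", "12")); // trailing junk is ignored
}
```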

long strtol(const char* str, char **str_end, int base);

What's wrong with this one?

   - Again it has this silly leading whitespace behavior (see above).
   - It's not obvious how to correctly determine whether or not parsing
   failed. Every time I use this function I have to look it up again to make
   sure I get it exactly right and have covered all of the corner cases.
   - Uses 0/T_MAX/T_MIN to denote errors, when these could be validly
   parsed from strings. Checking whether or not these values were parsed or
   are representing errors is clumsy.
   - Again C interface issues (see above).
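
To make the corner cases above concrete, here is a sketch of what a strict whole-string parse on top of strtol tends to look like. The wrapper name parse_long_strict is hypothetical; the checks (empty input, leading whitespace, no digits consumed, trailing characters, ERANGE) are one reasonable reading of the strtol contract, not the only possible one.

```cpp
#include <cassert>
#include <cctype>
#include <cerrno>
#include <cstdlib>

// Hypothetical wrapper: succeeds only if the entire string is an integer
// that fits in a long. Leaves `out` unmodified on failure.
bool parse_long_strict(const char* s, long& out, int base = 10) {
    if (s == nullptr || *s == '\0') return false;
    // strtol silently skips leading whitespace, so reject it up front.
    if (std::isspace(static_cast<unsigned char>(*s))) return false;
    errno = 0;  // must be cleared, strtol never resets it
    char* end = nullptr;
    const long v = std::strtol(s, &end, base);
    if (end == s) return false;         // no digits were consumed
    if (*end != '\0') return false;     // trailing characters remain
    if (errno == ERANGE) return false;  // v is LONG_MAX/LONG_MIN, not a real result
    out = v;
    return true;
}
```

Five separate checks for a single conversion is the clumsiness the list complains about.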


At this point, I think we are ready to define a new set of int/float
parsing routines.

Design goals:

   - Easy to use, usage is obvious.
   - No assumptions about use cases, we just want to parse strings. This
   means none of this automatic whitespace handling.
   - Efficient and inline
   - constexpr

Here is a first attempt for an integer parsing routine.

// Attempts to parse s as an integer. A valid integer string consists of
// the following:
// * An optional '+' or '-' sign as the first character ('-' is only
//   acceptable for signed integral types).
// * A prefix (0) indicating octal base (applies only when base is 0 or 8).
// * A prefix (0x or 0X) indicating hexadecimal base (applies only when base
//   is 16 or 0).
// * All of the remaining characters MUST be digits.
// Returns true if an integral value was successfully parsed and stores the
// value in val; otherwise returns false and leaves val unmodified.
// Sets errno to ERANGE if the string was an integer but would overflow type
// integral.
template <typename integral>
constexpr bool strto(string_view s, integral& val, int base);

// Same as the previous, except that instead of trying to parse the entire
// string, we only parse the integral part.
// The beginning of the string must be an integer as specified above. Sets
// tail to the remainder of the string after the integral part.
template <typename integral>
constexpr bool strto(string_view s, integral& val, int base, string_view& tail);


First off, all of these return bool, which makes it very easy to check
whether or not parsing failed.

While the interface does not allow this idiom:

int x = atoi(s);

It works with this idiom, which in all of my use cases is much more common:
int val;
if(!strto(s, val, 10)) {
  throw some_error();
}
printf("We parsed %d!\n", val);

Some examples:

int val;
string_view sv = "12345";
assert(strto(sv, val, 10));      // the whole string is an integer
assert(val == 12345);
sv = "123 456";
val = -2;
assert(!strto(sv, val, 10));     // fails: the whole string is not an integer
assert(val == -2);               // val is left unmodified on failure
assert(strto(sv, val, 10, sv));  // parses the leading integer, sv becomes the tail
assert(val == 123);
assert(sv == " 456");
sv.remove_prefix(1);             // chop off the " "
assert(sv == "456");
assert(strto(sv, val, 10));
assert(val == 456);
val = 0;
assert(strto(sv, val, 10, sv));
assert(val == 456);
assert(sv == "");
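
For readers who want to run the examples, the following is a rough sketch of how the two declarations could be implemented. It is not the author's implementation and makes several simplifying assumptions: C++17 std::string_view stands in for the string_view of the post, base must be in 2..36 with no base-0 detection and no 0/0x prefix handling, digits are assumed to be ASCII, and overflow simply returns false instead of setting errno (assigning errno would not be usable in a constant expression). The helper digit_value and the demo function parse_or are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string_view>
#include <type_traits>

// Value of digit character c for bases up to 36, or -1 if c is not a digit.
// Assumes an ASCII-like encoding with contiguous letters.
constexpr int digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'z') return c - 'a' + 10;
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
    return -1;
}

// Parses a leading integer from s; on success stores it in val and the
// unparsed remainder in tail. On failure val and tail are left unmodified.
template <typename Integral>
constexpr bool strto(std::string_view s, Integral& val, int base,
                     std::string_view& tail) {
    static_assert(std::is_integral_v<Integral>, "strto requires an integral type");
    if (base < 2 || base > 36) return false;
    std::size_t i = 0;
    bool negative = false;
    if (i < s.size() && (s[i] == '+' || s[i] == '-')) {
        negative = (s[i] == '-');
        if (negative && !std::is_signed_v<Integral>) return false;
        ++i;
    }
    const std::size_t first_digit = i;
    Integral result = 0;
    for (; i < s.size(); ++i) {
        const int d = digit_value(s[i]);
        if (d < 0 || d >= base) break;
        if (negative) {
            // Accumulate downwards so the minimum value is reachable.
            if constexpr (std::is_signed_v<Integral>) {
                constexpr Integral lo = std::numeric_limits<Integral>::min();
                if (result < (lo + d) / base) return false;  // would underflow
                result = static_cast<Integral>(result * base - d);
            }
        } else {
            constexpr Integral hi = std::numeric_limits<Integral>::max();
            if (result > (hi - d) / base) return false;  // would overflow
            result = static_cast<Integral>(result * base + d);
        }
    }
    if (i == first_digit) return false;  // no digits at all
    val = result;
    tail = s.substr(i);
    return true;
}

// Whole-string variant: succeeds only if nothing is left after the integer.
template <typename Integral>
constexpr bool strto(std::string_view s, Integral& val, int base) {
    Integral tmp = 0;
    std::string_view tail;
    if (!strto(s, tmp, base, tail) || !tail.empty()) return false;
    val = tmp;
    return true;
}

// Hypothetical demo that the sketch is usable in constant expressions.
constexpr int parse_or(std::string_view s, int fallback) {
    int v = fallback;
    strto(s, v, 10);
    return v;
}
static_assert(parse_or("42", -1) == 42);
static_assert(parse_or("4 2", -1) == -1);
```

Passing sv as both the input and the tail, as the examples do, works here because s is taken by value.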


Similarly, we can define this for floating point types. We may also want
null-terminated const char* versions, as converting a const char* to
string_view requires a call to strlen().

--

---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.


.


Author: "dgutson ." <danielgutson@gmail.com>
Date: Sun, 26 Jan 2014 14:54:47 -0200
On Sun, Jan 26, 2014 at 1:25 PM, Matt Fioravante <fmatthew5876@gmail.com> wrote:
> String to T (int, float, etc.) conversions seem like a rather easy task
> (aside from floating point round trip issues), and yet for the life of C and
> C++ the standard library has consistently failed to provide a decent
> interface.
>
> Let's review:

Why didn't you include stringstream in your review? E.g. something
like https://code.google.com/p/mili/source/browse/mili/string_utils.h#267
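
For reference, a stringstream-based conversion could look roughly like the sketch below. This is not the linked mili code; from_string is a hypothetical name, and the strictness choices (classic locale, no whitespace skipping, whole input must be consumed) are assumptions made to match the goals stated in the original post.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: converts the whole string to T using operator>>.
// Leaves `out` unmodified on failure.
template <typename T>
bool from_string(const std::string& s, T& out) {
    std::istringstream in(s);
    in.imbue(std::locale::classic());  // avoid depending on the global locale
    in >> std::noskipws;               // do not silently skip leading whitespace
    T tmp{};
    if (!(in >> tmp)) return false;    // nothing parsed, or value out of range
    // Reject trailing characters: the stream must be exhausted.
    if (in.peek() != std::char_traits<char>::eof()) return false;
    out = tmp;
    return true;
}
```

This gives a usable bool-returning interface, but it constructs a stream per call and needs a std::string, so it does not address the efficiency and constexpr goals.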




--
Who's got the sweetest disposition?
One guess, that's who?
Who'd never, ever start an argument?
Who never shows a bit of temperament?
Who's never wrong but always right?
Who'd never dream of starting a fight?
Who get stuck with all the bad luck?

.


Author: Miro Knejp <miro@knejp.de>
Date: Sun, 26 Jan 2014 18:29:09 +0100


On 26.01.2014 17:25, Matt Fioravante wrote:
> String to T (int, float, etc.) conversions seem like a rather easy
> task (aside from floating point round trip issues), and yet for the
> life of C and C++ the standard library has consistently failed to
> provide a decent interface.
>
> Let's review:
>
> int atoi(const char* s); // and atol, atoll, atof, etc.
>
> What's wrong with this?
>
>   * Returns 0 on parsing failure, making it impossible to parse 0
>     strings. This already renders this function effectively useless
>     and we can skip the rest of the bullet points right here.
>   * It discards leading whitespace, this has several problems of its own:
>       o If we want to check whether the string is strictly a numeric
>         string, we have to add our own check that the first character
>         is a digit. This makes the interface clumsy to use and easy to
>         screw up.
>       o std::isspace() is locale dependent and requires an indirect
>         function call (try it on gcc.godbolt.org). This makes what
>         could be a very simple and inlinable conversion potentially
>         expensive. It also prevents constexpr.
>       o From a design standpoint, this whitespace handling is a very
>         narrow use case. It does too many things and in my opinion is
>         a bad design. I often do not have whitespace delimited input
>         in my projects.
>   * No atod() for doubles or atold() for long doubles.
>   * No support for unsigned types, although this may not actually be a
>     problem.
>   * Uses horrible C interface (type suffixes in names) with no
>     overloading or template arguments. What function do we use if we
>     want to parse an int32_t?
>
> long strtol(const char* str, char **str_end, int base);
>
> What's wrong with this one?
>
>   * Again it has this silly leading whitespace behavior (see above).
>   * It's not obvious how to correctly determine whether or not parsing
>     failed. Every time I use this function I have to look it up again
>     to make sure I get it exactly right and have covered all of the
>     corner cases.
>   * Uses 0/T_MAX/T_MIN to denote errors, when these could be validly
>     parsed from strings. Checking whether or not these values were
>     parsed or are representing errors is clumsy.
>   * Again C interface issues (see above).
>
I am currently facing the same problems while working on my format
proposal:
https://groups.google.com/a/isocpp.org/forum/?fromgroups#!topic/std-proposals/CIlWCTOe5kc
All the existing functions work only on null-terminated strings, which is
totally useless for my use cases, as I am parsing substrings in-place. I
intend to make the string processing code I'm using general enough to be
used independently, but for now I just want to get it working in the
first place.
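
A small sketch of the null-termination problem, using hypothetical helper names and C++17 std::string_view: strtol has no length parameter, so parsing a field in the middle of a buffer either reads past the field or requires a temporary copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <string_view>

// Naive attempt: strtol has no idea where the field ends and keeps reading
// digits. Only valid here if the underlying buffer is null-terminated.
long parse_field_naive(std::string_view text, std::size_t pos) {
    return std::strtol(text.data() + pos, nullptr, 10);
}

// Workaround: copy the field into a std::string purely to obtain a '\0'
// terminator, paying for an allocation and a copy per field.
long parse_field_with_copy(std::string_view text, std::size_t pos,
                           std::size_t len) {
    const std::string copy(text.substr(pos, len));
    return std::strtol(copy.c_str(), nullptr, 10);
}
```

With the input "20140126", the naive version asked for the year at offset 0 returns 20140126, while the copying version returns 2014.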
>
> At this point, I think we are ready to define a new set of int/float
> parsing routines.
>
> Design goals:
>
>   * Easy to use, usage is obvious.
>   * No assumptions about use cases, we just want to parse strings.
>     This means none of this automatic whitespace handling.
>   * Efficient and inline
>   * constexpr
>
> Here is a first attempt for an integer parsing routine.
>
> //Attempts to parse s as an integer. The valid integer string consists
> of the following:
> //* '+' or '-' sign as the first character (- only acceptable for
> signed integral types)
> //* prefix (0) indicating octal base (applies only when base is 0 or 8)
> //* prefix (0x or 0X) indicating hexadecimal base (applies only when
> base is 16 or 0).
> //* All of the rest of the characters MUST be digits.
> //Returns true if an integral value was successfully parsed and stores
> the value in val,
> //otherwise returns false and leaves val unmodified.
> //Sets errno to ERANGE if the string was an integer but would overflow
> type integral.
> template <typename integral>
> constexpr bool strto(string_view s, integral& val, int base);
Please, no ERxxx nonsense. Use optional, expected, exceptions, pairs,
whatever, but no ER codes; that's even more silly C. I currently base
mine on iterators and provide string_view convenience overloads.
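
As an illustration of the optional-returning style, here is a sketch built on std::from_chars and std::optional. Both are C++17 facilities that postdate this thread, so this is not the interface being discussed; parse_integer is a hypothetical name. Note that std::from_chars does not skip whitespace and does not accept a leading '+'.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical wrapper: returns the parsed value only if the whole string
// is an integer that fits in Integral; otherwise returns std::nullopt.
template <typename Integral>
std::optional<Integral> parse_integer(std::string_view s, int base = 10) {
    Integral value{};
    const char* first = s.data();
    const char* last = s.data() + s.size();
    const auto [ptr, ec] = std::from_chars(first, last, value, base);
    if (ec != std::errc{}) return std::nullopt;  // invalid input or out of range
    if (ptr != last) return std::nullopt;        // trailing characters remain
    return value;
}
```

No errno is involved, and a parsed 0 is distinguishable from failure because failure is an empty optional.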
>
> //Same as the previous, except that instead of trying to parse the
> entire string, we only parse the integral part.
> //The beginning of the string must be an integer as specified above.
> Will set tail to point to the end of the string after the integral part.
> template <typename integral>
> constexpr bool strto(string_view s, integral& val, int base,
> string_view& tail);
>
With iterators it could return the iterator to the first element not
part of the integer. A pair<optional<int>, Iter> or similar is a
possibility. Certainly not the best concept, but I'd prefer it to
checking errno any day, and depending on the combination of the optional's
engaged state and the iterator position you can determine whether it failed
and, if so, why. Well, just giving some spontaneous food for thought. I
only very recently started with the number parsing part of my proposal,
so the interface will probably be very unstable for quite a while. And
then there's locales, and together with them a whole new world of
problems...
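
One possible reading of the pair<optional<int>, Iter> idea, as a sketch only: decimal, int-only, ASCII digits, with the hypothetical name parse_int. The convention assumed here is that an empty optional with the iterator still at first means "no integer found", while an empty optional with an advanced iterator means "digits were consumed but the value is out of range".

```cpp
#include <cassert>
#include <limits>
#include <optional>
#include <string>
#include <utility>

template <typename Iter>
std::pair<std::optional<int>, Iter> parse_int(Iter first, Iter last) {
    Iter it = first;
    bool negative = false;
    if (it != last && (*it == '+' || *it == '-')) {
        negative = (*it == '-');
        ++it;
    }
    const Iter digits_begin = it;
    constexpr int lo = std::numeric_limits<int>::min();
    int result = 0;  // accumulated as a non-positive value so `lo` is reachable
    bool overflow = false;
    for (; it != last && *it >= '0' && *it <= '9'; ++it) {
        const int d = *it - '0';
        if (overflow) continue;  // keep consuming digits to find the end
        if (result < (lo + d) / 10) {
            overflow = true;
        } else {
            result = result * 10 - d;
        }
    }
    if (it == digits_begin) return {std::nullopt, first};  // nothing parsed
    if (overflow) return {std::nullopt, it};               // out of range
    if (!negative) {
        // Negating would overflow if the magnitude exceeds the maximum.
        if (result < -std::numeric_limits<int>::max()) return {std::nullopt, it};
        result = -result;
    }
    return {std::optional<int>(result), it};
}

// Usage: parse the leading integer of "123 456" and inspect the position.
void parse_int_usage() {
    const std::string text = "123 456";
    const auto [value, end] = parse_int(text.begin(), text.end());
    assert(value.has_value() && *value == 123);
    assert(end == text.begin() + 3);  // points at the space
}
```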

<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <br>
    <div class="moz-cite-prefix">On 26.01.2014 17:25, Matt
      Fioravante wrote:<br>
    </div>
    I am currently facing the same problems while working on my format
    proposal:
<a class="moz-txt-link-freetext" href="https://groups.google.com/a/isocpp.org/forum/?fromgroups#!topic/std-proposals/CIlWCTOe5kc">https://groups.google.com/a/isocpp.org/forum/?fromgroups#!topic/std-proposals/CIlWCTOe5kc</a><br>
    All the existing functions work only on null-terminated strings,
    which is totally useless for my use cases, as I am parsing substrings
    in-place. I intend to make the string processing code I'm using
    general enough to be used independently, but for now I just want to
    get it working in the first place.<br>
    Please no ERxxx nonsense. optional, expected, exceptions, pairs,
    whatever but no ER codes, that's even more silly C. I currently base
    mine on iterators and provide string_view as convenience overloads.<br>
    <blockquote
      cite=3D"mid:097fb6c8-56f9-433e-a2f0-8c0f69609bf0@isocpp.org"
      type=3D"cite">
      <div dir=3D"ltr">
        <div><br>
        </div>
        <div>//Same as the previous, except that instead of trying to
          parse the entire string, we only parse the integral part.=C2=A0<b=
r>
          //The beginning of the string must be an integer as specified
          above. Will set tail to point to the end of the string after
          the integral part.</div>
        <div>template &lt;typename integral&gt;</div>
        <div>constexpr bool strto(string_view s, integral&amp; val, int
          base, string_view&amp; tail);</div>
        <div><br>
        </div>
      </div>
    </blockquote>
    With iterators it could return the iterator to the first element not
    part of the integer. a pair&lt;optional&lt;int&gt;, Iter&gt; or
    similar is a possibility. Certainly not the best concept but I'd
    prefer it to checking errno anyday and depending on the combination
    of optional's engaged and the iterator position you can determine
    whether it failed and if so, why. Well, just giving some spontaneous
    food for thought. I only very recently started with the number
    parsing part of my proposal, so the interface will probably be very
    unstable for quite a while. And then there's locales, and together
    with them a whole new world of problems...<br>
    <blockquote
      cite=3D"mid:097fb6c8-56f9-433e-a2f0-8c0f69609bf0@isocpp.org"
      type=3D"cite">
      <div dir=3D"ltr">
        <div><br>
        </div>
        <div>First off, all of these return bool which makes it very
          easy to check whether or not parsing failed.</div>
        <div><br>
        </div>
        <div>While the interface does not allow this idiom:</div>
        <div><br>
        </div>
        <div>int x =3D atoi(s);</div>
        <div><br>
        </div>
        <div>It works with this idiom which in all of my use cases is
          much more common:</div>
        <div>int val;</div>
        <div>if(!strto(s, val, 10)) {</div>
        <div>=C2=A0 throw some_error();<br>
          }</div>
        <div>printf("We parsed %d!\n", val);</div>
        <div><br>
        </div>
        <div>Some examples:</div>
        <div><br>
        </div>
        <div>int val;</div>
        <div>string_view sv=3D "12345";</div>
        <div>assert(strto(sv, val, 10));</div>
        <div>assert(val =3D=3D 12345);</div>
        <div>sv =3D "123 456";</div>
        <div>val =3D -2;</div>
        <div>assert(!strto(sv, val, 10));</div>
        <div>assert(val =3D=3D -2);</div>
        <div>assert(strto(sv, val, 10, sv));</div>
        <div>assert(val =3D=3D 123);</div>
        <div>assert(sv =3D=3D " 456");</div>
        <div>sv.remove_prefix(1); //chop off the " ";</div>
        <div>assert(sv =3D=3D "456");</div>
        <div>assert(strto(sv, val, 10));</div>
        <div>assert(val =3D=3D 456);</div>
        <div>val =3D 0;</div>
        <div>assert(strto(sv, val, 10, sv));</div>
        <div>assert(val =3D=3D 456);</div>
        <div>assert(sv =3D=3D "");</div>
        <div><br>
        </div>
        <div><br>
        </div>
        <div>Similarly we can define this for floating point types. We
          may also want null terminated const char* versions as
          converting a const char* to string_view requires a call to
          strlen().=C2=A0</div>
      </div>
    </blockquote>
    <br>
  </body>
</html>


--------------010308020707040100040002--


.


Author: Thiago Macieira <thiago@macieira.org>
Date: Sun, 26 Jan 2014 10:29:01 -0800
Raw View
On Sunday, 26 January 2014 08:25:02, Matt Fioravante wrote:
> At this point, I think we are ready to define a new set of int/float
> parsing routines.
>
> Design goals:
>
>    - Easy to use, usage is obvious.

I'm sure that everyone intended that for their functions. That survived only
until the first round of feedback or encounter with reality...

>    - No assumptions about use cases, we just want to parse strings. This
>    means none of this automatic whitespace handling.

Fair enough. It's easier to compose with other space checkers if you need to
than to remove functionality.

>    - Efficient and inline
>    - constexpr

Efficient, definitely. Inline and constexpr? Forget it, it can't be done. Have
you ever looked at the source of a string-to-double function? They're huge!
This might be left as a suggestion to compilers to implement this as an
intrinsic.

> //Attempts to parse s as an integer. The valid integer string consists of
> the following:
> //* '+' or '-' sign as the first character (- only acceptable for signed
> integral types)

But no U+2212?

> //* prefix (0) indicating octal base (applies only when base is 0 or 8)
> //* prefix (0x or 0X) indicating hexadecimal base (applies only when base
> is 16 or 0).
> //* All of the rest of the characters MUST be digits.

Where, by "digits", we understand the regular ASCII digits 0 to 9 and the
letters that compose digits on this base, both in uppercase and lowercase.

> //Returns true if an integral value was successfully parsed and stores the
> value in val,
> //otherwise returns false and leaves val unmodified.
> //Sets errno to ERANGE if the string was an integer but would overflow type
> integral.

What if it failed to parse? What's the return condition?

As others have said, using errno is too C, but then again this kind of
function should be done in conjunction with the C people. Any improvements we
need, they probably need too.

> template <typename integral>
> constexpr bool strto(string_view s, integral& val, int base);

Replace string_view with a pair of InputIterators.

Do you know what this means? Parsing char16_t, char32_t and wchar_t too.

> //Same as the previous, except that instead of trying to parse the entire
> string, we only parse the integral part.
> //The beginning of the string must be an integer as specified above. Will
> set tail to point to the end of the string after the integral part.
> template <typename integral>
> constexpr bool strto(string_view s, integral& val, int base, string_view&
> tail);

Same as above.

> First off, all of these return bool which makes it very easy to check
> whether or not parsing failed.

That's the opposite of what most people want. Most people want to get the
parsed number, not whether it succeeded or failed. Maybe invert the logic?

That's what we do for {QString,QByteArray,QLocale}::to{Int,Double,etc.}. And
one suggestion I received a few weeks ago was to add the overload that returns
the end pointer and does not fail if there's more stuff after it.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358


.


Author: "Vicente J. Botet Escriba" <vicente.botet@wanadoo.fr>
Date: Sun, 26 Jan 2014 22:26:57 +0100
Raw View
This is a multi-part message in MIME format.
--------------090906060007080403000503
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: quoted-printable

On 26/01/14 18:29, Miro Knejp wrote:
>
> On 26.01.2014 17:25, Matt Fioravante wrote:
>> string to T (int, float, etc..) conversions seem like a rather easy
>> task (aside from floating point round trip issues), and yet for the
>> life of C and C++ the standard library has consistently failed to
>> provide a decent interface.
>>
>
>>
>> At this point, I think we are ready to define a new set of int/float
>> parsing routines.
>>
>> Design goals:
>>
>>   * Easy to use, usage is obvious.
>>   * No assumptions about use cases, we just want to parse strings.
>>     This means none of this automatic whitespace handling.
>>   * Efficient and inline
>>   * constexpr
>>
>> Here is a first attempt for an integer parsing routine.
>>
>> //Attempts to parse s as an integer. The valid integer string
>> consists of the following:
>> //* '+' or '-' sign as the first character (- only acceptable for
>> signed integral types)
>> //* prefix (0) indicating octal base (applies only when base is 0 or 8)
>> //* prefix (0x or 0X) indicating hexadecimal base (applies only when
>> base is 16 or 0).
>> //* All of the rest of the characters MUST be digits.
>> //Returns true if an integral value was successfully parsed and
>> stores the value in val,
>> //otherwise returns false and leaves val unmodified.
>> //Sets errno to ERANGE if the string was an integer but would
>> overflow type integral.
>> template <typename integral>
>> constexpr bool strto(string_view s, integral& val, int base);
> Please no ERxxx nonsense. optional, expected, exceptions, pairs,
> whatever but no ER codes, that's even more silly C.
+1

Vicente


--------------090906060007080403000503--

.


Author: Matt Fioravante <fmatthew5876@gmail.com>
Date: Sun, 26 Jan 2014 14:10:53 -0800 (PST)
Raw View
------=_Part_1392_31554110.1390774253421
Content-Type: text/plain; charset=UTF-8

On Sunday, January 26, 2014 11:54:47 AM UTC-5, dgutson . wrote:
>
>
> Why didn't you include stringstream in your review? E.g. something
> like https://code.google.com/p/mili/source/browse/mili/string_utils.h#267
>

Sure, I'll review it right now.
Stringstream is slow. It is also painful to use, as you are forced to have
your string stored within a stringstream object. This makes it even slower,
as you have to copy your string data into a string stream, possibly also
allocating memory. It's a non-starter.

On Sunday, January 26, 2014 12:29:09 PM UTC-5, Miro Knejp wrote:
>
> I am currently facing the same problems while working on my format
> proposal:
> https://groups.google.com/a/isocpp.org/forum/?fromgroups#!topic/std-proposals/CIlWCTOe5kc
> All the existing functions work on null-terminated strings only which is
> totally useless for my use cases
>

Agreed, this is a completely unacceptable restriction. One day I hope null
terminated strings will just go away and we will all use string_view.


> as I am parsing substrings in-place. I intend to design the string
> processing stuff that I'm using general enough so it can be used
> independently but for now I just want to make it work in the first place.
>
>
> Please no ERxxx nonsense. optional, expected, exceptions, pairs, whatever
> but no ER codes, that's even more silly C. I currently base mine on
> iterators and provide string_view as convenience overloads.
>

Fair enough, errno does seem to be pretty loathed. The question then is
how do you tell the user why it failed? Does the user even care?
Perhaps instead of returning bool, you do the old style of returning an
int. 0 means success and different non-zero values can be used to represent
different reasons for failure such as parsing errors and range errors. We
can reuse errno tags for the return value or create an enum.

int rc;
if((rc = strto(s, val, 10)) != 0) {
  if(rc == ERANGE) {
    printf("Out of range!\n");
  } else {
    printf("Parsing error!\n");
  }
}
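
The "create an enum" variant could look roughly like the following. This is only a sketch under assumptions: the names parse_status and digit_value are hypothetical, the 0/0x prefix handling from the proposal is left out for brevity, and ASCII letter ordering and two's-complement integers are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string_view>
#include <type_traits>

// Hypothetical status enum standing in for "errno tags or an enum".
enum class parse_status { ok = 0, invalid, out_of_range };

// Hypothetical helper: digit value of c, or -1 if c is not a digit.
// Assumes contiguous letters (true for ASCII).
constexpr int digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'z') return c - 'a' + 10;
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
    return -1;
}

// Parses the whole of s; leaves val unmodified on any failure.
template <typename Integral>
constexpr parse_status strto(std::string_view s, Integral& val, int base) {
    static_assert(std::is_integral_v<Integral> &&
                  !std::is_same_v<Integral, bool>);
    using U = unsigned long long;
    if (base < 2 || base > 36) return parse_status::invalid;

    std::size_t i = 0;
    bool negative = false;
    if (i < s.size() && (s[i] == '+' || s[i] == '-')) {
        negative = (s[i] == '-');
        if (negative && !std::is_signed_v<Integral>) return parse_status::invalid;
        ++i;
    }
    if (i == s.size()) return parse_status::invalid;  // no digits at all

    // Largest magnitude that fits: max for positive, max + 1 for negative.
    const U limit = negative ? U(std::numeric_limits<Integral>::max()) + 1
                             : U(std::numeric_limits<Integral>::max());
    U mag = 0;
    bool overflow = false;
    for (; i < s.size(); ++i) {
        const int d = digit_value(s[i]);
        if (d < 0 || d >= base) return parse_status::invalid;
        if (overflow) continue;  // keep validating the remaining characters
        if (mag > (limit - U(d)) / U(base)) overflow = true;
        else mag = mag * U(base) + U(d);
    }
    if (overflow) return parse_status::out_of_range;

    if (negative) {
        val = (mag == limit)
                  ? std::numeric_limits<Integral>::min()
                  : static_cast<Integral>(-static_cast<long long>(mag));
    } else {
        val = static_cast<Integral>(mag);
    }
    return parse_status::ok;
}
```

Since nothing here touches errno or locales, the function can be constexpr, and the caller can still tell a range error apart from a malformed string by looking at the return value alone.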

Exceptions are possible but rather heavy weight. Constructing an exception
usually means also constructing a string error message. Not only do you
have to pay for the allocation of this string, it may not match the kind of
error reporting you'd like to provide to the caller, if any at all. The
exception would also need to provide a kind of error code for quickly
detecting why the conversion failed if you want to specially handle
different failure modes.

In many cases users might want to throw an exception but we should not
force that because exceptions can be too expensive for red hot parsing
code. We might want to also add strto_e() or something that simply wraps
strto() and throws an exception for convenience when throwing on parsing
error makes sense.

> With iterators it could return the iterator to the first element not part
> of the integer. a pair<optional<int>, Iter> or similar is a possibility.
>

Seems rather complicated to stuff all of that into the return value no?
Returning a std::optional<int> would be ok because you can directly check
the return value with operator bool(). It still doesn't provide information
on why the failure occurred though.
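
For comparison, here is a rough sketch of the pair<optional<int>, Iter> idea. It is an illustration only: parse_prefix is a hypothetical name, and it leans on std::from_chars from <charconv>, a C++17 facility that postdates this thread, simply because it already has the "no whitespace, no locale" behaviour being discussed.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>
#include <utility>

// Hypothetical helper: parses an integer at the start of s.
// Returns the value (if any) plus a pointer to the first unconsumed character.
template <typename Integral>
std::pair<std::optional<Integral>, const char*>
parse_prefix(std::string_view s, int base = 10) {
    Integral value{};
    const char* first = s.data();
    const char* last = s.data() + s.size();
    auto [ptr, ec] = std::from_chars(first, last, value, base);
    if (ec != std::errc{}) {
        // Disengaged optional: ptr == first means "no number here",
        // ptr > first means digits were found but the value did not fit.
        return {std::nullopt, ptr};
    }
    return {value, ptr};
}
```

With this shape the reason for a failure is recoverable from the combination of the disengaged optional and the returned position, at the cost of a more awkward call site than a plain bool.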

I kind of like my idea of having an overload where you pass a
string_view/iterator by reference to get the end position if you want it.
Not all out parameters have to be in the return value. Return values make it
easy to write expressions. Using out parameters is perfectly fine too. This
also makes it very easy to parse a string that is supposed to exactly match a
number.

> Certainly not the best concept but I'd prefer it to checking errno any day
> and depending on the combination of optional's engaged and the iterator
> position you can determine whether it failed and if so, why. Well, just
> giving some spontaneous food for thought. I only very recently started with
> the number parsing part of my proposal, so the interface will probably be
> very unstable for quite a while. And then there's locales, and together
> with them a whole new world of problems...
>

std::isdigit() at least on gcc linux is inlined, so it looks like digits
don't require locales for the int conversions. Floating point will require
more careful handling with the comma vs period.

On Sunday, January 26, 2014 1:29:01 PM UTC-5, Thiago Macieira wrote:
>
>
> I'm sure that everyone intended that for their functions. That survived
> only until the first round of feedback or encounter with reality...
>

That's probably true, although I still can't imagine the thought process
that went into the 0 return value for atoi(), or gets() (but that's another
story).


> >    - No assumptions about use cases, we just want to parse strings. This
> >    means none of this automatic whitespace handling.
>
> Fair enough. It's easier to compose with other space checkers if you need
> to than to remove functionality.
>

Better to add functionality than to have to remove stuff you don't want.

>
> >    - Efficient and inline
> >    - constexpr
>
> Efficient, definitely. Inline and constexpr? Forget it, it can't be done.
> Have you ever looked at the source of a string-to-double function? They're
> huge! This might be left as a suggestion to compilers to implement this as
> an intrinsic.
>
It might be nice to have compile time string -> double conversions, but I
agree that for floating point it's a huge, complicated problem.
http://www.exploringbinary.com/how-strtod-works-and-sometimes-doesnt/

For int conversions, inline/constexpr might be doable (but not if we're
using errno).


> > //Attempts to parse s as an integer. The valid integer string consists
> of
> > the following:
> > //* '+' or '-' sign as the first character (- only acceptable for signed
> > integral types)
>
> But no U+2212?
>

We could consider unicode as well. That's a good question.

>
> > //* prefix (0) indicating octal base (applies only when base is 0 or 8)
> > //* prefix (0x or 0X) indicating hexadecimal base (applies only when
> base
> > is 16 or 0).
> > //* All of the rest of the characters MUST be digits.
>
> Where, by "digits", we understand the regular ASCII digits 0 to 9 and the
> letters that compose digits on this base, both in uppercase and lowercase.
>

Yes that's right.

Maybe we should add an extra boolean argument (defaulted to true) that can be
used to disable the hex and octal prefixes. Sometimes you really want to
just parse a hex string without the 0x prefix. Adding an extra false to the
parameter list is nicer than doing this check yourself. It's similar to
disabling the leading whitespace check of strtol().


> > //Returns true if an integral value was successfully parsed and stores
> the
> > value in val,
> > //otherwise returns false and leaves val unmodified.
> > //Sets errno to ERANGE if the string was an integer but would overflow
> type
> > integral.
>
> What if it failed to parse? What's the return condition?
>
> As others have said, using errno is too C, but then again this kind of
> function should be done in conjunction with the C people. Any improvements
> we
> need, they probably need too.
>

With overloading, templates, iterators, string_view, etc., it's not so C
compatible. Do we really care so much anyway? I don't like the idea of
handicapping C++ interfaces in the name of C compatibility.

As mentioned above, it's a question of how to tell the user why the failure
occurred. If not through errno then it must be through the return value.

>
> > template <typename integral>
> > constexpr bool strto(string_view s, integral& val, int base);
>
> Replace string_view with a pair of InputIterators.
>

Agree, although I'd still want a string_view wrapper for convenience.


>
> Do you know what this means? Parsing char16_t, char32_t and wchar_t too.
>

Yes, but that's not so difficult.


>
> > //Same as the previous, except that instead of trying to parse the
> entire
> > string, we only parse the integral part.
> > //The beginning of the string must be an integer as specified above.
> Will
> > set tail to point to the end of the string after the integral part.
> > template <typename integral>
> > constexpr bool strto(string_view s, integral& val, int base,
> string_view&
> > tail);
>
> Same as above.
>
> > First off, all of these return bool which makes it very easy to check
> > whether or not parsing failed.
>
> That's the opposite of what most people want. Most people want to get the
> parsed number, not whether it succeeded or failed. Maybe invert the logic?
>

I think that's a voting/bikeshed question (or use std::optional). I much
prefer the pass/fail in the return value because I can wrap the call to
strto and error check in a single line. Any code related to parsing is
always very heavy with error checking conditionals.

This:

int x;
if(!strto(s, x, 10)) {
  throw error;
}

vs. this:

bool rc;
int x = strto(s, rc, 10);
if(!rc) {
  throw error;
}

The first version is more compact and in my opinion easier to read. Parsing
code almost always requires very careful error handling, unless you know a
priori at compile time that the string really is an integer string and you
just need a conversion (which is an incredibly rare case).

The boolean return value also emphasizes that you must be diligent about
checking user input and I'd argue it encourages this behavior for novice
programmers. It's very easy for a beginner to just write x = atoi(s); and
move on, only to later track down some bug that shows up somewhere else
because they didn't check the result.

The only thing you're buying with the value itself being returned is being
able to use strto() in an expression.

bool rc;
int y = (x + (5 * strto(s, rc, 10))) / 25;
if(!rc) {
  throw error;
}

I don't like the above idiom for several reasons:

   - The error checking comes after the whole expression, forcing the
   person reading the code to expend more mental energy to link the error
   check with the strto() call buried in all of that math. Multiple strto()
   calls in the same expression with multiple boolean variables is even more
   fun. Error checking should occur right after the thing being checked or
   even better within a single expression. Also it may be more efficient to
   avoid the unnecessary math in the case of the error (although the compiler
   is likely to reorder for you anyway).
   - This kind of expression is actually somewhat dangerous for floating
   point. The result coming from a strto<float>() parsing error could cause a
   signaling NaN to be generated within the following expression and then
   throw an exception the programmer was not expecting. More fun debugging
   that one. You can avoid this but again that means the programmer has to
   think before using strto(). strtol() and friends already require too much
   thinking as I've already explained.




> That's what we do for {QString,QByteArray,QLocale}::to{Int,Double,etc.}.
> And one suggestion I received a few weeks ago was to add the overload that
> returns the end pointer and does not fail if there's more stuff after it.
>

That's pretty much the same as my second overload. I think both use cases
are very valid. In many situations you have a long string and you want to
parse the int at the beginning and then continue on with whatever is
supposed to be after it. Finding the position after the int of course
requires parsing out the int.
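
As an illustration of that "parse the int, then carry on" use case, the tail overload might be used like this. This is a sketch under assumptions: strto here is a hypothetical stand-in built on C++17's std::from_chars (not available at the time of this thread), and parse_csv_ints is an invented example, not part of any proposal.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>
#include <utility>
#include <vector>

// Hypothetical stand-in for the proposed tail overload: parses an integer
// at the start of s and sets tail to whatever follows it.
template <typename Integral>
bool strto(std::string_view s, Integral& val, int base, std::string_view& tail) {
    Integral tmp{};
    auto [ptr, ec] = std::from_chars(s.data(), s.data() + s.size(), tmp, base);
    if (ec != std::errc{}) return false;  // val and tail stay unmodified
    val = tmp;
    tail = s.substr(static_cast<std::size_t>(ptr - s.data()));
    return true;
}

// Invented example: parses "10,20,30" with strict comma separators.
// Leaves out untouched unless the whole string parses.
bool parse_csv_ints(std::string_view s, std::vector<int>& out) {
    std::vector<int> result;
    for (;;) {
        int v = 0;
        if (!strto(s, v, 10, s)) return false;  // s advances past the number
        result.push_back(v);
        if (s.empty()) break;
        if (s.front() != ',') return false;     // the caller decides what separates fields
        s.remove_prefix(1);
    }
    out = std::move(result);
    return true;
}
```

Because the parser itself never skips whitespace, an input such as "10, 20" is rejected here; a caller who wants to allow spaces can strip them explicitly before the next strto() call.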



------=_Part_1392_31554110.1390774253421
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr">On Sunday, January 26, 2014 11:54:47 AM UTC-5, dgutson . w=
rote:<blockquote class=3D"gmail_quote" style=3D"margin: 0px 0px 0px 0.8ex; =
border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-=
style: solid; padding-left: 1ex;"><br>Why didn't you include stringstream i=
n your review? E.g. something&nbsp;<br>like&nbsp;<a href=3D"https://code.go=
ogle.com/p/mili/source/browse/mili/string_utils.h#267" target=3D"_blank">ht=
tps://code.google.com/p/<wbr>mili/source/browse/mili/<wbr>string_utils.h#26=
7</a>&nbsp;<br></blockquote><div><br></div><div>Sure, I'll review it right =
now.</div><div>Stringstream is slow. It is also painful to use as you are f=
orced to have your string stored within a stringstream object. This makes i=
t even slower as you have to copy your string data into a string stream, po=
ssibly also allocating memory. Its a non-starter.</div><div><br></div><div>=
On Sunday, January 26, 2014 12:29:09 PM UTC-5, Miro Knejp wrote:<blockquote=
 class=3D"gmail_quote" style=3D"margin: 0px 0px 0px 0.8ex; border-left-widt=
h: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; pa=
dding-left: 1ex;"><div text=3D"#000000" bgcolor=3D"#FFFFFF">I am currently =
facing the same problems while working on my format proposal:&nbsp;<a href=
=3D"https://groups.google.com/a/isocpp.org/forum/?fromgroups#!topic/std-pro=
posals/CIlWCTOe5kc" target=3D"_blank">https://groups.google.com/a/<wbr>isoc=
pp.org/forum/?fromgroups#!<wbr>topic/std-proposals/<wbr>CIlWCTOe5kc</a><br>=
All the existing functions work on null-terminated strings only which is to=
tally useless for my use cases</div></blockquote><div><br></div><div>Agree =
this is a completely unacceptable restriction. One day I hope null terminat=
ed strings will just go away and we will all use string_view.</div><div>&nb=
sp;</div><blockquote class=3D"gmail_quote" style=3D"margin: 0px 0px 0px 0.8=
ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-l=
eft-style: solid; padding-left: 1ex;"><div text=3D"#000000" bgcolor=3D"#FFF=
FFF"> as I am parsing substrings in-place. I intend to design the string pr=
ocessing stuff that I'm using general enough so it can be used independentl=
y but for now I just want to make it work in the first place.<br><blockquot=
e type=3D"cite"><div dir=3D"ltr"><div><br></div></div></blockquote>Please n=
o ERxxx nonsense. optional, expected, exceptions, pairs, whatever but no ER=
 codes, that's even more silly C. I currently base mine on iterators and pr=
ovide string_view as convenience overloads.<br></div></blockquote><div><br>=
</div><div>Fair enough, errno does seem to be pretty loathed.. The question=
 then is how do you know tell the user why it failed? Does the user even ca=
re? Perhaps instead of returning bool, you do the old style of returning an=
 int. 0 means success and different non-zero values can be used to represen=
t different reasons for failure such as parsing errors and range errors. We=
 can reuse errno tags for the return value or create an enum.</div><div><br=
></div><div>if((rc =3D strto(s, val, 10) !=3D 0) {<br>&nbsp; if(rc =3D=3D E=
RANGE) {<br>&nbsp; &nbsp; printf("Out of range!\n"); &nbsp;</div><div>&nbsp=
;} else {<br>&nbsp; &nbsp; printf("Parsing error!\n");&nbsp;</div><div>&nbs=
p;}</div><div>}</div><div><br></div><div>Exceptions are possible but rather=
 heavy weight. Constructing an exception usually means also constructing a =
string error message. Not only do you have to pay for the allocation of thi=
s string, it may not match the kind of error reporting you'd like to provid=
e to the caller, if any at all. The exception would also need to provide a =
kind of error code for quickly detecting why the conversion failed if you w=
ant to specially handle different failure modes.</div><div><br></div><div>I=
n many cases users might want to throw an exception but we should not force=
 that because exceptions can be too expensive for red hot parsing code. We =
might want to also add strto_e() or something that simply wraps strto() and=
 throws an exception for convenience when throwing on parsing error makes s=
ense.</div><div><br></div><blockquote class=3D"gmail_quote" style=3D"margin=
: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 20=
4, 204); border-left-style: solid; padding-left: 1ex;"><div text=3D"#000000=
" bgcolor=3D"#FFFFFF">With iterators it could return the iterator to the fi=
rst element not part of the integer. a pair&lt;optional&lt;int&gt;, Iter&gt=
; or similar is a possibility. </div></blockquote><div><br></div><div>Seems=
 rather complicated to stuff all of that into the return value no? Returnin=
g a std::optional&lt;int&gt; would be ok because you can directly check the=
 return value with operator bool(). It still doesn't provide information on=
 why the failure occurred though.</div><div><br></div><div>I kind of like m=
y idea of having an overload where you pass a string_view/iterator by refer=
ence to get the end position if you want it. Not all out parameters have to=
 be in the return value. Return values make it easy to write expressions. U=
sing out parameters is perfectly fine too. This also makes it very easy to =
parse a string that is supposed to exactly match a number.</div><div><br></div><b=
lockquote class=3D"gmail_quote" style=3D"margin: 0px 0px 0px 0.8ex; border-=
left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: =
solid; padding-left: 1ex;"><div text=3D"#000000" bgcolor=3D"#FFFFFF">Certai=
nly not the best concept but I'd prefer it to checking errno anyday and dep=
ending on the combination of optional's engaged and the iterator position y=
ou can determine whether it failed and if so, why. Well, just giving some s=
pontaneous food for thought. I only very recently started with the number p=
arsing part of my proposal, so the interface will probably be very unstable=
 for quite a while. And then there's locales, and together with them a whol=
e new world of problems...<br></div></blockquote><div><br></div><div>std::i=
sdigit() at least on gcc linux is inlined, so it looks like digits don't re=
quire locales for the int conversions. Floating point will require more car=
eful handling with the comma vs period.</div></div><br>On Sunday, January 2=
6, 2014 1:29:01 PM UTC-5, Thiago Macieira wrote:<blockquote class=3D"gmail_=
quote" style=3D"margin: 0;margin-left: 0.8ex;border-left: 1px #ccc solid;pa=
dding-left: 1ex;">
<br>I'm sure that everyone intended that for their functions. That survived=
 only=20
<br>until the first round of feedback or encounter with reality...<br></blo=
ckquote><div><br></div><div>That's probably true, although I still can't im=
agine the thought process that went into the 0 return value for atoi(), or =
gets() (but that's another story).</div><div>&nbsp;</div><blockquote class=
=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8ex;border-left: 1px #cc=
c solid;padding-left: 1ex;">&gt; &nbsp; &nbsp;- No assumptions about use ca=
ses, we just want to parse strings. This
<br>&gt; &nbsp; &nbsp;means none of this automatic whitespace handling.
<br>
<br>Fair enough. It's easier to compose with other space checkers if you ne=
ed to=20
<br>than to remove functionality.
<br></blockquote><div><br></div><div>Better to add functionality than have =
to remove stuff you don't want.</div><blockquote class=3D"gmail_quote" styl=
e=3D"margin: 0;margin-left: 0.8ex;border-left: 1px #ccc solid;padding-left:=
 1ex;">
<br>&gt; &nbsp; &nbsp;- Efficient and inline
<br>&gt; &nbsp; &nbsp;- constexpr
<br>
<br>Efficient, definitely. Inline and constexpr? Forget it, it can't be don=
e. Have=20
<br>you ever looked at the source of a string-to-double function? They're h=
uge!=20
<br>This might be left as a suggestion to compilers to implement this as an=
=20
<br>intrinsic.
<br></blockquote><div>It might be nice to have compile time string -&gt; do=
uble conversions, but I agree for floating point it's a huge complicated pro=
blem. &nbsp;</div><div><a href=3D"http://www.exploringbinary.com/how-strtod=
-works-and-sometimes-doesnt/">http://www.exploringbinary.com/how-strtod-wor=
ks-and-sometimes-doesnt/</a><br></div><div><br></div><div>For int conversio=
ns, inline/constexpr might be doable (but not if we're using errno).</div><=
div><br></div><blockquote class=3D"gmail_quote" style=3D"margin: 0;margin-l=
eft: 0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">
<br>&gt; //Attempts to parse s as an integer. The valid integer string cons=
ists of
<br>&gt; the following:
<br>&gt; //* '+' or '-' sign as the first character (- only acceptable for =
signed
<br>&gt; integral types)
<br>
<br>But no U+2212?
<br></blockquote><div><br></div><div>We could consider unicode as well. Tha=
t's a good question.&nbsp;</div><blockquote class=3D"gmail_quote" style=3D"=
margin: 0;margin-left: 0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;=
">
<br>&gt; //* prefix (0) indicating octal base (applies only when base is 0 =
or 8)
<br>&gt; //* prefix (0x or 0X) indicating hexadecimal base (applies only wh=
en base
<br>&gt; is 16 or 0).
<br>&gt; //* All of the rest of the characters MUST be digits.
<br>
<br>Where, by "digits", we understand the regular ASCII digits 0 to 9 and t=
he=20
<br>letters that compose digits on this base, both in uppercase and lowerca=
se.
<br></blockquote><div><br></div><div>Yes that's right.&nbsp;</div><div><br>=
</div><div>Maybe we should add an extra boolean argument (defaulted to true=
) that can be used to disable the hex and octal prefixes. Sometimes you really =
want to just parse a hex string without the 0x prefix. Adding an extra fals=
e to the parameter list is nicer than doing this check for yourself. It's si=
milar to disabling the leading whitespace check of strtol().</div><div><br>=
</div><blockquote class=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8=
ex;border-left: 1px #ccc solid;padding-left: 1ex;">
<br>&gt; //Returns true if an integral value was successfully parsed and st=
ores the
<br>&gt; value in val,
<br>&gt; //otherwise returns false and leaves val unmodified.
<br>&gt; //Sets errno to ERANGE if the string was an integer but would over=
flow type
<br>&gt; integral.
<br>
<br>What if it failed to parse? What's the return condition?
<br>
<br>As others have said, using errno is too C, but then again this kind of=
=20
<br>function should be done in conjunction with the C people. Any improveme=
nts we=20
<br>need, they probably need too.
<br></blockquote><div><br></div><div>With overloading, templates, iterators=
, string_view, etc., it's not so C compatible. Do we really care so much any=
way? I don't like the idea of handicapping C++ interfaces in the name of C =
compatibility.</div><div><br></div><div>As mentioned above, it's a question =
of how to tell the user why the failure occurred. If not through errno then=
 it must be through the return value.&nbsp;</div><blockquote class=3D"gmail=
_quote" style=3D"margin: 0;margin-left: 0.8ex;border-left: 1px #ccc solid;p=
adding-left: 1ex;">
<br>&gt; template &lt;typename integral&gt;
<br>&gt; constexpr bool strto(string_view s, integral&amp; val, int base);
<br>
<br>Replace string_view with a pair of InputIterators.
<br></blockquote><div><br></div><div>Agree, although I'd still want a strin=
g_view wrapper for convenience.</div><div>&nbsp;</div><blockquote class=3D"=
gmail_quote" style=3D"margin: 0;margin-left: 0.8ex;border-left: 1px #ccc so=
lid;padding-left: 1ex;">
<br>Do you know what this means? Parsing char16_t, char32_t and wchar_t too=
..
<br></blockquote><div><br></div><div>Yes, but that's not so difficult.</div=
><div>&nbsp;</div><blockquote class=3D"gmail_quote" style=3D"margin: 0;marg=
in-left: 0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">
<br>&gt; //Same as the previous, except that instead of trying to parse the=
 entire
<br>&gt; string, we only parse the integral part.
<br>&gt; //The beginning of the string must be an integer as specified abov=
e. Will
<br>&gt; set tail to point to the end of the string after the integral part=
..
<br>&gt; template &lt;typename integral&gt;
<br>&gt; constexpr bool strto(string_view s, integral&amp; val, int base, s=
tring_view&amp;
<br>&gt; tail);
<br>
<br>Same as above.
<br>
<br>&gt; First off, all of these return bool which makes it very easy to ch=
eck
<br>&gt; whether or not parsing failed.
<br>
<br>That's the opposite of what most people want. Most people want to get t=
he=20
<br>parsed number, not whether it succeeded or failed. Maybe invert the logi=
c?
<br></blockquote><div><br></div><div>I think that's a voting/bikeshed quest=
ion (or use std::optional). I much prefer the pass/fail in the return value=
 because I can wrap&nbsp;<span style=3D"font-size: 13px;">the call to strto=
 and error check in a single line. Any code related to parsing is always ve=
ry heavy with error checking conditionals.</span></div><div><br></div><div>=
This:</div><div><br></div><div>int x;</div><div>if(!strto(s, x, 10)) {<br>&=
nbsp; throw error;</div><div>}&nbsp;</div><div><br></div><div>vs. this:</div=
><div><br></div><div>bool rc;</div><div>int x =3D strto(s, rc, 10);</div><d=
iv>if(!rc) {<br>&nbsp; throw error;</div><div>}</div><div><br></div><div>Th=
e first version is more compact and in my opinion easier to read. Parsing c=
ode almost always requires very careful error handling. Unless you know a p=
riori at compile time that the string really is an integer string and you j=
ust need a conversion (which is an incredibly rare case).&nbsp;</div><div><=
br></div><div>The boolean return value also emphasizes that you must be dil=
igent about checking user input and I'd argue it encourages this behavior f=
or novice programmers. It's very easy for a beginner to just write x =3D ato=
i(s); and move on, only to later track down some bug that shows up somewh=
ere else because they didn't check the result.</div><div><br></div><div>The=
 only thing you're buying with the value itself being returned is being abl=
e to use strto() in an expression.</div><div><br></div><div>bool rc;</div><=
div>int y =3D (x + (5 * strto(s, rc, 10))) / 25;</div><div>if(!rc) {<br>&nb=
sp; throw error;</div><div>}</div><div><br></div><div>I don't like the abov=
e idiom for several reasons:</div><div><ul><li><span style=3D"line-height: =
normal;">The error checking comes after the whole expression, forcing the p=
erson reading the code to expend more mental energy to link the error check=
 with the strto() call buried in all of that math. Multiple strto() calls i=
n the same expression with multiple boolean variables is even more fun. Err=
or checking should occur right after the thing being checked or even better=
 within a single expression. Also it may be more efficient to avoid the&nbs=
p;unnecessary&nbsp;math in the case of the error (although the compiler is =
likely to reorder for you anyway).</span></li><li><span style=3D"line-heigh=
t: normal;">This kind of expression is actually somewhat dangerous for floa=
ting point. The result coming from strto&lt;float&gt;() &nbsp;parsing error=
 could cause a signalling nan to be generated within the following expressi=
on and then throw an exception the programmer was not expecting. More fun d=
ebugging that one. You can avoid this but again that means the programmer h=
as to think before using strto(). strtol() and friends already require too =
much thinking as I've already explained.</span></li></ul></div><div><br></d=
iv><div><br></div><blockquote class=3D"gmail_quote" style=3D"margin: 0;marg=
in-left: 0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">
<br>That's what we do for {QString,QByteArray,QLocale}::<wbr>to{Int,Double,=
etc.}. And=20
<br>one suggestion I received a few weeks ago was to add the overload that =
returns=20
<br>the end pointer and does not fail if there's more stuff after it.
<br></blockquote><div><br></div><div>That's pretty much the same as my seco=
nd overload. I think both use cases are very valid. In many situations you =
have a long string and you want to parse the int at the beginning and then =
continue on with whatever is supposed to be after it. Finding the position =
after the int of course requires parsing out the int.</div><div>&nbsp;</div=
></div>


------=_Part_1392_31554110.1390774253421--

.


Author: Bengt Gustafsson <bengt.gustafsson@beamways.com>
Date: Sun, 26 Jan 2014 14:13:47 -0800 (PST)
Raw View
------=_Part_1954_31332311.1390774427129
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

What we really need for the input string is "something that begin() and
end() functions work for". This avoids the tedious sending of two
iterators, while covering vectors, string literals (I hope), strings and
string_views without overloading.

In the fairly high percentage of cases where we want to skip leading
whitespace, we need an easy way to do that. One option may be:

auto skipspace(const RNG& src)->string_view<decltype(*begin(src))> { ... }

I think that string_view<T> will work for at least MOST of the stuff that
begin() and end() work for, so it is a logical return type choice,
although a more generic range<T> could also be used, or even an
_implementation_dependent_ type, although that puts us in a place where we
need to define what this unspecified type can be used for.

Anyhow, this helper allows us to write

strto(dest, skipspace(src));

Which I think is a decent syntax (and I do think that the destination
should be the first parameter).

However, this still does not allow trailing spaces, so maybe skipspace
should be like the classic strip instead, i.e. be able to handle both ends
of the string, either by flags or like this sketch:

strip_front(src)
strip_back(src)
strip (src) { return strip_back(strip_front(src)); }

With the begin(src)/end(src) usage of the source string, the type of the
value indicating the final position should be decltype(begin(src))&, I
guess. This is not a range, so the reassembly of the range is up to the
caller, which is not optimal. The alternative is to use the same template
parameter for s and tail. The problem with this is that we need to be able
to set its contents, but the only thing we know of this type is that it has
begin() and end() defined for it...

I have no particularly good solution to this problem. I had a similar
proposal to this one in the works, but it also mandates a string_view as
the source. The main difference was that there were two different function
names, and one of them modified the string_view in situ. I think that while
this does not solve the problem of updating a generic range, it gives
nicer code at the call site for the parsing case (use only one
string_view as the "cursor" of parsing):

// requires src to contain only the string representation of T
bool from_string(T& dest, const string_view<C>& src);
// updates src to reflect how many chars were consumed converting to T
bool parse_string(T& dest, string_view<C>& src);
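
To make the two-function idea concrete, here is a minimal sketch, not a
proposed implementation. It assumes C++17's std::string_view (a non-template
alias for char, unlike the string_view<C> written above) and delegates the
digit parsing to std::from_chars, which only handles the integer case here
and accepts neither leading whitespace nor a leading '+'. The bodies are my
own guess at the intended semantics.

```cpp
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical sketch: parses a T from the front of src. On success, stores
// the value in dest and advances src past the consumed characters. On
// failure, both dest and src are left unchanged.
template <typename T>
bool parse_string(T& dest, std::string_view& src) {
    T tmp{};
    const char* first = src.data();
    const char* last = first + src.size();
    const auto res = std::from_chars(first, last, tmp);
    if (res.ec != std::errc{}) {
        return false;  // not a number, or out of range for T
    }
    dest = tmp;
    src.remove_prefix(static_cast<std::size_t>(res.ptr - first));
    return true;
}

// Hypothetical sketch: succeeds only if all of src is the representation
// of a T. src is taken by value, so the caller's view is never modified.
template <typename T>
bool from_string(T& dest, std::string_view src) {
    T tmp{};
    if (!parse_string(tmp, src) || !src.empty()) {
        return false;  // parse failed, or trailing characters remain
    }
    dest = tmp;
    return true;
}
```

With this shape, parsing "42,7" with parse_string leaves the cursor at ",7",
while from_string rejects the same input because of the trailing characters.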




------=_Part_1954_31332311.1390774427129--

.


Author: Matt Fioravante <fmatthew5876@gmail.com>
Date: Sun, 26 Jan 2014 14:25:39 -0800 (PST)
Raw View
------=_Part_490_25591144.1390775139325
Content-Type: text/plain; charset=UTF-8



On Sunday, January 26, 2014 5:13:47 PM UTC-5, Bengt Gustafsson wrote:
>
> What we really need for the input string is "something that begin() and
> end() functions work for". This avoids the tedious sending of two
> iterators, while covering vectors, string literals (I hope), strings and
> string_views without overloading.
>
> In the fairly high percentage of cases when we want to do skipspace we
> need to have an easy way to do that. One may be:
>
> auto skipspace(const RNG& src)->string_view<decltype(*begin(src))> { ... }
>

I think a general facility for stripping spaces, quotes, etc. would be
nice. But that's a separate proposal.

template <typename F>
string_view lstrip(string_view s, F conditional);
string_view lstrip(string_view s, char c);
//along with rstrip() and strip() for both sides.

Now you can do strto(lstrip(s, std::isspace), val, 10); if you want the
whitespace behavior.
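
For illustration, a rough sketch of how the two lstrip() overloads could look,
assuming C++17's std::string_view. The is_space() helper is my own addition:
passing std::isspace itself as the predicate is fragile, since it becomes an
overload set once <locale> is included and it has undefined behavior for
negative char values, so a small wrapper (or a lambda) is the safer spelling.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Sketch: drops leading characters for which pred returns true.
template <typename F>
constexpr std::string_view lstrip(std::string_view s, F pred) {
    std::size_t i = 0;
    while (i < s.size() && pred(s[i])) {
        ++i;
    }
    s.remove_prefix(i);
    return s;
}

// Sketch: drops leading occurrences of the single character c.
constexpr std::string_view lstrip(std::string_view s, char c) {
    return lstrip(s, [c](char x) { return x == c; });
}

// Hypothetical helper: a char-safe wrapper around std::isspace.
inline bool is_space(char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}
```

The call from the text would then be spelled strto(lstrip(s, is_space), val, 10).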


------=_Part_490_25591144.1390775139325--

.


Author: Bengt Gustafsson <bengt.gustafsson@beamways.com>
Date: Sun, 26 Jan 2014 14:26:09 -0800 (PST)
Raw View
------=_Part_1256_13452680.1390775169654
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

I have thought about proposing an error_return class which throws in its
destructor unless its operator bool() has been executed. It has an ignore()
method for the case where you don't care about errors. This class could be
augmented with an error code, error string or something else that indicates
what the cause of the error was. It could also have a rethrow() method you
can call if you explicitly want a failed conversion to throw. Use cases:

from_string(x, "123").ignore();  // Ignore error

if (!from_string(...))
    handle error;

from_string(...).rethrow();    // Ask for an exception if conversion fails.
                               // (Bikeshed warning here!)

from_string(...);       // Throws because the result was never checked, even
                        // if the conversion worked. Or preferably (if possible
                        // to implement) generates a static_assert.

This would be a general class useful in these kinds of cases throughout the
standard (and proprietary) C++ libraries. It could be templated on the
error information's type, I guess.
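
A minimal sketch of what such a class might look like, assuming C++17. The
exception types, the move constructor and the from_string_checked() helper
are my own illustrative choices, not part of any proposal. The destructor has
to be declared noexcept(false) to be allowed to throw, and it stays silent
while another exception is already unwinding the stack, since throwing then
would call std::terminate.

```cpp
#include <charconv>
#include <exception>
#include <stdexcept>
#include <string_view>
#include <system_error>
#include <utility>

// Sketch: a result that must be inspected before it is destroyed.
class error_return {
public:
    explicit error_return(bool ok) noexcept : ok_(ok) {}

    // Moving transfers the obligation to check to the new object.
    error_return(error_return&& other) noexcept
        : ok_(other.ok_), checked_(std::exchange(other.checked_, true)) {}
    error_return(const error_return&) = delete;
    error_return& operator=(const error_return&) = delete;

    ~error_return() noexcept(false) {
        if (!checked_ && std::uncaught_exceptions() == 0) {
            throw std::logic_error("error_return was never checked");
        }
    }

    // Testing the result counts as checking it.
    explicit operator bool() noexcept {
        checked_ = true;
        return ok_;
    }

    // The caller explicitly does not care about failure.
    void ignore() noexcept { checked_ = true; }

    // The caller wants a failure turned into an exception.
    void rethrow() {
        checked_ = true;
        if (!ok_) {
            throw std::runtime_error("conversion failed");
        }
    }

private:
    bool ok_;
    bool checked_ = false;
};

// Hypothetical helper: whole-string int conversion returning error_return.
inline error_return from_string_checked(int& dest, std::string_view src) {
    int tmp = 0;
    const char* last = src.data() + src.size();
    const auto res = std::from_chars(src.data(), last, tmp);
    const bool ok = res.ec == std::errc{} && res.ptr == last;
    if (ok) {
        dest = tmp;
    }
    return error_return(ok);
}
```

The static_assert variant is probably not implementable, because whether the
result gets checked is a run-time property. Marking the class [[nodiscard]]
would at least get a compiler warning when the return value is dropped.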



--

---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.

------=_Part_1256_13452680.1390775169654--

.


Author: Matt Fioravante <fmatthew5876@gmail.com>
Date: Sun, 26 Jan 2014 14:33:47 -0800 (PST)
Raw View
------=_Part_2737_30284286.1390775627840
Content-Type: text/plain; charset=UTF-8

And just to further emphasize the question of returning pass/fail vs
returning the value.

My general philosophy with parsing is to emphasize error handling first and
then the actual results second. The success or failure of the parse should
be thrown right in your face, forcing you to deal with it. This helps
remind us to write more correct code. I'd be happy to know if people agree
or not.

------=_Part_2737_30284286.1390775627840--

.


Author: Roland Bock <rbock@eudoxos.de>
Date: Sun, 26 Jan 2014 23:41:55 +0100
Raw View
This is a multi-part message in MIME format.
--------------060906060902020204020509
Content-Type: text/plain; charset=UTF-8

On 2014-01-26 23:33, Matt Fioravante wrote:
> And just to further emphasize the question of returning pass/fail vs
> returning the value.
>
> My general philosophy with parsing is to emphasize error handling
> first and then the actual results second. The success or failure of
> the parse should be thrown right in your face, forcing you to deal
> with it. This helps remind us to write more correct code. I'd be happy
> to know if people agree or not.
> --
It really depends on the use case.

 1. If 0 (or some other value) is an acceptable fall-back result, I
    don't want to litter my code with error handling.
 2. If a parse error has to be recorded or has to provoke some other
    action (e.g. ask the user to re-enter data), then I want to be
    forced to deal with the errors.


A good interface supports at least these two options, I'd say.
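One way a single interface might serve both cases is to return an optional value. The sketch below assumes C++17 (std::optional, std::from_chars); parse_int, port_or_default and read_port are names invented for this illustration and are not part of any proposal in the thread.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: an empty optional signals any parse failure.
inline std::optional<int> parse_int(std::string_view s) {
    int value = 0;
    const char* last = s.data() + s.size();
    auto [ptr, ec] = std::from_chars(s.data(), last, value);
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    return value;
}

// Case 1: a fall-back value is acceptable, no error handling at the call site.
inline int port_or_default(std::string_view s) {
    return parse_int(s).value_or(8080);
}

// Case 2: the caller has to inspect the result before it can use the value.
inline bool read_port(std::string_view s, int& port) {
    if (auto v = parse_int(s)) {
        port = *v;
        return true;
    }
    return false;  // e.g. ask the user to re-enter the data
}
```

The fall-back stays a decision of the caller, so "0" parses as a real zero and is not confused with a failure.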

Just my 2t...

Cheers,

Roland


--------------060906060902020204020509--

.


Author: Thiago Macieira <thiago@macieira.org>
Date: Sun, 26 Jan 2014 14:45:54 -0800
Raw View
--nextPart6254589.Q9jPc4qvuF
Content-Transfer-Encoding: 7Bit
Content-Type: text/plain; charset="us-ascii"

On Sunday, 26 January 2014 14:10:53, Matt Fioravante wrote:
> Exceptions are possible but rather heavy weight. Constructing an exception
> usually means also constructing a string error message. Not only do you
> have to pay for the allocation of this string

Exceptions are heavy weight, indeed, but allocating memory for the string
message is usually a big no-no. Rule of thumb for exceptions: don't allocate
memory in order to throw (or how would you report an OOM situation?).
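As an illustration of that rule of thumb, an exception type can carry a pointer to a string literal instead of an owned string. The class parse_error and the function digit_value below are hypothetical names; how the runtime stores the exception object itself is still up to the implementation.

```cpp
#include <exception>

// Hypothetical exception type: what() returns a pointer to a string
// literal, so building the message needs no heap allocation.
class parse_error : public std::exception {
public:
    explicit parse_error(const char* literal) noexcept : msg_(literal) {}
    const char* what() const noexcept override { return msg_; }

private:
    const char* msg_;  // must point at storage with static lifetime
};

// Converts one decimal digit, throwing parse_error on anything else.
inline int digit_value(char c) {
    if (c < '0' || c > '9') throw parse_error("not a decimal digit");
    return c - '0';
}
```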

> > >    - Efficient and inline
> > >    - constexpr
> >
> > Efficient, definitely. Inline and constexpr? Forget it, it can't be done.
> > Have you ever looked at the source of a string-to-double function? They're
> > huge! This might be left as a suggestion to compilers to implement this as
> > an intrinsic.
>
> It might be nice to have compile time string -> double conversions, but I
> agree for floating point it's a huge complicated problem.
> http://www.exploringbinary.com/how-strtod-works-and-sometimes-doesnt/
>
> For int conversions, inline/constexpr might be doable (but not if we're
> using errno).

Only after we get a way to have constexpr code that is only used when
expanding constexpr arguments at compile time. The code for executing a
constexpr integer conversion will probably be larger than the optimised
non-constexpr version. Since the vast majority of the uses of this function
will be to parse strings not known at compile time, I much prefer that they
be efficient for runtime operation.

> > As others have said, using errno is too C, but then again this kind of
> > function should be done in conjunction with the C people. Any improvements
> > we need, they probably need too.
>
> With overloading, templates, iterators, string_view, etc., it's not so C
> compatible. Do we really care so much anyway? I don't like the idea of
> handicapping C++ interfaces in the name of C compatibility.

C11 has generics, so that solves the problem of the templates and the
iterators.

But it might be that this C++ function gets implemented with calls to strtol,
strtoul, strtoll, strtoull, strtod, etc. anyway. So maybe the C guys already
have what they need, except for the C Generic version.

And the char16_t, char32_t and wchar_t versions.

> > Do you know what this means? Parsing char16_t, char32_t and wchar_t too.
>
> Yes, but that's not so difficult.

That depends on whether locale parsing is performed. We need functions that
don't depend on the locale, in which case a conversion from char16_t and
char32_t to the execution charset can be done quite quickly (again something
for which the constexpr version would be slower than the runtime optimised
version). After the conversion is done, the char variant can be called.

That's how QString::to{Int,Double} is implemented: first, fast-convert from
UTF-16 to Latin 1, then call the internal strtoll / strtod.
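The narrow-then-parse approach described above could look roughly like this. It is a sketch, not Qt's code: the name u16_to_ll is invented, std::from_chars (C++17) stands in for the internal strtoll, the fixed 32-character buffer is an arbitrary limit, and code units above 0x7F are rejected outright because they cannot be an ASCII digit or sign.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: narrows UTF-16 code units to char, then reuses
// the char-based parser. Inputs longer than the buffer are rejected.
inline std::optional<long long> u16_to_ll(std::u16string_view s) {
    char buf[32];  // room for any 64-bit decimal number plus sign
    if (s.empty() || s.size() > sizeof buf) return std::nullopt;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] > 0x7F) return std::nullopt;  // not an ASCII digit or sign
        buf[i] = static_cast<char>(s[i]);
    }
    long long value = 0;
    const char* last = buf + s.size();
    auto [ptr, ec] = std::from_chars(buf, last, value);
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    return value;
}
```

The conversion loop is a plain copy with a range check, which is why it adds little on top of the char variant.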

> > That's the opposite of what most people want. Most people want to get the
> > parsed number, not whether it succeeded or failed. Maybe invert the logic?
>
> I think that's a voting/bikeshed question (or use std::optional).

Agreed, which is why I'm not going to continue this part of the discussion :-)

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358

--nextPart6254589.Q9jPc4qvuF--


.


Author: Thiago Macieira <thiago@macieira.org>
Date: Sun, 26 Jan 2014 14:51:04 -0800
Raw View
--nextPart1441441.JyaEhScNaf
Content-Transfer-Encoding: 7Bit
Content-Type: text/plain; charset="us-ascii"

On Sunday, 26 January 2014 14:33:47, Matt Fioravante wrote:
> My general philosophy with parsing is to emphasize error handling first and
> then the actual results second. The success or failure of the parse should
> be thrown right in your face, forcing you to deal with it. This helps
> remind us to write more correct code. I'd be happy to know if people agree
> or not.

You're asking for the bikeshed discussion.

I don't agree. Sometimes you already know that the data is well-formed and you
don't need the error status. Therefore, emphasising the actual data is more
important.

Given its use of exceptions, the philosophy I described in the paragraph
above seems to apply to the Standard Library.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358

--nextPart1441441.JyaEhScNaf--


.


Author: Miro Knejp <miro@knejp.de>
Date: Mon, 27 Jan 2014 00:32:43 +0100
Raw View
This is a multi-part message in MIME format.
--------------050904050107040901050709
Content-Type: text/plain; charset=UTF-8; format=flowed


> Seems rather complicated to stuff all of that into the return value
> no? Returning a std::optional<int> would be ok because you can
> directly check the return value with operator bool(). It still doesn't
> provide information on why the failure occurred though.
It really depends on what you want to do. If you use iterators in the
first place, you very likely have a scenario where you want to know the
end of your number for further processing. If you consider iterators
noise in your current use case, use the non-iterator overloads like
string_view.

Just for exposition:
optional<int> x;
tie(x, from) = string_to<int>(from, to);
if(x) ...

- or -

auto x = string_to<int>(from, to).first;
if(x) ...

If you only care whether there was a valid number or not then test x and
continue parsing/throwing/whatever. It may look weird but it's in line
with the spirit of the standard library. Afaik there is not a single std
function that uses iterators as out-arguments. They are always taken and
returned by value. Deviating from that would need good reasons. Multiple
return values would be such a nice thing right now (without tupling
everything)...
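A possible shape for the string_to used in the exposition above, written as a sketch under assumptions: const char* stands in for a general iterator type, only integer types are handled, and std::from_chars (C++17) does the parsing. None of this is taken from an actual proposal text.

```cpp
#include <charconv>
#include <optional>
#include <system_error>
#include <tuple>    // std::tie at the call site
#include <utility>

// Iterators are taken and returned by value. The returned pointer is the
// first unparsed character, or `from` if nothing could be parsed.
template <typename T>
std::pair<std::optional<T>, const char*> string_to(const char* from,
                                                   const char* to) {
    T value{};
    auto [ptr, ec] = std::from_chars(from, to, value);
    if (ec != std::errc{}) return {std::nullopt, from};
    return {value, ptr};
}
```

With this, `std::tie(x, from) = string_to<int>(from, to);` updates both the optional result and the cursor, and `.first` alone is enough when only validity matters.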
>
>
>     > //Attempts to parse s as an integer. The valid integer string
>     > //consists of the following:
>     > //* '+' or '-' sign as the first character (- only acceptable
>     > //  for signed integral types)
>
>     But no U+2212?
>
>
> We could consider unicode as well. That's a good question.
>
>
>     > //* prefix (0) indicating octal base (applies only when base is
>     > //  0 or 8)
>     > //* prefix (0x or 0X) indicating hexadecimal base (applies only
>     > //  when base is 16 or 0).
>     > //* All of the rest of the characters MUST be digits.
>
>     Where, by "digits", we understand the regular ASCII digits 0 to 9
>     and the letters that compose digits on this base, both in uppercase
>     and lowercase.
>
>
> Yes that's right.
>
> Maybe we should add an extra boolean argument (defaulted to true) that
> can be used to disable the hex and octal prefixes. Sometimes you really
> want to just parse a hex string without the 0x prefix. Adding an extra
> false to the parameter list is nicer than doing this check for
> yourself. It's similar to disabling the leading whitespace check of
> strtol().
Bools aren't very descriptive. I'd prefer a more explicit syntax for
overloads. For example:

to_string<int>(s); // Use current locale
// Tag type: ASCII only, fast path with no facets, no virtuals
to_string<int>(s, no_locale);
to_string<int>(s, myLocale);

Makes sense? It's something I'm playing around with currently.
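The tag-type overloads could be sketched as below. Everything here is an assumption made for illustration: the function is named string_to (following the earlier exposition) and returns std::optional, no_locale_t is a made-up tag type, the fast path uses std::from_chars, and the locale-aware path goes through a std::istringstream imbued with the given locale.

```cpp
#include <charconv>
#include <locale>
#include <optional>
#include <sstream>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical tag type selecting the locale-independent fast path.
struct no_locale_t { explicit no_locale_t() = default; };
inline constexpr no_locale_t no_locale{};

// Fast path: ASCII only, no facets, no virtual calls.
template <typename T>
std::optional<T> string_to(std::string_view s, no_locale_t) {
    T value{};
    const char* last = s.data() + s.size();
    auto [ptr, ec] = std::from_chars(s.data(), last, value);
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    return value;
}

// Locale-aware path: parsing goes through the num_get facet of loc.
template <typename T>
std::optional<T> string_to(std::string_view s, const std::locale& loc) {
    std::istringstream in{std::string(s)};
    in.imbue(loc);
    T value{};
    if (!(in >> std::noskipws >> value)) return std::nullopt;
    if (in.peek() != std::char_traits<char>::eof()) return std::nullopt;
    return value;
}

// Default: use the current global locale.
template <typename T>
std::optional<T> string_to(std::string_view s) {
    return string_to<T>(s, std::locale());
}
```

Compared with a bool parameter, the tag makes the call site say what it selects, and the two paths can have entirely different implementations.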


--------------050904050107040901050709--


.


Author: Matt Fioravante <fmatthew5876@gmail.com>
Date: Sun, 26 Jan 2014 15:33:55 -0800 (PST)
Raw View
------=_Part_573_15786231.1390779235080
Content-Type: text/plain; charset=UTF-8


On Sunday, January 26, 2014 5:41:55 PM UTC-5, Roland Bock wrote:
>
> On 2014-01-26 23:33, Matt Fioravante wrote:
>
> An just to further emphasize the question of returning pass/fail vs
> returning the value.
>
> My general philosophy with parsing is to emphasize error handling first
> and then the actual results second. The success or failure of the parse
> should be thrown right in your face, forcing you to deal with it. This
> helps remind us to write more correct code. I'd be happy to know if people
> agree or not.
> --
>
> It really depends on the use case.
>
>
>    1. If 0 (or some other value) is an acceptable fall-back result, I
>    don't want to litter my code with error handling.
>

Your fallback value may be 0, that guy's fallback might be 1, the other
guy's is INT_MIN, and maybe mine is -1. One interface cannot account for all
of these possibilities, and assuming one of them is the same sin as parsing
white space before the string. It's easy enough to write a wrapper to suit
your specific requirements.

constexpr int kFallback = 0;

int mystrto(string_view s, int base) {
  int x;
  return strto(s, x, base) ? x : kFallback;
}

If strto() is inline, the compiler can remove the boolean return value and
the conditional in mystrto, having the resulting code return kFallback on
error, resulting in no runtime overhead for your wrapper.


>    2. If a parse error has to be recorded or has to provoke some other
>    action (e.g. ask the user to re-enter data), then I want to be forced to
>    deal with the errors.
>
>
> A good interface supports at least these two options, I'd say.
>

We have two differing use cases here. Can one interface cleanly support both?


On Sunday, January 26, 2014 5:45:54 PM UTC-5, Thiago Macieira wrote:
>
> On domingo, 26 de janeiro de 2014 14:10:53, Matt Fioravante wrote:
> > Exceptions are possible but rather heavy weight. Constructing an exception
> > usually means also constructing a string error message. Not only do you
> > have to pay for the allocation of this string
>
> Exceptions are heavy weight, indeed, but allocating memory for the string
> message is usually a big no-no. Rule of thumb for exceptions: don't allocate
> memory in order to throw (or how would you report an OOM situation?).
>
> > > >    - Efficient and inline
> > > >    - constexpr
> > >
> > > Efficient, definitely. Inline and constexpr? Forget it, it can't be done.
> > > Have you ever looked at the source of a string-to-double function? They're
> > > huge! This might be left as a suggestion to compilers to implement this as
> > > an intrinsic.
> >
> > It might be nice to have compile time string -> double conversions, but I
> > agree for floating point it's a huge complicated problem.
> > http://www.exploringbinary.com/how-strtod-works-and-sometimes-doesnt/
> >
> > For int conversions, inline/constexpr might be doable (but not if we're
> > using errno).
>
> Only after we get a way to have constexpr code that is only used when
> expanding constexpr arguments at compile time. The code for executing a
> constexpr integer conversion will probably be larger than the optimised
> non-constexpr version. Since the vast majority of the uses of this function
> will be to parse strings not known at compile time, I much prefer that they
> be efficient for runtime operation.
>

C++14 constexpr is pretty liberal, but I agree that runtime performance is
absolutely paramount. constexpr can always be tacked on later if it's deemed
feasible and useful to someone. Let's forget the constexpr question for now.


> > > As others have said, using errno is too C, but then again this kind of
> > > function should be done in conjunction with the C people. Any improvements
> > > we need, they probably need too.
> >
> > With overloading, templates, iterators, string_view, etc., it's not so C
> > compatible. Do we really care so much anyway? I don't like the idea of
> > handicapping C++ interfaces in the name of C compatibility.
>
> C11 has generics, so that solves the problem of the templates and the
> iterators.
>
> But it might be that this C++ function gets implemented with calls to strtol,
> strtoul, strtoll, strtoull, strtod, etc. anyway. So maybe the C guys already
> have what they need, except for the C Generic version.
>

I say we focus on making the best C++ interface possible. If the C
community steps up and shows interest, we can then consider changing things
for compatibility. Or they can make their own that works best for C. Like
it or not the 2 languages are diverging.



>
> And the char16_t, char32_t and wchar versions.
>
> > > Do you know what this means? Parsing char16_t, char32_t and wchar_t too.
> >
> > Yes, but that's not so difficult.
>
> That depends on whether locale parsing is performed. We need functions that
> don't depend on the locale, in which case a conversion from char16_t and
> char32_t to the execution charset can be done quite quickly (again something
> for which the constexpr version would be slower than the runtime optimised
> version). After the conversion is done, the char variant can be called.
>
> That's how QString::to{Int,Double} is implemented: first, fast-convert from
> UTF-16 to Latin 1, then call the internal strtoll / strtod.
>

constexpr aside, I believe these are all quality-of-implementation (QoI)
issues.


>
> > > That's the opposite of what most people want. Most people want to get the
> > > parsed number, not whether it succeeded or failed. Maybe invert the logic?
> >
> > I think that's a voting/bikeshed question (or use std::optional).
>
> Agreed, which is why I'm not going to continue this part of the discussion
> :-)
>
> --
> Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
>    Software Architect - Intel Open Source Technology Center
>       PGP/GPG: 0x6EF45358; fingerprint:
>       E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358


On Sunday, January 26, 2014 5:51:04 PM UTC-5, Thiago Macieira wrote:
>
> On domingo, 26 de janeiro de 2014 14:33:47, Matt Fioravante wrote:
> > My general philosophy with parsing is to emphasize error handling first and
> > then the actual results second. The success or failure of the parse should
> > be thrown right in your face, forcing you to deal with it. This helps
> > remind us to write more correct code. I'd be happy to know if people agree
> > or not.
>
> You're asking for the bikeshed discussion.
>

Haha ok, bring it on :)


>
> I don't agree. Sometimes you already know that the data is well-formed and
> you don't need the error status. Therefore, emphasizing the actual data is
> more important.
>

I don't know about you or others, but the majority of the time when I need
to do these conversions I am getting my string from the command line, a
file, a network socket, or some other kind of user input. All of those
require strict error checking.


Even if you know the string is supposed to be parsed correctly, it doesn't
hurt to throw in an assert() or debug mode error message check in there in
case someone (or you yourself) made a mistake earlier and broke your
invariant.

You're a Qt developer, so I suppose the use case you're mentioning is a GUI
box which already checks the input before passing it down? Self-validating
UI is one very common use case, but not the only one, as I've mentioned
already.

What has come out of this are 2 distinct use cases, error checking emphasis
vs results emphasis. We have 3 options:

   1. Come up with an interface that somehow satisfies both
   2. Make 2 interfaces, one for each situation
   3. Prioritize one over the other

I'm not sure how to do (1), and (2) seems like it could be confusing: two
interfaces that do the exact same thing with slightly different calling
conventions. So that leaves us with (3), and I obviously stand firm in the
safety camp.

Here is another project which seems to agree with my philosophy. It's a
sample library designed to teach Linux kernel developers how to write good
userspace C libraries.
https://git.kernel.org/cgit/linux/kernel/git/kay/libabc.git/

Functions should return int and negative errors instead of NULL
  - Return NULL in malloc() is fine, return NULL in fopen() is not!
  - Pass allocated objects as parameter (yes, ctx_t** is OK!)
  - Returning kernel style negative <errno.h> error codes is cool in
    userspace too. Do it!


This is a C API, but the concept still translates to C++. Here they
encourage returning error conditions and stuffing the results into out
parameters. While I might not do this for everything, I certainly would for
error-heavy routines such as parsing.
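
As an illustration of that convention applied to parsing, the sketch below
returns 0 on success or a negative <errno.h> code on failure, and writes the
result through an out parameter. The function name parse_long() and its
exact rules (base 10 only, no leading whitespace, no leading '+') are
assumptions made for this example; it is not taken from libabc or from any
proposal.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: 0 on success, -EINVAL for malformed input,
// -ERANGE for values that do not fit in a long. *out is written only
// on success.
inline int parse_long(const char* s, long* out) {
  if (s == nullptr || out == nullptr) return -EINVAL;
  // strtol() would silently skip leading whitespace, so reject anything
  // that does not start with a digit or a minus sign.
  const bool starts_ok = (*s == '-') || (*s >= '0' && *s <= '9');
  if (!starts_ok) return -EINVAL;
  errno = 0;
  char* end = nullptr;
  const long value = std::strtol(s, &end, 10);
  if (end == s || *end != '\0') return -EINVAL;  // no digits or trailing junk
  if (errno == ERANGE) return -ERANGE;           // overflow or underflow
  *out = value;
  return 0;
}
```

A caller then writes `long v; if (parse_long(text, &v) < 0) { /* handle */ }`,
which puts the error check in front of any use of the value.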



> Given use of exceptions, the philosophy I described in the paragraph seems
> to apply to the Standard Library.
>

Maybe so, but I'd rather come up with the safest, most expressive, easiest
to use, and most efficient interface possible, regardless of past
precedent.


>
> --
> Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
>    Software Architect - Intel Open Source Technology Center
>       PGP/GPG: 0x6EF45358; fingerprint:
>       E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358
>

--

---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.

------=_Part_573_15786231.1390779235080--

.


Author: Matt Fioravante <fmatthew5876@gmail.com>
Date: Sun, 26 Jan 2014 15:42:31 -0800 (PST)
Raw View
------=_Part_2770_23076967.1390779751602
Content-Type: text/plain; charset=UTF-8

Also for the UI validation case, you can reuse the same function to do the
validation, in which case the error handling becomes paramount and must be
efficient (no exceptions). You can even cache the parsed result into your
widget and then reuse it later instead of parsing the input string twice,
if that makes sense to do.
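
A rough sketch of that idea follows. IntField is a made-up class for this
example (it is not a Qt type), and it uses std::from_chars from C++17 as a
stand-in for whatever parsing function is eventually chosen. Each edit parses
the text once without exceptions, and the result is cached for later use.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical input field: validates on every edit and caches the value.
class IntField {
 public:
  // Returns true if the new text is a complete, in-range integer.
  bool set_text(std::string_view text) {
    text_ = std::string(text);
    int v = 0;
    const char* first = text_.data();
    const char* last = first + text_.size();
    auto [ptr, ec] = std::from_chars(first, last, v);
    if (ec == std::errc() && ptr == last) {
      cached_ = v;      // remember the parsed value
    } else {
      cached_.reset();  // invalid input: drop any stale value
    }
    return cached_.has_value();
  }

  // Later consumers reuse the cached result instead of parsing again.
  std::optional<int> value() const { return cached_; }

 private:
  std::string text_;
  std::optional<int> cached_;
};
```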

------=_Part_2770_23076967.1390779751602--

.


Author: Matt Fioravante <fmatthew5876@gmail.com>
Date: Sun, 26 Jan 2014 15:46:29 -0800 (PST)
Raw View
------=_Part_2_33366436.1390779989819
Content-Type: text/plain; charset=UTF-8



On Sunday, January 26, 2014 6:32:43 PM UTC-5, Miro Knejp wrote:
>
>  Bools aren't very descriptive. I'd prefer a more explicit syntax for
> overloads. For example:
>


>
> to_string<int>(s); // Use current locale
> to_string<int>(s, no_locale); // Tag type: ASCII only, fast path with no
> facets, no virtuals
> to_string<int>(s, myLocale);
>
> Makes sense? It's something I'm playing around with currently.
>

I've had ideas in this direction as well, but that's a separate proposal. In
one instance I benchmarked a hand-written
inline bool ascii::isspace(char c), which proved to be almost 30% faster
than std::isspace(). Those indirect function calls are horribly expensive.
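
For reference, a locale-independent check of this kind can be as small as
the sketch below. This is one plausible way to write it, not the exact code
that was benchmarked: it treats the six whitespace characters of the "C"
locale as whitespace and relies on their ASCII codes being contiguous from
'\t' (9) to '\r' (13).

```cpp
#include <cassert>

namespace ascii {

// True for ' ', '\t', '\n', '\v', '\f' and '\r'; no locale lookup and no
// indirect call, so it can be inlined and used in constant expressions.
constexpr bool isspace(char c) noexcept {
  return c == ' ' || (c >= '\t' && c <= '\r');
}

}  // namespace ascii

static_assert(ascii::isspace(' '), "space is whitespace");
static_assert(!ascii::isspace('0'), "digits are not whitespace");
```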

------=_Part_2_33366436.1390779989819--

.


Author: Roland Bock <rbock@eudoxos.de>
Date: Mon, 27 Jan 2014 08:50:29 +0100
Raw View
This is a multi-part message in MIME format.
--------------010306070009000908040301
Content-Type: text/plain; charset=UTF-8

On 2014-01-27 00:33, Matt Fioravante wrote:
>
>     A good interface supports at least these two options, I'd say.
>
>
> We have 2 differing uses cases here. Can one interface cleanly support
> both?
Overloads.
One with, one without fall-back value.

Or additional methods like parse_xy_with_default()...
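
One way the overload idea could look is sketched here. The name parse_int()
and the use of std::optional and std::from_chars (both C++17) are
assumptions for illustration only; nothing in this thread fixes those
choices. The one-argument form makes failure visible through the empty
optional, and the two-argument form returns the caller's fallback.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Without a fallback: an empty optional signals a parse error.
inline std::optional<int> parse_int(std::string_view s) {
  int v = 0;
  const char* last = s.data() + s.size();
  auto [ptr, ec] = std::from_chars(s.data(), last, v);
  if (ec != std::errc() || ptr != last) return std::nullopt;
  return v;
}

// With a fallback: never fails, returns `fallback` on any parse error.
inline int parse_int(std::string_view s, int fallback) {
  return parse_int(s).value_or(fallback);
}
```

Both use cases then share one parsing routine, and the fallback overload is
a one-line convenience on top of the checked one.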

>
>
>
>     I don't agree. Sometimes you already know that the data is
>     well-formed and you don't need the error status. Therefore,
>     emphasizing the actual data is more important.
>
>
> I don't know about you or others, but the majority of the time when I
> need to do these conversions I am getting my string from the command
> line, a file, a network socket, and/or another user input of some kind
> etc.. All of those require strict error checking.
RPC input, database input might have other checks, e.g. checksums for
the whole message. In that case it makes no sense to check each conversion.
And you might /expect/ broken input and have a fall-back.
>
> Even if you know the string is supposed to be parsed correctly, it
> doesn't hurt to throw in an assert() or debug mode error message check
> in there in case someone (or you yourself) made a mistake earlier and
> broke your invariant.
Yeah, that's fine. You could add an is_parseable(...) or so to be used in
assert().
>
> What has come out of this are 2 distinct use cases, error checking
> emphasis vs results emphasis. We have 3 options:
>
>  1. Come up with an interface that somehow satisfies both
>  2. Make 2 interfaces, one for each situation
>  3. Prioritize one over the other
>
> I'm not sure how to do (1) and (2) seems like it could be confusing to
> have 2 interfaces that do the exact same thing with slightly different
> calling conventions. So that leaves us with (3), and I obviously stand
> firm in the safety camp.
1 and 2 are basically the same. I'd go that way...
>
>     Given use of exceptions, the philosophy I described in the
>     paragraph seems to apply to the Standard Library.
>
>
> Maybe so, but I'd rather come up with the safest, most expressive,
> most easy to use, and most efficient interface possible. Regardless of
> past precedents.
>
Well, you have received quite a lot of feedback, try to come up with
that :-)



--------------010306070009000908040301--

.


Author: Bjorn Reese <breese@mail1.stofanet.dk>
Date: Mon, 27 Jan 2014 10:23:45 +0100
Raw View
On 01/26/2014 05:25 PM, Matt Fioravante wrote:
> string to T (int, float, etc..) conversions seem like to rather easy
> task (aside from floating point round trip issues), and yet for the life
> of C and C++ the standard library has consistently failed to provide a
> decent interface.

You need to specify more clearly what T is. For instance, you mention
integral types but does that include bool or char? Does it include
arbitrary classes? (as handled by the proposal linked below)

What about char_traits?

What about locale?

> Lets review:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1973.html


.


Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Mon, 27 Jan 2014 12:27:11 -0500
Raw View
On 2014-01-26 11:25, Matt Fioravante wrote:
> //* prefix (0x or 0X) indicating hexadecimal base (applies only when base
> is 16 or 0).

If this also works for float/double I will be a happy man :-). (I have a
project for which I intend to use base-16 to store double values on disk
(textual format) in order to avoid rounding issues.)
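
A minimal sketch of that round trip with today's C library, assuming the
"%a" conversion of snprintf and the hexadecimal input support of strtod
(both C99/C++11). The helper names to_hex_text and from_hex_text are
hypothetical and only serve the illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: write a double as hexadecimal floating-point text.
std::string to_hex_text(double value) {
    char buf[64];
    // "%a" without a precision prints enough hex digits to be exact.
    int n = std::snprintf(buf, sizeof buf, "%a", value);
    return std::string(buf, n > 0 ? static_cast<std::size_t>(n) : 0);
}

// Hypothetical helper: read the text back; strtod accepts the 0x prefix.
double from_hex_text(const std::string& text) {
    return std::strtod(text.c_str(), nullptr);
}
```

Because the text carries the binary value exactly, the value read back
compares equal to the value written, which a decimal format with too few
digits cannot promise.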

> //* All of the rest of the characters MUST be digits.

Do you support locale-specific digits? E.g. will you parse "二千二十五"
in a Japanese locale? What about locale-specific digit grouping and
decimal separators?

> //Returns true if an integral value was successfully parsed and stores the
> value in val,

Why not return a std::optional? I've never been a fan of (non-const)
by-ref parameters; they make it hard to impossible to store the "output"
value in a const local.

> //Sets errno to ERANGE if the string was an integer but would overflow type
> integral.

Floating point can overflow also, or do you return +/- infinity in that
case? (Maybe do both?)
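
For comparison, the existing strtod already "does both": on overflow it
returns plus or minus HUGE_VAL (infinity on IEEE 754 platforms) and sets
errno to ERANGE. A small sketch, where parsed_double and parse_double are
hypothetical names and not part of the proposal:

```cpp
#include <cassert>
#include <cerrno>
#include <cmath>
#include <cstdlib>

// Hypothetical result type: the value plus an explicit overflow flag.
struct parsed_double {
    double value;
    bool overflowed;
};

// Hypothetical wrapper showing what strtod reports for out-of-range input.
parsed_double parse_double(const char* text) {
    errno = 0;
    double v = std::strtod(text, nullptr);
    // ERANGE is also used for underflow, so check the magnitude as well.
    bool overflow = (errno == ERANGE) && (std::fabs(v) == HUGE_VAL);
    return {v, overflow};
}
```

A new interface could keep the same pair of signals while reporting the
error through its return value instead of errno.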

> template <typename integral>
> constexpr bool strto(string_view s, integral& val, int base);

'strto' seems to be missing something (though if you return an optional,
one would have to write 'strto<type>' which would be better). You might
also consider something like 'parse_number'; I wouldn't name it like
'strto' just because there are already functions named similarly.

> First off, all of these return bool which makes it very easy to check
> whether or not parsing failed.
>
> While the interface does not allow this idiom:
>
> int x = atoi(s);

Again, if you instead returned a std::optional, I believe both of these
would be covered. Referring to later discussion, std::optional would
satisfy both the 'check the result' case and the 'I have a default that
I can silently fall back on' (via value_or) case.

The only downside is you can't store the result in a const *and* write
the call to strto inside the if() to check the result. (Though with your
proposal, you can't store the result in a const at all...)
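
A sketch of what the optional-returning variant could look like. It
assumes C++17 (std::optional, std::string_view, std::from_chars), none of
which was standard when this was written; parse_number is the name floated
above and is hypothetical, as is the choice to reject trailing characters:

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical function; integral types only in this sketch.
template <typename Integral>
std::optional<Integral> parse_number(std::string_view s, int base = 10) {
    Integral value{};
    const char* first = s.data();
    const char* last = s.data() + s.size();
    // from_chars skips no whitespace and does not consult the locale.
    auto [ptr, ec] = std::from_chars(first, last, value, base);
    // Reject parse errors, out-of-range values, and trailing characters.
    if (ec != std::errc{} || ptr != last)
        return std::nullopt;
    return value;
}
```

The 'check the result' case then reads `if (auto v = parse_number<int>(s))`,
and the silent fall-back case reads
`const int x = parse_number<int>(s).value_or(0);`.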

--
Matthew


.


Author: Bengt Gustafsson <bengt.gustafsson@beamways.com>
Date: Wed, 29 Jan 2014 08:18:13 -0800 (PST)
Raw View
------=_Part_958_16039594.1391012293637
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Regarding default values produced when the conversion fails this is another
argument for this style:

<error return type> from_string(T& dest, string_view& src)

Now the standard can specify that the function shall not touch dest if
conversion fails. The default value is the previous value of the variable!
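
A simplified sketch of that guarantee, assuming C++17's std::from_chars
for the actual parsing. To keep it short it returns a plain bool rather
than the error type discussed here, handles only int, and takes the
string_view by value; all of those are illustration choices, not part of
the proposal:

```cpp
#include <cassert>
#include <charconv>
#include <string_view>
#include <system_error>

// Simplified stand-in for the proposed from_string (int only, bool result).
bool from_string(int& dest, std::string_view src) {
    int parsed = 0;
    const char* last = src.data() + src.size();
    auto [ptr, ec] = std::from_chars(src.data(), last, parsed);
    if (ec != std::errc{} || ptr != last)
        return false;   // dest keeps its previous value
    dest = parsed;      // written only after a complete, in-range parse
    return true;
}
```

Initializing the variable with the default before the call, for example
`int port = 8080; from_string(port, text);`, then gives the fall-back
behaviour without a separate overload.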

The error return type can be the error_return type I mentioned above, which
actually should contain an exception_ptr:

class error_return {
public:
    error_return() : m_handled(false) {}   // Ok case: No exception
    error_return(exception_ptr ex) : m_handled(false), m_exception(ex) {}
    // Destructors are noexcept by default in C++11, so a throwing
    // destructor has to be declared noexcept(false).
    ~error_return() noexcept(false) {
        if (m_handled)
            return;
        if (m_exception)
            rethrow_exception(m_exception);
        else
            throw runtime_error("return value not checked");
    }

    void ignore() { m_handled = true; }   // Use to explicitly ignore errors
    // Rethrows only if there is an error; rethrowing a null exception_ptr
    // is undefined behaviour.
    void rethrow() { m_handled = true; if (m_exception) rethrow_exception(m_exception); }
    operator bool() { m_handled = true; return !m_exception; }      // for
if-type check. true (good) if m_exception is null.
private:
    bool m_handled;
    exception_ptr m_exception;
};

Usage:

error_return from_string(T& dest, string_view& src) {
    if (... could convert ...)
       return error_return();
    else
       // std::exception has no constructor taking a message
       return make_exception_ptr(runtime_error("Could not convert"));
}


// No check required
from_string(dest, "123").ignore();

// Check using if (operator bool is true when the conversion succeeded)
if (!from_string(dest, "123"))
    ... handle error;

// Throw on error
from_string(dest, "123").rethrow();

// Programming error!
from_string(dest, "123");    // throws on first call even if conversion can
be made!
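
A compilable sketch of the same idea, under several assumptions that go
beyond the post: C++17, an int-only from_string built on std::from_chars,
string_view passed by value so that literals bind, a move constructor so
the obligation to check travels with the object, and a
std::uncaught_exceptions() guard so the destructor never throws during
stack unwinding. The exception types are arbitrary choices:

```cpp
#include <cassert>
#include <charconv>
#include <exception>
#include <stdexcept>
#include <string_view>
#include <system_error>
#include <utility>

class error_return {
public:
    error_return() = default;  // success, but still has to be checked
    explicit error_return(std::exception_ptr ex) : m_exception(std::move(ex)) {}

    // Moving hands the "must be checked" obligation to the new object.
    error_return(error_return&& other) noexcept
        : m_handled(other.m_handled), m_exception(std::move(other.m_exception)) {
        other.m_handled = true;
    }
    error_return(const error_return&) = delete;
    error_return& operator=(const error_return&) = delete;

    // Destructors are implicitly noexcept; throwing needs noexcept(false).
    ~error_return() noexcept(false) {
        if (m_handled) return;
        if (std::uncaught_exceptions() > 0) return;  // already unwinding
        if (m_exception) std::rethrow_exception(m_exception);
        throw std::logic_error("return value not checked");
    }

    void ignore() { m_handled = true; }
    void rethrow() {
        m_handled = true;
        if (m_exception) std::rethrow_exception(m_exception);
    }
    explicit operator bool() { m_handled = true; return !m_exception; }

private:
    bool m_handled = false;
    std::exception_ptr m_exception;
};

// Int-only stand-in for the generic from_string discussed above.
error_return from_string(int& dest, std::string_view src) {
    int parsed = 0;
    const char* last = src.data() + src.size();
    auto [ptr, ec] = std::from_chars(src.data(), last, parsed);
    if (ec != std::errc{} || ptr != last)
        return error_return(std::make_exception_ptr(
            std::runtime_error("Could not convert")));
    dest = parsed;  // dest is only written on success
    return error_return();
}
```

The trade-off is that forgetting the check is detected only at run time,
and that a throwing destructor makes the type awkward to store in
containers or class members.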




------=_Part_958_16039594.1391012293637
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr">Regarding default values produced when the conversion fail=
s this is another argument for this style:<div><br></div><div>&lt;error ret=
urn type&gt; from_string(T&amp; dest, string_view&amp; src)</div><div><br><=
/div><div>Now the standard can specify that the function shall not touch de=
st if conversion fails. The default value is the previous value of the vari=
able!</div><div><br></div><div>The error return type can be the error_retur=
n type I mentioned above, which actually should contain an exception_ptr:</=
div><div><br></div><div class=3D"prettyprint" style=3D"background-color: rg=
b(250, 250, 250); border: 1px solid rgb(187, 187, 187); word-wrap: break-wo=
rd;"><code class=3D"prettyprint"><div class=3D"subprettyprint"><span style=
=3D"color: #008;" class=3D"styled-by-prettify">class</span><span style=3D"c=
olor: #000;" class=3D"styled-by-prettify"> error_return </span><span style=
=3D"color: #660;" class=3D"styled-by-prettify">{</span><span style=3D"color=
: #000;" class=3D"styled-by-prettify"><br></span><span style=3D"color: #008=
;" class=3D"styled-by-prettify">public</span><span style=3D"color: #660;" c=
lass=3D"styled-by-prettify">:</span><span style=3D"color: #000;" class=3D"s=
tyled-by-prettify"><br>&nbsp; &nbsp; error_return</span><span style=3D"colo=
r: #660;" class=3D"styled-by-prettify">()</span><span style=3D"color: #000;=
" class=3D"styled-by-prettify"> m_handled</span><span style=3D"color: #660;=
" class=3D"styled-by-prettify">(</span><span style=3D"color: #008;" class=
=3D"styled-by-prettify">false</span><span style=3D"color: #660;" class=3D"s=
tyled-by-prettify">)</span><span style=3D"color: #000;" class=3D"styled-by-=
prettify"> </span><span style=3D"color: #660;" class=3D"styled-by-prettify"=
>{}</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> &nbsp;=
 </span><span style=3D"color: #800;" class=3D"styled-by-prettify">// Ok cas=
e: No exception</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify"><br>&nbsp; &nbsp; error_return</span><span style=3D"color: #660;" clas=
s=3D"styled-by-prettify">(</span><span style=3D"color: #000;" class=3D"styl=
ed-by-prettify">exception_ptr ex</span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">)</span><span style=3D"color: #000;" class=3D"style=
d-by-prettify"> </span><span style=3D"color: #660;" class=3D"styled-by-pret=
tify">:</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> m_=
handled</span><span style=3D"color: #660;" class=3D"styled-by-prettify">(</=
span><span style=3D"color: #008;" class=3D"styled-by-prettify">false</span>=
<span style=3D"color: #660;" class=3D"styled-by-prettify">),</span><span st=
yle=3D"color: #000;" class=3D"styled-by-prettify"> m_exception</span><span =
style=3D"color: #660;" class=3D"styled-by-prettify">(</span><span style=3D"=
color: #000;" class=3D"styled-by-prettify">ex</span><span style=3D"color: #=
660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000;" cla=
ss=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=3D"sty=
led-by-prettify">{}</span><span style=3D"color: #000;" class=3D"styled-by-p=
rettify"><br>&nbsp; &nbsp; </span><span style=3D"color: #660;" class=3D"sty=
led-by-prettify">~</span><span style=3D"color: #000;" class=3D"styled-by-pr=
ettify">error_return</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">()</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
"> </span><span style=3D"color: #660;" class=3D"styled-by-prettify">{</span=
><span style=3D"color: #000;" class=3D"styled-by-prettify"><br>&nbsp; &nbsp=
; &nbsp; &nbsp; </span><span style=3D"color: #008;" class=3D"styled-by-pret=
tify">if</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> <=
/span><span style=3D"color: #660;" class=3D"styled-by-prettify">(</span><sp=
an style=3D"color: #000;" class=3D"styled-by-prettify">m_handled</span><spa=
n style=3D"color: #660;" class=3D"styled-by-prettify">)</span><span style=
=3D"color: #000;" class=3D"styled-by-prettify"><br>&nbsp; &nbsp; &nbsp; &nb=
sp; &nbsp; &nbsp; </span><span style=3D"color: #008;" class=3D"styled-by-pr=
ettify">return</span><span style=3D"color: #660;" class=3D"styled-by-pretti=
fy">;</span><span style=3D"color: #000;" class=3D"styled-by-prettify"><br>&=
nbsp; &nbsp; &nbsp; &nbsp; </span><span style=3D"color: #008;" class=3D"sty=
led-by-prettify">if</span><span style=3D"color: #000;" class=3D"styled-by-p=
rettify"> </span><span style=3D"color: #660;" class=3D"styled-by-prettify">=
(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">m_excepti=
on</span><span style=3D"color: #660;" class=3D"styled-by-prettify">)</span>=
<span style=3D"color: #000;" class=3D"styled-by-prettify"><br>&nbsp; &nbsp;=
 &nbsp; &nbsp; &nbsp; &nbsp; rethrow_exception</span><span style=3D"color: =
#660;" class=3D"styled-by-prettify">(</span><span style=3D"color: #000;" cl=
ass=3D"styled-by-prettify">mException</span><span style=3D"color: #660;" cl=
ass=3D"styled-by-prettify">);</span><span style=3D"color: #000;" class=3D"s=
tyled-by-prettify"><br>&nbsp; &nbsp; &nbsp; &nbsp; </span><span style=3D"co=
lor: #008;" class=3D"styled-by-prettify">else</span><span style=3D"color: #=
000;" class=3D"styled-by-prettify"><br>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &=
nbsp; </span><span style=3D"color: #008;" class=3D"styled-by-prettify">thro=
w</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> exceptio=
n</span><span style=3D"color: #660;" class=3D"styled-by-prettify">(</span><=
span style=3D"color: #080;" class=3D"styled-by-prettify">"return value not =
checked"</span><span style=3D"color: #660;" class=3D"styled-by-prettify">);=
</span><span style=3D"color: #000;" class=3D"styled-by-prettify"><br>&nbsp;=
 &nbsp; </span><span style=3D"color: #660;" class=3D"styled-by-prettify">}<=
/span><span style=3D"color: #000;" class=3D"styled-by-prettify"><br><br>&nb=
sp; &nbsp; </span><span style=3D"color: #008;" class=3D"styled-by-prettify"=
>void</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> igno=
re</span><span style=3D"color: #660;" class=3D"styled-by-prettify">()</span=
><span style=3D"color: #000;" class=3D"styled-by-prettify"> </span><span st=
yle=3D"color: #660;" class=3D"styled-by-prettify">{</span><span style=3D"co=
lor: #000;" class=3D"styled-by-prettify"> m_handled </span><span style=3D"c=
olor: #660;" class=3D"styled-by-prettify">=3D</span><span style=3D"color: #=
000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #008;" cla=
ss=3D"styled-by-prettify">true</span><span style=3D"color: #660;" class=3D"=
styled-by-prettify">;</span><span style=3D"color: #000;" class=3D"styled-by=
-prettify"> </span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">}</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> &nbsp;=
 </span><span style=3D"color: #800;" class=3D"styled-by-prettify">// Use to=
 explicitly ignore errors</span><span style=3D"color: #000;" class=3D"style=
d-by-prettify"><br>&nbsp; &nbsp; </span><span style=3D"color: #008;" class=
=3D"styled-by-prettify">void</span><span style=3D"color: #000;" class=3D"st=
yled-by-prettify"> rethrow</span><span style=3D"color: #660;" class=3D"styl=
ed-by-prettify">()</span><span style=3D"color: #000;" class=3D"styled-by-pr=
ettify"> </span><span style=3D"color: #660;" class=3D"styled-by-prettify">{=
</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> m_handled=
 </span><span style=3D"color: #660;" class=3D"styled-by-prettify">=3D</span=
><span style=3D"color: #000;" class=3D"styled-by-prettify"> </span><span st=
yle=3D"color: #008;" class=3D"styled-by-prettify">true</span><span style=3D=
"color: #660;" class=3D"styled-by-prettify">;</span><span style=3D"color: #=
000;" class=3D"styled-by-prettify"> rethrow_exception</span><span style=3D"=
color: #660;" class=3D"styled-by-prettify">(</span><span style=3D"color: #0=
00;" class=3D"styled-by-prettify">m_exception</span><span style=3D"color: #=
660;" class=3D"styled-by-prettify">);</span><span style=3D"color: #000;" cl=
ass=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=3D"st=
yled-by-prettify">}</span><span style=3D"color: #000;" class=3D"styled-by-p=
rettify"><br>&nbsp; &nbsp; </span><span style=3D"color: #008;" class=3D"sty=
led-by-prettify">operator</span><span style=3D"color: #000;" class=3D"style=
d-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by-pret=
tify">bool</span><span style=3D"color: #660;" class=3D"styled-by-prettify">=
()</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> </span>=
<span style=3D"color: #660;" class=3D"styled-by-prettify">{</span><span sty=
le=3D"color: #000;" class=3D"styled-by-prettify"> m_handled </span><span st=
yle=3D"color: #660;" class=3D"styled-by-prettify">=3D</span><span style=3D"=
color: #000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #0=
08;" class=3D"styled-by-prettify">true</span><span style=3D"color: #660;" c=
lass=3D"styled-by-prettify">;</span><span style=3D"color: #000;" class=3D"s=
tyled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by-=
prettify">return</span><span style=3D"color: #000;" class=3D"styled-by-pret=
tify"> </span><span style=3D"color: #660;" class=3D"styled-by-prettify">!</=
span><span style=3D"color: #000;" class=3D"styled-by-prettify">m_exception<=
/span><span style=3D"color: #660;" class=3D"styled-by-prettify">;</span><sp=
an style=3D"color: #000;" class=3D"styled-by-prettify"> </span><span style=
=3D"color: #660;" class=3D"styled-by-prettify">}</span><span style=3D"color=
: #000;" class=3D"styled-by-prettify"> &nbsp; &nbsp; &nbsp;</span><span sty=
le=3D"color: #800;" class=3D"styled-by-prettify">// for if-type check. true=
 (good) if m_exeption is null.</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"><br></span><span style=3D"color: #008;" class=3D"styled=
-by-prettify">private</span><span style=3D"color: #660;" class=3D"styled-by=
-prettify">:</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
"><br>&nbsp; &nbsp; </span><span style=3D"color: #008;" class=3D"styled-by-=
prettify">bool</span><span style=3D"color: #000;" class=3D"styled-by-pretti=
fy"> m_handled</span><span style=3D"color: #660;" class=3D"styled-by-pretti=
fy">;</span><span style=3D"color: #000;" class=3D"styled-by-prettify"><br>&=
nbsp; &nbsp; </span><font color=3D"#000088"><span style=3D"color: #000;" cl=
ass=3D"styled-by-prettify">exception_ptr </span></font><span style=3D"color=
: #000;" class=3D"styled-by-prettify">m_exception</span><span style=3D"colo=
r: #660;" class=3D"styled-by-prettify">;</span><span style=3D"color: #000;"=
 class=3D"styled-by-prettify"><br></span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">};</span><span style=3D"color: #000;" class=3D"styl=
ed-by-prettify"><br><br></span><span style=3D"color: #606;" class=3D"styled=
-by-prettify">Usage</span><span style=3D"color: #660;" class=3D"styled-by-p=
rettify">:</span><span style=3D"color: #000;" class=3D"styled-by-prettify">=
<br><br>error_return from_string</span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">(</span><span style=3D"color: #000;" class=3D"style=
d-by-prettify">T</span><span style=3D"color: #660;" class=3D"styled-by-pret=
tify">&amp;</span><span style=3D"color: #000;" class=3D"styled-by-prettify"=
> dest</span><span style=3D"color: #660;" class=3D"styled-by-prettify">,</s=
pan><span style=3D"color: #000;" class=3D"styled-by-prettify"> string_view<=
/span><span style=3D"color: #660;" class=3D"styled-by-prettify">&amp;</span=
><span style=3D"color: #000;" class=3D"styled-by-prettify"> src</span><span=
 style=3D"color: #660;" class=3D"styled-by-prettify">)</span><span style=3D=
"color: #000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #=
660;" class=3D"styled-by-prettify">{</span><span style=3D"color: #000;" cla=
ss=3D"styled-by-prettify"><br>&nbsp; &nbsp; </span><span style=3D"color: #0=
08;" class=3D"styled-by-prettify">if</span><span style=3D"color: #000;" cla=
ss=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=3D"sty=
led-by-prettify">(...</span><span style=3D"color: #000;" class=3D"styled-by=
-prettify"> could convert </span><span style=3D"color: #660;" class=3D"styl=
ed-by-prettify">...)</span><span style=3D"color: #000;" class=3D"styled-by-=
prettify"><br>&nbsp; &nbsp; &nbsp; &nbsp;</span><span style=3D"color: #008;=
" class=3D"styled-by-prettify">return</span><span style=3D"color: #000;" cl=
ass=3D"styled-by-prettify"> error_return</span><span style=3D"color: #660;"=
 class=3D"styled-by-prettify">();</span><span style=3D"color: #000;" class=
=3D"styled-by-prettify"><br>&nbsp; &nbsp; </span><span style=3D"color: #008=
;" class=3D"styled-by-prettify">else</span><span style=3D"color: #000;" cla=
ss=3D"styled-by-prettify"><br>&nbsp; &nbsp; &nbsp; &nbsp;</span><span style=
=3D"color: #008;" class=3D"styled-by-prettify">return</span><span style=3D"=
color: #000;" class=3D"styled-by-prettify"> make_exception_ptr</span><span =
style=3D"color: #660;" class=3D"styled-by-prettify">(</span><span style=3D"=
color: #000;" class=3D"styled-by-prettify">exception</span><span style=3D"c=
olor: #660;" class=3D"styled-by-prettify">(</span><span style=3D"color: #08=
0;" class=3D"styled-by-prettify">"Could not convert"</span><span style=3D"c=
olor: #660;" class=3D"styled-by-prettify">));</span><span style=3D"color: #=
000;" class=3D"styled-by-prettify"><br></span><span style=3D"color: #660;" =
class=3D"styled-by-prettify">}</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"><br><br><br></span><span style=3D"color: #800;" class=
=3D"styled-by-prettify">// No check required</span><span style=3D"color: #0=
00;" class=3D"styled-by-prettify"><br>from_string</span><span style=3D"colo=
r: #660;" class=3D"styled-by-prettify">(</span><span style=3D"color: #000;"=
 class=3D"styled-by-prettify">dest</span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">,</span><span style=3D"color: #000;" class=3D"style=
d-by-prettify"> </span><span style=3D"color: #080;" class=3D"styled-by-pret=
tify">"123"</span><span style=3D"color: #660;" class=3D"styled-by-prettify"=
>).</span><span style=3D"color: #000;" class=3D"styled-by-prettify">ignore<=
/span><span style=3D"color: #660;" class=3D"styled-by-prettify">();</span><=
span style=3D"color: #000;" class=3D"styled-by-prettify"><br><br></span><sp=
an style=3D"color: #800;" class=3D"styled-by-prettify">// Check using if</s=
pan><span style=3D"color: #000;" class=3D"styled-by-prettify"><br></span><s=
pan style=3D"color: #008;" class=3D"styled-by-prettify">if</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> </span><span style=3D"colo=
r: #660;" class=3D"styled-by-prettify">(</span><span style=3D"color: #000;"=
 class=3D"styled-by-prettify">from_string</span><span style=3D"color: #660;=
" class=3D"styled-by-prettify">(</span><span style=3D"color: #000;" class=
=3D"styled-by-prettify">dest</span><span style=3D"color: #660;" class=3D"st=
yled-by-prettify">,</span><span style=3D"color: #000;" class=3D"styled-by-p=
rettify"> </span><span style=3D"color: #080;" class=3D"styled-by-prettify">=
"123"</span><span style=3D"color: #660;" class=3D"styled-by-prettify">))</s=
pan><span style=3D"color: #000;" class=3D"styled-by-prettify"><br>&nbsp; &n=
bsp; </span><span style=3D"color: #660;" class=3D"styled-by-prettify">...</=
span><span style=3D"color: #000;" class=3D"styled-by-prettify"> handle erro=
r</span><span style=3D"color: #660;" class=3D"styled-by-prettify">;</span><=
span style=3D"color: #000;" class=3D"styled-by-prettify"><br><br></span><sp=
an style=3D"color: #800;" class=3D"styled-by-prettify">// Throw on error</s=
pan><span style=3D"color: #000;" class=3D"styled-by-prettify"><br>from_stri=
ng</span><span style=3D"color: #660;" class=3D"styled-by-prettify">(</span>=
<span style=3D"color: #000;" class=3D"styled-by-prettify">dest</span><span =
style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span style=3D"=
color: #000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #0=
80;" class=3D"styled-by-prettify">"123"</span><span style=3D"color: #660;" =
class=3D"styled-by-prettify">).</span><span style=3D"color: #000;" class=3D=
"styled-by-prettify">rethrow</span><span style=3D"color: #660;" class=3D"st=
yled-by-prettify">();</span><span style=3D"color: #000;" class=3D"styled-by=
-prettify"><br><br></span><span style=3D"color: #800;" class=3D"styled-by-p=
rettify">// Programming error!</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"><br>from_string</span><span style=3D"color: #660;" clas=
s=3D"styled-by-prettify">(</span><span style=3D"color: #000;" class=3D"styl=
ed-by-prettify">dest</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">,</span><span style=3D"color: #000;" class=3D"styled-by-prettify"=
> </span><span style=3D"color: #080;" class=3D"styled-by-prettify">"123"</s=
pan><span style=3D"color: #660;" class=3D"styled-by-prettify">);</span><spa=
n style=3D"color: #000;" class=3D"styled-by-prettify"> &nbsp; &nbsp;</span>=
<span style=3D"color: #800;" class=3D"styled-by-prettify">// throws on firs=
t call even if conversion can be made!</span><span style=3D"color: #000;" c=
lass=3D"styled-by-prettify"><br><br></span></div></code></div><div><br><br>=
Den m=C3=A5ndagen den 27:e januari 2014 kl. 09:27:11 UTC-8 skrev Matthew Wo=
ehlke:<blockquote class=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8=
ex;border-left: 1px #ccc solid;padding-left: 1ex;">On 2014-01-26 11:25, Mat=
t Fioravante wrote:
<br>&gt; //* prefix (0x or 0X) indicating hexadecimal base (applies only wh=
en base
<br>&gt; is 16 or 0).
<br>
<br>If this also works for float/double I will be a happy man :-). (I have =
a=20
<br>project for which I intend to use base-16 to store double values on dis=
k=20
<br>(textual format) in order to avoid rounding issues.)
<br>
<br>&gt; //* All of the rest of the characters MUST be digits.
<br>
<br>Do you support locale-specific digits? E.g. will you parse "=E4=BA=8C=
=E5=8D=83=E4=BA=8C=E5=8D=81=E4=BA=94"=20
<br>in a Japanese locale? What about locale-specific digit grouping and=20
<br>decimal separators?
<br>
<br>&gt; //Returns true if an integral value was successfully parsed and st=
ores the
<br>&gt; value in val,
<br>
<br>Why not return a std::optional? I've never been a fan of (non-const)=20
<br>by-ref parameters; they make it hard to impossible to store the "output=
"=20
<br>value in a const local.
<br>
<br>&gt; //Sets errno to ERANGE if the string was an integer but would over=
flow type
<br>&gt; integral.
<br>
<br>Floating point can overflow also, or do you return +/- infinity in that=
=20
<br>case? (Maybe do both?)
<br>
<br>&gt; template &lt;typename integral&gt;
<br>&gt; constexpr bool strto(string_view s, integral&amp; val, int base);
<br>
<br>'strto' seems to be missing something (though if you return an optional=
,=20
<br>one would have to write 'strto&lt;type&gt;' which would be better). You=
 might=20
<br>also consider something like 'parse_number'; I wouldn't name it like=20
<br>'strto' just because there are already functions named similarly.
<br>
<br>&gt; First off, all of these return bool which makes it very easy to ch=
eck
<br>&gt; whether or not parsing failed.
<br>&gt;
<br>&gt; While the interface does not allow this idom:
<br>&gt;
<br>&gt; int x =3D atoi(s);
<br>
<br>Again, if you instead returned a std::optional, I believe both of these=
=20
<br>would be covered. Referring to later discussion, std::optional would=20
<br>satisfy both the 'check the result' case and the 'I have a default that=
=20
<br>I can silently fall back on' (via value_or) case.
<br>
<br>The only downside is you can't store the result in a const *and* write=
=20
<br>the call to strto inside the if() to check the result. (Though with you=
r=20
<br>proposal, you can't store the result in a const at all...)
<br>
<br>--=20
<br>Matthew
<br>
<br></blockquote></div></div>


------=_Part_958_16039594.1391012293637--

.


Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Wed, 29 Jan 2014 11:49:40 -0500
Raw View
On 2014-01-29 11:18, Bengt Gustafsson wrote:
> Regarding default values produced when the conversion fails this is another
> argument for this style:
>
> <error return type> from_string(T& dest, string_view& src)
>
> Now the standard can specify that the function shall not touch dest if
> conversion fails. The default value is the previous value of the variable!

So... not only can I still not assign the result to a const local, now
'dest' potentially contains uninitialized memory? I don't see how that's
an improvement.

If it is really necessary to have a description of the failure type (and
errno is not suitable; personally I find nothing wrong with using
errno), then maybe a return type that is similar to std::optional with
an additional 'why it is disengaged' could be created. (Maybe even
subclass std::optional and call it e.g. std::result?)

> // No check required
> from_string(dest, "123").ignore();

You omitted the declaration and initialization of 'dest'. IOW:

// your proposal
auto dest = int{12};
from_string(dest, "34").ignore();
foo(dest);

- vs. -

// std::optional as return type
foo(from_string<int>("34").value_or(12));

Using std::optional, I (in the above example, anyway) avoided even
having a named variable to receive the value. And if I wanted one, I
could make it const, which I couldn't do with your version.
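
A minimal sketch of the std::optional-returning form discussed above, assuming a C++17 compiler (std::optional, std::string_view and std::from_chars). The name from_string and its "whole string must be consumed" rule are illustrative assumptions, not an agreed interface from this thread.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper (not a standard function): parses all of `s` as a
// base-10 integer of type T and returns an empty optional on any failure.
template <typename T>
std::optional<T> from_string(std::string_view s) {
    T value{};
    const char* first = s.data();
    const char* last = first + s.size();
    const auto res = std::from_chars(first, last, value);
    if (res.ec != std::errc{} || res.ptr != last) {
        return std::nullopt;  // not a number, out of range, or trailing input
    }
    return value;
}

// The "silently fall back on a default" case, with no named mutable local.
int parse_or_default(std::string_view s, int fallback) {
    return from_string<int>(s).value_or(fallback);
}
```

With this shape the caller can write `auto const v = from_string<int>(text);` and test `v` afterwards, which is the const-local use case the out-parameter form cannot express.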

--
Matthew


.


Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Mon, 3 Feb 2014 03:13:23 -0800 (PST)
Raw View

On Sunday, January 26, 2014 5:25:02 PM UTC+1, Matthew Fioravante wrote:
>
> Here is a first attempt for an integer parsing routine.
>
> //Attempts to parse s as an integer. The valid integer string consists of
> the following:
> //* '+' or '-' sign as the first character (- only acceptable for signed
> integral types)
> //* prefix (0) indicating octal base (applies only when base is 0 or 8)
>

I'd prefer 07 to be parsed as 7. Most non-dev people probably expect this
as well.
Is octal still being used?


> Similarly we can define this for floating point types. We may also want
> null terminated const char* versions as converting a const char* to
> string_view requires a call to strlen().
>

What's the problem with strlen()?


.


Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Mon, 03 Feb 2014 11:20:34 -0500
Raw View
On 2014-02-03 06:13, Olaf van der Spek wrote:
> On Sunday, January 26, 2014 5:25:02 PM UTC+1, Matthew Fioravante wrote:
>> Here is a first attempt for an integer parsing routine.
>>
>> //Attempts to parse s as an integer. The valid integer string consists of
>> the following:
>> //* '+' or '-' sign as the first character (- only acceptable for signed
>> integral types)
>> //* prefix (0) indicating octal base (applies only when base is 0 or 8)
>
> I'd prefer 07 to be parsed as 7. Most non-dev people probably expect this
> as well.
> Is octal still being used?

You can do this by passing base = 10. '0' as a prefix only means base 8
when passing base = 0 (i.e. detect from prefix).

Or rather, I would hope/expect the above is true. Reading closer it
isn't obvious if leading 0's are permitted when the base is explicitly
specified. Probably they should be.
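
For comparison, the existing std::strtol already behaves this way, so a short sketch may help. The two wrapper names are assumptions made for illustration; whether the proposed function would follow the same rule is the open question above.

```cpp
#include <cstdlib>

// Explicit base 10: a leading '0' is just another digit.
long parse_decimal(const char* s) {
    return std::strtol(s, nullptr, 10);
}

// Base 0: the prefix selects the base ("0x"/"0X" hex, leading "0" octal).
long parse_autodetect(const char* s) {
    return std::strtol(s, nullptr, 0);
}
```

The difference only shows for inputs such as "010", which is ten with an explicit base of 10 and eight when the base is detected from the prefix.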

>> Similarly we can define this for floating point types. We may also want
>> null terminated const char* versions as converting a const char* to
>> string_view requires a call to strlen().
>
> What's the problem with strlen()?

It requires additional execution cycles that don't provide any real
benefit. And yes, that *does* matter; there are definitely cases where
string to number conversion is a performance bottleneck, e.g. when
reading large files of data in textual format. (I say this from actual
real-world personal experience.)

--
Matthew


.


Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Mon, 03 Feb 2014 12:36:45 -0500
Raw View
On 2014-02-03 11:25, Olaf van der Spek wrote:
> On Mon, Feb 3, 2014 at 5:20 PM, Matthew Woehlke
> <mw_triad@users.sourceforge.net> wrote:
>>> I'd prefer 07 to be parsed as 7. Most non-dev people probably expect this
>>> as well.
>>> Is octal still being used?
>>
>>
>> You can do this by passing base = 10. '0' as a prefix only means base 8 when
>> passing base = 0 (i.e. detect from prefix).
>
> What if I want dec and hex but no octal? ;)

That's a fair question :-). (But so is if we should throw out the 0
prefix as indicating octal.)

I suppose you could test if the string starts with '0x' and call with
either in,base=10 or in+2,base=16. Not saying that's ideal, though.
(Even if I suspect that performance-wise it would be similar to base=0.)
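
The prefix test described above could look roughly like the following sketch. It assumes C++17 (std::from_chars, std::string_view); the function name is hypothetical, and sign handling is left out to keep it short.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: accepts decimal or "0x"/"0X"-prefixed hexadecimal,
// but never treats a leading '0' as an octal prefix.
std::optional<unsigned long> parse_dec_or_hex(std::string_view in) {
    int base = 10;
    if (in.size() >= 2 && in[0] == '0' && (in[1] == 'x' || in[1] == 'X')) {
        in.remove_prefix(2);  // the "in+2, base=16" case
        base = 16;
    }
    unsigned long value = 0;
    const char* last = in.data() + in.size();
    const auto res = std::from_chars(in.data(), last, value, base);
    if (res.ec != std::errc{} || res.ptr != last) {
        return std::nullopt;  // no digits, overflow, or trailing characters
    }
    return value;
}
```

Note that std::from_chars itself does not accept a "0x" prefix in base 16, which is why the prefix has to be skipped by hand here.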

>>> What's the problem with strlen()?
>>
>> It requires additional execution cycles that don't provide any real benefit.
>> And yes, that *does* matter; there are definitely cases where string to
>> number conversion is a performance bottleneck, e.g. when reading large files
>> of data in textual format. (I say this from actual real-world personal
>> experience.)
>
> Numbers in text files are not nul-terminated.

They are if I'm using a CSV or XML parsing library that yields
NUL-terminated char*. (And I seem to recall that such do exist, i.e.
they take a char* buffer and substitute NUL at the end of "values".)

"What's the problem with strlen()" is that it has potential performance
implications given char const* input data. I'm not convinced that trying
to determine if there is any reasonable instance where one has char
const* data in a situation that is performance sensitive qualifies as
reason to disregard that.

--
Matthew


.


Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Mon, 03 Feb 2014 13:41:27 -0500
Raw View
On 2014-02-03 12:46, Jeffrey Yasskin wrote:
> On Mon, Feb 3, 2014 at 9:36 AM, Matthew Woehlke wrote:
>> On 2014-02-03 11:25, Olaf van der Spek wrote:
>>> Numbers in text files are not nul-terminated.
>>
>> They are if I'm using a CSV or XML parsing library that yields
>> NUL-terminated char*. (And I seem to recall that such do exist, i.e. they
>> take a char* buffer and substitute NUL at the end of "values".)
>
> FWIW, your CSV or XML parsing library should be changed to return a
> string_view or equivalent.

What if it's a C library? :-)

> It has the size, and is throwing it out,
> forcing your number parser to do redundant checks for '\0', which
> slows you down.

I'm not sure it's redundant... if I pass a char*, then the
implementation must check for NUL but does not need to do any sort of
index check. OTOH if I pass a string_view, presumably it is going to
stop parsing if it finds a NUL anyway, same as it would stop for e.g.
'!', random-control-character, etc.

So actually, it may be that string_view implementation does everything
that the char* implementation does *plus* an index check. (Which, in
fact, means that the char* implementation may be preferred if you know
that your string is NUL-terminated, even if you *have* a string_view...)
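
To make the comparison concrete, here is a sketch of the two inner loops, assuming plain unsigned decimal digits and no overflow handling. Both function names are made up for illustration, and the std::string_view version assumes C++17.

```cpp
#include <cstddef>
#include <string_view>

// NUL-terminated input: the digit test alone ends the loop, because '\0'
// is not a digit. No separate end-of-input check is needed.
unsigned parse_digits_cstr(const char* s) {
    unsigned value = 0;
    while (*s >= '0' && *s <= '9') {
        value = value * 10 + static_cast<unsigned>(*s - '0');
        ++s;
    }
    return value;
}

// Length-delimited input: the same digit test, plus a bounds check on
// every iteration.
unsigned parse_digits_view(std::string_view s) {
    unsigned value = 0;
    std::size_t i = 0;
    while (i < s.size() && s[i] >= '0' && s[i] <= '9') {
        value = value * 10 + static_cast<unsigned>(s[i] - '0');
        ++i;
    }
    return value;
}
```

Whether the extra comparison is measurable in practice depends on the compiler and the data; the sketch only shows where the additional check comes from.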

--
Matthew


.


Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Mon, 03 Feb 2014 13:44:23 -0500
Raw View
On 2014-02-03 12:58, Thiago Macieira wrote:
> On Mon, 03 Feb 2014, at 12:36:45, Matthew Woehlke wrote:
>>> Numbers in text files are not nul-terminated.
>>
>> They are if I'm using a CSV or XML parsing library that yields
>> NUL-terminated char*. (And I seem to recall that such do exist, i.e.
>> they take a char* buffer and substitute NUL at the end of "values".)
>
> No, they're not. None of my CSV and XML files on disk have NULs.
>
> If you're getting a NUL, it means your library actually did malloc() to
> allocate memory just so it could set a \0 there, which totally offsets the
> cost of strlen. If your library is doing that, then strlen() performance is
> not the issue.

I'm talking about libraries that require a mutable input buffer¹ and
replace ends-of-"values" in that buffer with NUL. (I can't think what it
was, offhand, but pretty sure I came across a library that did exactly
this.)

(¹ or is doing the file I/O itself and so has a mutable input buffer.)

--
Matthew


.


Author: Bengt Gustafsson <bengt.gustafsson@beamways.com>
Date: Mon, 3 Feb 2014 12:47:41 -0800 (PST)
Raw View

I know of an XML parser that sets nul in the buffer to save time (apart from
the one I made 15 years ago). Forgot the name now.

I think we should try to find a good solution to the problem of using a
Range-concept for input (same prerequisites as to be able to use range
based for) as the input and preserve the information about how many
characters were consumed.

Creating a range type which stops at nul is simple, for instance let end()
return nullptr and begin() return the input pointer wrapped in an object
whose operator++ sets the wrapped pointer to nullptr when a nul is
encountered. This of course only works as a forward iterator, but that's ok
for this case.
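
A minimal sketch of such a range, under the assumption that only forward iteration and range-based for are needed. The class names cstr_iterator and cstr_range are invented for this example.

```cpp
#include <string>

// Forward iterator over a NUL-terminated string. The end state is a wrapped
// nullptr, so no length has to be computed up front.
class cstr_iterator {
public:
    // An empty string (or a null pointer) starts out already at the end.
    explicit cstr_iterator(const char* p = nullptr)
        : p_((p && *p != '\0') ? p : nullptr) {}

    char operator*() const { return *p_; }

    cstr_iterator& operator++() {
        ++p_;
        if (*p_ == '\0') p_ = nullptr;  // reaching the NUL means "at end"
        return *this;
    }

    bool operator==(const cstr_iterator& o) const { return p_ == o.p_; }
    bool operator!=(const cstr_iterator& o) const { return p_ != o.p_; }

private:
    const char* p_;
};

class cstr_range {
public:
    explicit cstr_range(const char* s) : s_(s) {}
    cstr_iterator begin() const { return cstr_iterator(s_); }
    cstr_iterator end() const { return cstr_iterator(); }  // wraps nullptr

private:
    const char* s_;
};

// Example use: copy the characters seen by a range-based for loop.
std::string collect(const char* s) {
    std::string out;
    for (char c : cstr_range(s)) out.push_back(c);
    return out;
}
```

Because begin() and end() return the same type, this works with range-based for as specified in C++11 and C++14, without relying on sentinel support.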

Here's an idea of how to design the from_string() without using a ref
parameter as the result:

The return type has three members:

a) the value
b) a RANGE representing the rest of the input
c) an exception_ptr or similar error information

The idea is if you just do:

auto v = from_string<int>("123");

You get an exception both if exception_ptr is non-null and if the returned
range is non-empty. To avoid the dreaded exception in destructor these
checks must be done in the return object's operator T.

This however needs operator bool() to be called before operator T, and (to
handle the remaining source character case) also retrieve the remaining
range to show that you handle it. This does not provide for any neat call
sites.

No, I don't see how this can be done in a reasonably neat way without using
a T& dest in the parameter list. As for the destructor exception, the return
value can now be just a wrapper around an exception_ptr; it should be
possible to design it so that its dtor can throw the encapsulated
exception_ptr without creating problems, as there are no other members.

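class exception_enforcer {
public:
     // noexcept(false) is needed: destructors are noexcept by default
     ~exception_enforcer() noexcept(false) {
            if (ex_ptr)
                  rethrow_exception(ex_ptr);    // I think this will destroy
                  // the ex_ptr while unwinding, while having taken a copy of
                  // the pointee
     }
     // Reports whether an error was pending and marks it as handled.
     operator bool() { bool ret = static_cast<bool>(ex_ptr); ex_ptr = nullptr; return ret; }
private:
     exception_ptr ex_ptr;
};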

// Note non-const ref to range. from_string instead has a const ref and
// requires all of src to be eaten.
template<typename T, typename RANGE>
exception_enforcer parse_string(T& dest, RANGE& src);



On Monday, 3 February 2014 at 19:44:23 UTC+1, Matthew Woehlke wrote:
>
> On 2014-02-03 12:58, Thiago Macieira wrote:
> > On Mon, 03 Feb 2014, at 12:36:45, Matthew Woehlke wrote:
> >>> Numbers in text files are not nul-terminated.
> >>
> >> They are if I'm using a CSV or XML parsing library that yields
> >> NUL-terminated char*. (And I seem to recall that such do exist, i.e.
> >> they take a char* buffer and substitute NUL at the end of "values".)
> >
> > No, they're not. None of my CSV and XML files on disk have NULs.
> >
> > If you're getting a NUL, it means your library actually did malloc() to
> > allocate memory just so it could set a \0 there, which totally offsets
> > the cost of strlen. If your library is doing that, then strlen()
> > performance is not the issue.
>
> I'm talking about libraries that require a mutable input buffer¹ and
> replace ends-of-"values" in that buffer with NUL. (I can't think what it
> was, offhand, but pretty sure I came across a library that did exactly
> this.)
>
> (¹ or is doing the file I/O itself and so has a mutable input buffer.)
>
> --
> Matthew
>
>


.


Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Mon, 03 Feb 2014 16:23:51 -0500
Raw View
On 2014-02-03 15:48, Miro Knejp wrote:
> Then I was thinking some more about the return/error dilemma. Inspired
> by the mentioning of match_integer three use case scenarios come to mind:

FWIW... when considering Thiago's (OT) question re: Qt's API, I actually
ended up suggesting that the end iterator be a method/member of the
return type. IOW the return is "essentially" a tuple of value, status,
end_iter, but with convenience methods to access those rather than being
literally a std::tuple.

If you care about neither the status nor end_iter, you can use this like:

use(parse<type>(in).value());
-or-
use(parse<type>(in).value_or(default));

(Optional: operator T to implicitly obtain value... but conflicts with
operator bool to implicitly obtain success/failure.)

If you do care about the status and/or end_iter, I don't see how you
would possibly avoid at least one local variable declaration (unless
passing to a function that takes the result type, in which case the
result type already must contain everything). Returning everything in
the result type allows that local to be 'auto const' and avoids the
various pitfalls of a by-reference output parameter.

If using operator bool to check status, you can also store the result
inside an if() (assuming you only need the value and/or end_iter if
parsing succeeded).
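
One possible shape for such a result type is sketched below. It assumes C++17 (std::from_chars, std::string_view); parse_result and its member names are hypothetical, chosen only to mirror the value/status/end_iter description above.

```cpp
#include <charconv>
#include <stdexcept>
#include <string_view>
#include <system_error>

// Hypothetical result type: value, status and end position in one object,
// reached through accessors rather than a literal std::tuple.
template <typename T>
class parse_result {
public:
    parse_result(T value, std::errc status, const char* end)
        : value_(value), status_(status), end_(end) {}

    explicit operator bool() const { return status_ == std::errc{}; }
    std::errc status() const { return status_; }
    const char* end() const { return end_; }  // first unconsumed character

    T value() const {
        if (!*this) throw std::invalid_argument("parse failed");
        return value_;
    }
    T value_or(T fallback) const { return *this ? value_ : fallback; }

private:
    T value_;
    std::errc status_;
    const char* end_;
};

// The conversion type is an explicit template parameter, as discussed.
template <typename T>
parse_result<T> parse(std::string_view in) {
    T value{};
    const auto res = std::from_chars(in.data(), in.data() + in.size(), value);
    return parse_result<T>(value, res.ec, res.ptr);
}
```

A caller that needs everything can write `auto const r = parse<int>(in);` and then consult `r`, `r.value()` and `r.end()`, while a caller that needs only the value can chain `.value_or(...)` directly on the call.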

After thinking about it a lot, I don't see any way for the API to be
simpler than that. Any other variation requires similar or additional
declarations. (The only minor downside I see is that the conversion type
must be provided as an explicit template parameter¹ rather than inferred
from a previous declaration. However I don't see this as being such a
bad thing, as it makes it more clear what is the expected output type.
As was mentioned elsewhere, code is written once and read many times.)

(¹ Which also means that the function must be a template rather than
overloaded, though I don't think that is an issue?)

--
Matthew


.


Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Mon, 03 Feb 2014 17:01:05 -0500
Raw View
On 2014-02-03 15:47, Bengt Gustafsson wrote:
> I think we should try to find a good solution to the problem of using a
> Range-concept for input (same prerequisites as to be able to use range
> based for) as the input and preserve
> the information about how many characters were consumed.
>
> Creating a range type which stops at nul is simple, for instance let end()
> return nullptr and begin() return the input pointer wrapped in an object
> whose operator++ sets the wrapped pointer to nullptr when a nul is
> encountered. This of course only works as a forward iterator, but that's ok
> for this case.

Yes, I believe that would work. Next question: can you trivially create
a string_view from this? If yes, then the problem is solved :-).

(If no, maybe taking (only) string_view is not the best API? I wonder
now if there should be a string_forward_view? Or would it be sufficient
to have an iterator type whose operator== returns true for a magic end
iterator when the iterator points to NUL?)

> Here's an idea of how to design the from_string() without using a ref
> parameter as the result:
>
> The return type has three members:
>
> a) the value
> b) a RANGE representing the rest of the input
> c) an exception_ptr or similar error information

That's exactly how I would do it; see also my reply to Miro.

> This however needs operator bool() to be called before operator T, and (to
> handle the remaining source character case) also retrieve the remaining
> range to show that you handle it. This does not provide for any neat call
> sites.

Or don't provide implicit conversion operators for both the value and
status, at least in case of the conversion type == bool.

> No, I don't see how this can be done in a reasonably neat way without using
> a T& dest in the parameter list.

Why is it so horrible to write either '.okay()' or '.value()'?

If you do care about more than exactly one of the three possible output
information parts, I don't see any way to avoid having at least one
local variable. So what is wrong with:

auto const result = from_string<type>(in);
if (result)
{
   use(result.value());
   // optional: do something with result.last_consumed()
}
else
{
   // handle error
}

...?

The above uses exactly one local variable which can be declared 'const'
and provides both the status and information on consumed characters (and
the value, of course). For type != bool, I believe that '.value()' can
even still be implicit. And it also allows direct hand-off to a function
that uses the status and/or iterator information (which is more awkward
with out params and, in case of implicit value conversion, precludes
switching between the full result and just the value solely by changing
the signature of the use function).

Compare the above to:

type out; // not const :-(
if (auto result = from_string(in, out))
{
   // handle error
}
else
{
   use(out);
}

...which is more logic (and more characters, if you don't count the
'const' in the above which is missing here, even with '.value()' above),
and requires that operator bool return the opposite of the expected
result. Besides that the 'handle the exception first' logic flow feels
unnatural to me :-).

(Or requires a language enhancement to allow 'if (!(auto result =
...))', which of course adds even more logic and characters...)

Plus, if use() changes to want also 'result', the above must be
refactored because in its current form, 'result' is unavailable at the
call site.


It's even worse if you want to know about consumed characters:

unknown_t last_consumed; // um... what's the type of this?
type out;
if (auto result = from_string(in, out, last_consumed))
{
   // handle error
}
else
{
   use(out);
   // do something with last_consumed
}

Now I have an entire additional local variable, which not only can't be
'const', but has a non-trivial type declaration that can't trivially be
written as "auto".


And of course, there's the case that we don't care about the status:

type out = default;
from_string(in, out).ignore();
use(out);

- versus -

use(from_string<type>(in).value_or(default));

--
Matthew


.


Author: gmisocpp@gmail.com
Date: Mon, 3 Feb 2014 14:35:26 -0800 (PST)
Raw View

Hi everyone

Thanks for the interesting read. My ten cents on this subject:

It seems parsing/conversion means looking for numbers/values which
sometimes might not be present, and even if they are, they may be
expectedly or unexpectedly followed by other stuff; and any value that
is present may still be outside of the expected range. Theoretically,
something might even happen that's so unexpected that we can't parse
enough to be sure whether a value is present or correct at all.

Input from a file, command line or configuration file is often like that:

"100"      - a value
"100;"     - a value followed by something/anything; here it's a ";"
"hello"    - no value, but something, e.g. the "h" from "hello"
"257"      - a failed value (for a given type), e.g. this is too big for an
             8 bit unsigned char
"-1;2"     - failed value (say for unsigned int) followed by something, in
             this case the ";" from ";2"
""         - nothing - eof / empty range passed in.

Considering all of this, it suggests the set of all possibilities might be
this (represented here as an enum):

enum parse_status // Ultimate outcome of converting a string to a type.
{
    got_value,                      // success, a value, nothing more
    got_something,                  // something, but not a value, probable
                                    // failure
    got_value_and_something,        // got a value and something else.
                                    // Success or failure likely determined
                                    // by caller.
    got_failed_value,               // got an unusable value, out of range
                                    // or whatever.
    got_failed_value_and_something, // got an unusable value and something
                                    // else too.
    got_nothing,                    // nothing, empty input range etc.
    got_error                       // worse than anything above. Invalid
                                    // argument etc.
    // anything else?
};

When parsing fails to get a value, the reason is known and it's helpful to
be able to report something detailed,
e.g. number too big, too small, not a number, nothing, not integral.
That holds even if the caller's code is wrong and they've passed an invalid
argument to the parse routine etc.

Whether parsing failed or succeeded, I often want to know how far parsing
got, so I can continue.

Putting all of this together leads me to think a structure like this is
needed to report things:

struct conversion_result
{
    parse_status status;        // got nothing, a value/bad value and/or
                                // something else or some other error.
    std::error_code   code;     // What exactly went wrong: e.g. like
                                // ERANGE / EINVAL / invalid arg. etc.
    InputIterator     next;     // Points to something if indicated, else end
};

A key question (to me at least) is whether it might be possible to do away
with the parse_status completely. But the parser routine is aware of the
exact details anyway, so is it good to throw that away? It also helps if we
can examine the return value and error code as little as needed.

I'm keen to see which is more readable, looking at tests on status codes or
code that re-creates those tests by if'ing on different error code and
iterator values to (re) deduce these facts.

In conclusion, I was thinking that an interface exhibiting some of the
traits of this one is needed:

// never throws (imagine range/iterator pair versions as you see fit):

conversion_result parse( signed char& value, Range range ) noexcept;
conversion_result parse( char& value, Range range ) noexcept;
conversion_result parse( unsigned char& value, Range range ) noexcept;
conversion_result parse( int& value, Range range ) noexcept;
conversion_result parse( unsigned int& value, Range range ) noexcept;
conversion_result parse( long& value, Range range ) noexcept;
conversion_result parse( unsigned long& value, Range range ) noexcept;
conversion_result parse( long long& value, Range range ) noexcept;
conversion_result parse( unsigned long long& value, Range range ) noexcept;
conversion_result parse( float& value, Range range ) noexcept;
conversion_result parse( double& value, Range range ) noexcept;
conversion_result parse( long double& value, Range range ) noexcept;
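
To make the status/code/next triple concrete, here is a minimal sketch of
what one of these overloads could look like for unsigned int. It is only an
illustration under assumptions: it takes a plain const char* pair instead of
a generic Range, accepts decimal digits only (no sign, no whitespace, no
locale), and the choice of error codes is mine rather than anything settled
in this thread.

```cpp
#include <limits>
#include <system_error>

// Hypothetical: mirrors the parse_status enum proposed above.
enum class parse_status {
    got_value, got_something, got_value_and_something,
    got_failed_value, got_failed_value_and_something,
    got_nothing, got_error
};

// Hypothetical: iterator type fixed to const char* for the sketch.
struct conversion_result {
    parse_status    status;
    std::error_code code;   // empty on success
    const char*     next;   // first unconsumed character
};

// Parses a run of decimal digits in [first, last).
// 'value' is written only on success, so a caller-supplied default survives.
inline conversion_result parse(unsigned int& value,
                               const char* first, const char* last) noexcept
{
    if (first == nullptr || last == nullptr)
        return {parse_status::got_error,
                std::make_error_code(std::errc::invalid_argument), first};
    if (first == last)
        return {parse_status::got_nothing, {}, first};

    const unsigned int max = std::numeric_limits<unsigned int>::max();
    unsigned int acc = 0;
    bool overflow = false;
    const char* p = first;
    for (; p != last && *p >= '0' && *p <= '9'; ++p) {
        const unsigned int digit = static_cast<unsigned int>(*p - '0');
        if (acc > (max - digit) / 10)   // acc * 10 + digit would not fit
            overflow = true;
        if (!overflow)
            acc = acc * 10 + digit;
    }

    if (p == first)   // no digits at all: something, but not a value
        return {parse_status::got_something,
                std::make_error_code(std::errc::invalid_argument), p};

    const bool trailing = (p != last);
    if (overflow)
        return {trailing ? parse_status::got_failed_value_and_something
                         : parse_status::got_failed_value,
                std::make_error_code(std::errc::result_out_of_range), p};

    value = acc;
    return {trailing ? parse_status::got_value_and_something
                     : parse_status::got_value,
            {}, p};
}
```

Note that this sketch reports "-1;2" as got_something rather than as a failed
value, because it never looks at signs; whether signs should be recognised
for unsigned types is one of the open questions.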

Not everybody wants/needs to handle codes, or wants to manually map the
results to exceptions and lose info.
So I think an exception version or mapping function is needed too, maybe
something like this:

enum conversion_options // bit mask, probably can be simplified, but
                        // conveys the idea
{
    allow_value,
    allow_value_and_something,
    allow_failed_value,
    allow_failed_value_and_something,
    allow_errors,
    allow_nothing
};

// highly likely to throw.
void parse_checked( int& value, InputRange range, conversion_options
check_options = allow_value);

By default checking is strict: it only allows an exact value and nothing
more without an exception.
Accepting a value followed by something else, or nothing at all, or an
out of range value, can be done through the options.
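
As a rough sketch of how the options could drive the throwing version: the
checked routine can call the noexcept parser and then map the resulting
status onto the bit mask. Everything below is an assumption for
illustration: the explicit bit values, the helper names is_allowed and
throw_unless_allowed, the exception type, and the decision that
got_something is never acceptable (there is no option for it above).

```cpp
#include <stdexcept>

// Hypothetical: mirrors the parse_status enum proposed earlier.
enum class parse_status {
    got_value, got_something, got_value_and_something,
    got_failed_value, got_failed_value_and_something,
    got_nothing, got_error
};

// Hypothetical bit values so that options can be OR-ed together.
enum conversion_options : unsigned {
    allow_value                      = 1u << 0,
    allow_value_and_something        = 1u << 1,
    allow_failed_value               = 1u << 2,
    allow_failed_value_and_something = 1u << 3,
    allow_errors                     = 1u << 4,
    allow_nothing                    = 1u << 5
};

// True if the caller's options permit this outcome.
inline bool is_allowed(parse_status s, unsigned options) noexcept
{
    switch (s) {
    case parse_status::got_value:
        return (options & allow_value) != 0;
    case parse_status::got_value_and_something:
        return (options & allow_value_and_something) != 0;
    case parse_status::got_failed_value:
        return (options & allow_failed_value) != 0;
    case parse_status::got_failed_value_and_something:
        return (options & allow_failed_value_and_something) != 0;
    case parse_status::got_error:
        return (options & allow_errors) != 0;
    case parse_status::got_nothing:
        return (options & allow_nothing) != 0;
    case parse_status::got_something:
        return false;   // assumption: no option ever permits this
    }
    return false;
}

// What a parse_checked could do after calling the noexcept parse.
inline void throw_unless_allowed(parse_status s,
                                 unsigned options = allow_value)
{
    if (!is_allowed(s, options))
        throw std::invalid_argument("parse outcome not permitted by options");
}
```

With this shape, strict checking is just the default argument, and a caller
who tolerates trailing text passes allow_value | allow_value_and_something.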

Some open questions I have:
* I'm looking at the template interface ideas and I can't decide if they
are genius or excessive. But that's something I always think about
templates.
* Are we really saying C can't even do this task sufficiently well? Kind of
sad! Won't "they" revisit this, and won't we get that later too?
* Should +- symbols be rejected for unsigned types? If they aren't
necessary/useful, why accept them?
* Should .00 be accepted as a float of 0.00?
* Locales: is 1,100 a 1 followed by something (a comma), or 1100 in some
locale? Can a user override that to pick either interpretation?

Design choices I think make sense:
* Don't skip leading whitespace, as skipping tabs, lfs, crs etc. can be
surprising, unwanted and slow, and can be explicitly done easily.
* Have exception and exception free versions to allow noexcept
optimizations and routines useful in exception free environments.
* Errors should return/raise more detailed and more identifiable errors
than atoi etc. do.
* Return value should aid in diagnosing where to continue to parse next.
* Don't use errno etc. as it seems to raise questionable concurrency
questions and doesn't appear a clean solution anyway.
* Use overloaded names, hopefully organised for more usability in generic
code and more memorable than guessing whether it is strtol/strtoul etc.
* Take the conversion value by parameter rather than by return, to allow
setting a default and to simplify other things.
* Leave hex/octal conversion operations to other interfaces so as to
reduce interface complexity.

An interesting acid test might be to see how usable such routines would be
to parse a comma delimited list of values for any locale, or a windows style
.ini file of any types, and compare that to using strtoul etc.

That's my input for now. I am enjoying reading what everybody else has to
say on the subject.
It's fascinating to find out how simple or not or flexible or not the end
result will be.

Thanks.


------=_Part_34_19297754.1391466926759--

.


Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Mon, 03 Feb 2014 19:20:03 -0500
Raw View
On 2014-02-03 18:02, Olaf van der Spek wrote:
> On Mon, Feb 3, 2014 at 11:01 PM, Matthew Woehlke
> <mw_triad@users.sourceforge.net> wrote:
>> If you do care about more than exactly one of the three possible output
>> information parts, I don't see any way to avoid having at least one local
>> variable. So what is wrong with:
>
> user_t& u = ...
> if (parse(u.age, input)) // or !parse, depending on return type
>    return / throw
>
> No local var required, no type duplication

Okay. This is the first decent argument I've seen in favor of using an
out param. However I still think the out param is sub-optimal except in
this case (which I suspect is not the more common case).

Maybe we should just provide both...

--
Matthew


.


Author: Bengt Gustafsson <bengt.gustafsson@beamways.com>
Date: Mon, 3 Feb 2014 17:57:12 -0800 (PST)
Raw View
------=_Part_1839_14251007.1391479032744
Content-Type: text/plain; charset=UTF-8

@Matthew, regarding strlen avoidance: By RANGE I meant a template type
which has the same API as required by a range based for (which is kind of a
built-in template function). This means that you don't have to create a
string_view; any type of range will do. Here, for instance, is a
char_ptr_range for this case:

template<typename T> class zero_terminate_iterator {
public:
    zero_terminate_iterator() : ptr(nullptr) {}
    zero_terminate_iterator(T* p) : ptr(p) {}


    bool operator==(const zero_terminate_iterator<T>& rhs) const {
        if (ptr == rhs.ptr)
            return true;    // includes the case that both are nullptr
        if (rhs.ptr == nullptr && *ptr == 0)
            return true;
        if (ptr == nullptr && *rhs.ptr == 0)
            return true;
        return false;
    }
    // etc...
private:
    T* ptr;
};

// Uses C++14 function return type deduction.
auto char_ptr_range(const char* p) {
    return range<zero_terminate_iterator<const char>>(
        zero_terminate_iterator<const char>(p),
        zero_terminate_iterator<const char>());
}

//Now you can write:
const char* p = "1234";
for(auto c : char_ptr_range(p))
    process_char(c);

// and you can write
auto x = parse<int>(char_ptr_range(p));  // or whatever API we end up with.
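
For anyone who wants to try the idea, here is a slightly fuller sketch of
the same thing with the "etc..." parts filled in just far enough for a range
based for to compile. The range<> template, its first/last member names and
the explicit constructor are my assumptions; only the sentinel comparison
comes from the code above.

```cpp
// Iterator over a nul terminated array; a default constructed iterator
// acts as the end sentinel.
template <typename T>
class zero_terminate_iterator {
public:
    zero_terminate_iterator() : ptr(nullptr) {}
    explicit zero_terminate_iterator(T* p) : ptr(p) {}

    T& operator*() const { return *ptr; }
    zero_terminate_iterator& operator++() { ++ptr; return *this; }

    bool operator==(const zero_terminate_iterator& rhs) const {
        if (ptr == rhs.ptr)
            return true;              // includes the case that both are nullptr
        if (rhs.ptr == nullptr)
            return *ptr == 0;         // we are at the terminator
        if (ptr == nullptr)
            return *rhs.ptr == 0;     // rhs is at the terminator
        return false;
    }
    bool operator!=(const zero_terminate_iterator& rhs) const {
        return !(*this == rhs);
    }

private:
    T* ptr;
};

// Hypothetical minimal range: just a pair of iterators.
template <typename Iterator>
struct range {
    Iterator first;
    Iterator last;

    Iterator begin() const { return first; }
    Iterator end() const { return last; }
    bool empty() const { return first == last; }
};

inline range<zero_terminate_iterator<const char>>
char_ptr_range(const char* p)
{
    return { zero_terminate_iterator<const char>(p),
             zero_terminate_iterator<const char>() };
}
```

No strlen is ever called: the end of the string is discovered by the
comparison against the sentinel while iterating.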


Note that as we are aiming for an extensible set of conversions including
user defined types, say WGS84 geospatial coordinates, there is also an open
set of error codes, so an enum or int value is not enough. (The from_string
may be wrapped in a template function which can't be expected to know the
interpretation of an int error code for any T it may be instantiated for!)

Now I want to check some common use cases for the solution with a return
triplet. To complete the use cases we can also add a fourth member skipped
which is true if we had to skip spaces. I think that the smartest way to
solve this may be to provide a set of value() functions, but no cast
operators:

Tentatively I call the "states" of the return value:

strict - no spaces skipped. All of the string could be converted.
complete - space skipping ok, but no trailing junk.
ok - space skipping ok, and trailing junk.
bad - no parsing was possible, even after skipping spaces.

// The actual parsing is always the same:
const auto r = parse<int>(char_ptr_range("123"));

int x = r.strict_value(); // throws if r is not strict.
int y = r.complete_value(); // throws if r is not complete
int z = r.value(); // throws if r is not ok
int w = r.value_or(17); // never throws

r would also have is_complete(), is_strict(), is_ok() for those who want to
handle errors without throwing.
An iterator to the next char can be returned by r.next()
The error code can be returned by r.error().

Something like that, what do you think about that?
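
To show how the four states could fall out of one parse, here is a minimal
sketch for int over a const char* pair. It is not a proposal for the actual
types: the names int_parse_result and parse_int, the use of
std::runtime_error, treating only ' ' and '\t' as space, and leaving out
signs and error() are all simplifying assumptions.

```cpp
#include <limits>
#include <stdexcept>

// Hypothetical result object: records what happened, decides nothing.
class int_parse_result {
public:
    int_parse_result(bool parsed, int value, bool skipped,
                     const char* next, const char* last)
        : parsed_(parsed), value_(value), skipped_(skipped),
          next_(next), last_(last) {}

    bool is_ok() const noexcept       { return parsed_; }
    bool is_complete() const noexcept { return parsed_ && next_ == last_; }
    bool is_strict() const noexcept   { return is_complete() && !skipped_; }

    int value() const          { require(is_ok(), "not ok");             return value_; }
    int complete_value() const { require(is_complete(), "not complete"); return value_; }
    int strict_value() const   { require(is_strict(), "not strict");     return value_; }
    int value_or(int fallback) const noexcept { return parsed_ ? value_ : fallback; }

    const char* next() const noexcept { return next_; }

private:
    static void require(bool condition, const char* what) {
        if (!condition) throw std::runtime_error(what);
    }
    bool parsed_;
    int value_;
    bool skipped_;
    const char* next_;
    const char* last_;
};

// Skips leading blanks, then reads decimal digits.
inline int_parse_result parse_int(const char* first, const char* last)
{
    const char* p = first;
    while (p != last && (*p == ' ' || *p == '\t'))
        ++p;
    const bool skipped = (p != first);

    const char* digits = p;
    const int max = std::numeric_limits<int>::max();
    int value = 0;
    bool overflow = false;
    for (; p != last && *p >= '0' && *p <= '9'; ++p) {
        const int digit = *p - '0';
        if (value > (max - digit) / 10)   // value * 10 + digit would not fit
            overflow = true;
        if (!overflow)
            value = value * 10 + digit;
    }
    const bool parsed = (p != digits) && !overflow;
    return int_parse_result(parsed, value, skipped, p, last);
}
```

The point of the design is that the parse itself is always the same and the
caller picks the strictness afterwards by choosing which accessor to call.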

We need to take a look at the parse use case. For instance to parse a comma
separated list of ints stored as a nul terminated string:

const char* p = "123, 456 ,789";
auto range = char_ptr_range(p);
vector<int> numbers;

while (!range.empty()) {
    const auto r = from_string<int>(range);
    numbers.emplace_back(r.value());   // requires at least one parsable char
    range.first = r.next();
    skipspace(range);     // Assume we have a skip function which works on
                          // the range in situ
    if (range.empty())
        break;            // last number, no trailing comma
    if (*range.first != ',')
       throw "no comma";
    range.first++;
}

This assumes that range<T> has an empty method and that the starting
iterator is called first.

The main wart would be that we must manually update the range's first
member. It would be possible to store the end iterator to be able to return
a RANGE.

Then I could throw in a must_be(RANGE, token) helper which performs the
last three lines, for a new appearance like this:

const char* p = "123, 456 ,789";
auto range = char_ptr_range(p);
vector<int> numbers;

while (!range.empty()) {
    const auto r = from_string<int>(range);
    numbers.emplace_back(r.value());   // requires at least one parsable
                                       // char, but allows leading white space
    range = r.rest();
    skipspace(range);     // skip spaces between number and comma
    if (range.empty())
        break;            // last number, no trailing comma
    must_be(range, ',');
}
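
The two helpers used in that loop are small. A possible sketch, assuming a
range that is just a pair of const char* with a first member (char_range is
a stand-in name, and the exception type is arbitrary):

```cpp
#include <stdexcept>

// Hypothetical stand-in for range<T> over raw character pointers.
struct char_range {
    const char* first;
    const char* last;
    bool empty() const noexcept { return first == last; }
};

// Advances the range past blanks, in situ.
// Assumption: only ' ' and '\t' count, so no locale is consulted.
inline void skipspace(char_range& r) noexcept
{
    while (!r.empty() && (*r.first == ' ' || *r.first == '\t'))
        ++r.first;
}

// Consumes exactly one 'token' character or throws.
inline void must_be(char_range& r, char token)
{
    if (r.empty() || *r.first != token)
        throw std::runtime_error("unexpected token");
    ++r.first;
}
```

must_be also throws on an empty range, so a caller that accepts a list
without a trailing comma has to test empty() first, as the loop does.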

With a lazy split this could also be written, without losing performance:

// str is probably a string_view now, but we really don't need to know that.
for (auto str : lazy_split(char_ptr_range(p), ','))
    numbers.emplace_back(from_string<int>(str).complete_value());

No, not really the same, now we don't allow space between number and comma.
But we can introduce a new level which allows leading and trailing space.
However, this can't be a const method of 'r' or we can't defer skipping of
trailing spaces until we ask whether there are any. BTW this mandates the
entire range to be stored in 'r' or the skipping could run wild...

Note that decltype(r) could be dependent on the type being converted (and
of course the RANGE type). It is a concept rather than a class. This means,
I guess, that the return type of error() could differ depending on T, but
using that feature of course limits usability in template code.




On Tuesday, 4 February 2014 at 01:20:03 UTC+1, Matthew Woehlke wrote:
>
> On 2014-02-03 18:02, Olaf van der Spek wrote:
> > On Mon, Feb 3, 2014 at 11:01 PM, Matthew Woehlke
> > <mw_t...@users.sourceforge.net <javascript:>> wrote:
> >> If you do care about more than exactly one of the three possible output
> >> information parts, I don't see any way to avoid having at least one
> local
> >> variable. So what is wrong with:
> >
> > user_t& u = ...
> > if (parse(u.age, input)) // or !parse, depending on return type
> >    return / throw
> >
> > No local var required, no type duplication
>
> Okay. This is the first decent argument I've seen in favor of using an
> out param. However I still think the out param is sub-optimal except in
> this case (which I suspect is not the more common case).
>
> Maybe we should just provide both...
>
> --
> Matthew
>
>


------=_Part_1839_14251007.1391479032744
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr">@Matthew, regarding strlen avoidance: By RANGE I meant a t=
emplate type which has the same api as required by a range based for (which=
 is kind of a build in template function). This means that you don't have t=
o create a string_view, any type of range will do. Here, for instance, is a=
 char_ptr_range for this case:<div><br></div><div class=3D"prettyprint" sty=
le=3D"background-color: rgb(250, 250, 250); border: 1px solid rgb(187, 187,=
 187); word-wrap: break-word;"><code class=3D"prettyprint"><div class=3D"su=
bprettyprint"><span style=3D"color: #008;" class=3D"styled-by-prettify">tem=
plate</span><span style=3D"color: #660;" class=3D"styled-by-prettify">&lt;<=
/span><span style=3D"color: #008;" class=3D"styled-by-prettify">typename</s=
pan><span style=3D"color: #000;" class=3D"styled-by-prettify"> T</span><spa=
n style=3D"color: #660;" class=3D"styled-by-prettify">&gt;</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> zero_terminate_iterator </=
span><span style=3D"color: #660;" class=3D"styled-by-prettify">{</span><spa=
n style=3D"color: #000;" class=3D"styled-by-prettify"><br></span><span styl=
e=3D"color: #008;" class=3D"styled-by-prettify">public</span><span style=3D=
"color: #660;" class=3D"styled-by-prettify">:</span><span style=3D"color: #=
000;" class=3D"styled-by-prettify"><br>&nbsp; &nbsp; zero_terminate_iterato=
r</span><span style=3D"color: #660;" class=3D"styled-by-prettify">()</span>=
<span style=3D"color: #000;" class=3D"styled-by-prettify"> </span><span sty=
le=3D"color: #660;" class=3D"styled-by-prettify">:</span><span style=3D"col=
or: #000;" class=3D"styled-by-prettify"> ptr</span><span style=3D"color: #6=
60;" class=3D"styled-by-prettify">(</span><span style=3D"color: #008;" clas=
s=3D"styled-by-prettify">nullptr</span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">)</span><span style=3D"color: #000;" class=3D"style=
d-by-prettify"> </span><span style=3D"color: #660;" class=3D"styled-by-pret=
tify">{}</span><span style=3D"color: #000;" class=3D"styled-by-prettify"><b=
r>&nbsp; &nbsp; zero_terminate_iterator</span><span style=3D"color: #660;" =
class=3D"styled-by-prettify">(</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify">T</span><span style=3D"color: #660;" class=3D"styled-by=
-prettify">*</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
"> p</span><span style=3D"color: #660;" class=3D"styled-by-prettify">)</spa=
n><span style=3D"color: #000;" class=3D"styled-by-prettify"> </span><span s=
tyle=3D"color: #660;" class=3D"styled-by-prettify">:</span><span style=3D"c=
olor: #000;" class=3D"styled-by-prettify"> ptr</span><span style=3D"color: =
#660;" class=3D"styled-by-prettify">(</span><span style=3D"color: #000;" cl=
ass=3D"styled-by-prettify">p</span><span style=3D"color: #660;" class=3D"st=
yled-by-prettify">)</span><span style=3D"color: #000;" class=3D"styled-by-p=
rettify"> </span><span style=3D"color: #660;" class=3D"styled-by-prettify">=
{}</span><span style=3D"color: #000;" class=3D"styled-by-prettify"><br><br>=
<br>&nbsp; &nbsp; </span><span style=3D"color: #008;" class=3D"styled-by-pr=
ettify">bool</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
"> </span><span style=3D"color: #008;" class=3D"styled-by-prettify">operato=
r</span><span style=3D"color: #660;" class=3D"styled-by-prettify">(</span><=
span style=3D"color: #008;" class=3D"styled-by-prettify">const</span><span =
style=3D"color: #000;" class=3D"styled-by-prettify"> zero_terminate_iterato=
r</span><span style=3D"color: #660;" class=3D"styled-by-prettify">&lt;</spa=
n><span style=3D"color: #000;" class=3D"styled-by-prettify">T</span><span s=
tyle=3D"color: #660;" class=3D"styled-by-prettify">&gt;&amp;</span><span st=
yle=3D"color: #000;" class=3D"styled-by-prettify"> rhs</span><span style=3D=
"color: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #=
000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #008;" cla=
ss=3D"styled-by-prettify">const</span><span style=3D"color: #000;" class=3D=
"styled-by-prettify"> </span><span style=3D"color: #660;" class=3D"styled-b=
y-prettify">{</span><span style=3D"color: #000;" class=3D"styled-by-prettif=
y"><br>&nbsp; &nbsp; &nbsp; &nbsp; </span><span style=3D"color: #008;" clas=
s=3D"styled-by-prettify">if</span><span style=3D"color: #000;" class=3D"sty=
led-by-prettify"> </span><span style=3D"color: #660;" class=3D"styled-by-pr=
ettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">p=
tr </span><span style=3D"color: #660;" class=3D"styled-by-prettify">=3D=3D<=
/span><span style=3D"color: #000;" class=3D"styled-by-prettify"> rhs</span>=
<span style=3D"color: #660;" class=3D"styled-by-prettify">.</span><span sty=
le=3D"color: #000;" class=3D"styled-by-prettify">ptr</span><span style=3D"c=
olor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #00=
0;" class=3D"styled-by-prettify"><br>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nb=
sp; </span><span style=3D"color: #008;" class=3D"styled-by-prettify">return=
</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> </span><s=
pan style=3D"color: #008;" class=3D"styled-by-prettify">true</span><span st=
yle=3D"color: #660;" class=3D"styled-by-prettify">;</span><span style=3D"co=
lor: #000;" class=3D"styled-by-prettify"> &nbsp; &nbsp;</span><span style=
=3D"color: #800;" class=3D"styled-by-prettify">// includes the case that bo=
th are nullptr</span><span style=3D"color: #000;" class=3D"styled-by-pretti=
fy"><br>&nbsp; &nbsp; &nbsp; &nbsp; </span><span style=3D"color: #008;" cla=
ss=3D"styled-by-prettify">if</span><span style=3D"color: #000;" class=3D"st=
yled-by-prettify"> </span><span style=3D"color: #660;" class=3D"styled-by-p=
rettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">=
rhs</span><span style=3D"color: #660;" class=3D"styled-by-prettify">.</span=
><span style=3D"color: #000;" class=3D"styled-by-prettify">ptr </span><span=
 style=3D"color: #660;" class=3D"styled-by-prettify">=3D=3D</span><span sty=
le=3D"color: #000;" class=3D"styled-by-prettify"> </span><span style=3D"col=
or: #008;" class=3D"styled-by-prettify">nullptr</span><span style=3D"color:=
 #000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" c=
lass=3D"styled-by-prettify">&amp;&amp;</span><span style=3D"color: #000;" c=
lass=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=3D"s=
tyled-by-prettify">*</span><span style=3D"color: #000;" class=3D"styled-by-=
prettify">ptr </span><span style=3D"color: #660;" class=3D"styled-by-pretti=
fy">=3D=3D</span><span style=3D"color: #000;" class=3D"styled-by-prettify">=
 </span><span style=3D"color: #066;" class=3D"styled-by-prettify">0</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">)</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"><br>&nbsp; &nbsp; &nbsp; &n=
bsp; &nbsp; &nbsp; </span><span style=3D"color: #008;" class=3D"styled-by-p=
rettify">return</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify"> </span><span style=3D"color: #008;" class=3D"styled-by-prettify">true=
</span><span style=3D"color: #660;" class=3D"styled-by-prettify">;</span><s=
pan style=3D"color: #000;" class=3D"styled-by-prettify"><br>&nbsp; &nbsp; &=
nbsp; &nbsp; </span><span style=3D"color: #008;" class=3D"styled-by-prettif=
y">if</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> </sp=
an><span style=3D"color: #660;" class=3D"styled-by-prettify">(</span><span =
style=3D"color: #000;" class=3D"styled-by-prettify">ptr </span><span style=
=3D"color: #660;" class=3D"styled-by-prettify">=3D=3D</span><span style=3D"=
color: #000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #0=
08;" class=3D"styled-by-prettify">nullptr</span><span style=3D"color: #000;=
" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">&amp;&amp;</span><span style=3D"color: #000;" class=
=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=3D"style=
d-by-prettify">*</span><span style=3D"color: #000;" class=3D"styled-by-pret=
tify">rhs</span><span style=3D"color: #660;" class=3D"styled-by-prettify">.=
</span><span style=3D"color: #000;" class=3D"styled-by-prettify">ptr </span=
><span style=3D"color: #660;" class=3D"styled-by-prettify">=3D=3D</span><sp=
an style=3D"color: #000;" class=3D"styled-by-prettify"> </span><span style=
=3D"color: #066;" class=3D"styled-by-prettify">0</span><span style=3D"color=
: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000;" =
class=3D"styled-by-prettify"><br>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
</span><span style=3D"color: #008;" class=3D"styled-by-prettify">return</sp=
an><span style=3D"color: #000;" class=3D"styled-by-prettify"> </span><span =
style=3D"color: #008;" class=3D"styled-by-prettify">true</span><span style=
=3D"color: #660;" class=3D"styled-by-prettify">;</span><span style=3D"color=
: #000;" class=3D"styled-by-prettify"><br>&nbsp; &nbsp; &nbsp; &nbsp; </spa=
n><span style=3D"color: #008;" class=3D"styled-by-prettify">return</span><s=
pan style=3D"color: #000;" class=3D"styled-by-prettify"> </span><span style=
=3D"color: #008;" class=3D"styled-by-prettify">false</span><span style=3D"c=
olor: #660;" class=3D"styled-by-prettify">;</span><span style=3D"color: #00=
0;" class=3D"styled-by-prettify"><br>&nbsp; &nbsp; </span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">}</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"><br>&nbsp; &nbsp; </span><span style=3D"col=
or: #800;" class=3D"styled-by-prettify">// etc...</span><span style=3D"colo=
r: #000;" class=3D"styled-by-prettify"><br></span><span style=3D"color: #00=
8;" class=3D"styled-by-prettify">private</span><span style=3D"color: #660;"=
 class=3D"styled-by-prettify">:</span><span style=3D"color: #000;" class=3D=
"styled-by-prettify"><br>&nbsp; &nbsp; T</span><span style=3D"color: #660;"=
 class=3D"styled-by-prettify">*</span><span style=3D"color: #000;" class=3D=
"styled-by-prettify"> ptr</span><span style=3D"color: #660;" class=3D"style=
d-by-prettify">;</span><span style=3D"color: #000;" class=3D"styled-by-pret=
tify"><br></span><span style=3D"color: #660;" class=3D"styled-by-prettify">=
}</span><span style=3D"color: #000;" class=3D"styled-by-prettify"><br><br><=
/span><span style=3D"color: #008;" class=3D"styled-by-prettify">auto</span>=
<span style=3D"color: #000;" class=3D"styled-by-prettify"> char_ptr_range</=
span><font color=3D"#008800"><span style=3D"color: #660;" class=3D"styled-b=
y-prettify">(</span><span style=3D"color: #008;" class=3D"styled-by-prettif=
y">char</span><span style=3D"color: #660;" class=3D"styled-by-prettify">*</=
span><span style=3D"color: #000;" class=3D"styled-by-prettify"> p</span><sp=
an style=3D"color: #660;" class=3D"styled-by-prettify">)</span><span style=
=3D"color: #000;" class=3D"styled-by-prettify"> </span><span style=3D"color=
: #660;" class=3D"styled-by-prettify">{</span><span style=3D"color: #000;" =
class=3D"styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"=
styled-by-prettify">return</span><span style=3D"color: #000;" class=3D"styl=
ed-by-prettify"> </span></font><span style=3D"color: #000;" class=3D"styled=
-by-prettify">range</span><span style=3D"color: #660;" class=3D"styled-by-p=
rettify">&lt;</span><span style=3D"color: #000;" class=3D"styled-by-prettif=
y">zero_terminate_iterator</span><span style=3D"color: #080;" class=3D"styl=
ed-by-prettify">&lt;char&gt;</span><span style=3D"color: #660;" class=3D"st=
yled-by-prettify">&gt;(</span><span style=3D"color: #000;" class=3D"styled-=
by-prettify">zero_terminate_iterator</span><span style=3D"color: #080;" cla=
ss=3D"styled-by-prettify">&lt;char&gt;</span><span style=3D"color: #660;" c=
lass=3D"styled-by-prettify">(</span><span style=3D"color: #000;" class=3D"s=
tyled-by-prettify">p</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">),</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
"> </span><span style=3D"color: #660;" class=3D"styled-by-prettify">&lt;</s=
pan><span style=3D"color: #000;" class=3D"styled-by-prettify">zero_terminat=
e_iterator</span><span style=3D"color: #080;" class=3D"styled-by-prettify">=
&lt;char&gt;</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">());</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> </s=
pan><span style=3D"color: #660;" class=3D"styled-by-prettify">}</span><span=
 style=3D"color: #000;" class=3D"styled-by-prettify"> &nbsp;</span><span st=
yle=3D"color: #800;" class=3D"styled-by-prettify">// Uses C++14 function re=
turn type deduction.</span><span style=3D"color: #000;" class=3D"styled-by-=
prettify"><br><br></span><span style=3D"color: #800;" class=3D"styled-by-pr=
ettify">//Now you can write:</span><span style=3D"color: #000;" class=3D"st=
yled-by-prettify"><br></span><span style=3D"color: #008;" class=3D"styled-b=
y-prettify">char</span><span style=3D"color: #660;" class=3D"styled-by-pret=
tify">*</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> p =
</span><span style=3D"color: #660;" class=3D"styled-by-prettify">=3D</span>=
<span style=3D"color: #000;" class=3D"styled-by-prettify"> </span><span sty=
le=3D"color: #080;" class=3D"styled-by-prettify">"1234"</span><span style=
=3D"color: #660;" class=3D"styled-by-prettify">;</span><span style=3D"color=
: #000;" class=3D"styled-by-prettify"><br></span><span style=3D"color: #008=
;" class=3D"styled-by-prettify">for</span><span style=3D"color: #660;" clas=
s=3D"styled-by-prettify">(</span><span style=3D"color: #008;" class=3D"styl=
ed-by-prettify">auto</span><span style=3D"color: #000;" class=3D"styled-by-=
prettify"> c </span><span style=3D"color: #660;" class=3D"styled-by-prettif=
y">:</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> char_=
ptr_range</span><span style=3D"color: #660;" class=3D"styled-by-prettify">(=
</span><span style=3D"color: #000;" class=3D"styled-by-prettify">p</span><s=
pan style=3D"color: #660;" class=3D"styled-by-prettify">))</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"><br>&nbsp; &nbsp; process_c=
har</span><span style=3D"color: #660;" class=3D"styled-by-prettify">(</span=
><span style=3D"color: #000;" class=3D"styled-by-prettify">c</span><span st=
yle=3D"color: #660;" class=3D"styled-by-prettify">);</span><span style=3D"c=
olor: #000;" class=3D"styled-by-prettify"><br><br></span><span style=3D"col=
or: #800;" class=3D"styled-by-prettify">// and you can write</span><span st=
yle=3D"color: #000;" class=3D"styled-by-prettify"><br></span><span style=3D=
"color: #008;" class=3D"styled-by-prettify">auto</span><span style=3D"color=
: #000;" class=3D"styled-by-prettify"> x </span><span style=3D"color: #660;=
" class=3D"styled-by-prettify">=3D</span><span style=3D"color: #000;" class=
=3D"styled-by-prettify"> parse</span><span style=3D"color: #080;" class=3D"=
styled-by-prettify">&lt;</span><font color=3D"#000000"><span style=3D"color=
: #080;" class=3D"styled-by-prettify">int&gt;</span><span style=3D"color: #=
660;" class=3D"styled-by-prettify">(</span></font><span style=3D"color: #00=
0;" class=3D"styled-by-prettify">char_ptr_range</span><span style=3D"color:=
 #660;" class=3D"styled-by-prettify">(</span><span style=3D"color: #000;" c=
lass=3D"styled-by-prettify">p</span><span style=3D"color: #660;" class=3D"s=
tyled-by-prettify">));</span><span style=3D"color: #000;" class=3D"styled-b=
y-prettify"> &nbsp;</span><span style=3D"color: #800;" class=3D"styled-by-p=
rettify">// or whatever API we end up with.</span><span style=3D"color: #00=
0;" class=3D"styled-by-prettify"><br><br></span></div></code></div><div><br=
></div><div>Note that as we are aiming for an extensible set of conversions =
including user defined types, say WGS84 geospatial coordinates, there is al=
so an open set of error codes, so an enum or int value is not enough. (The =
from_string may be wrapped in a template function which can't be expected t=
o know the interpretation of an int error code for any T it may be instanti=
ated for!)</div><div><br></div><div>Now I want to check some common use case=
s for the solution with a return triplet. To complete the use cases we can =
also add a fourth member skipped which is true if we had to skip spaces. I =
think that the smartest way to solve this may be to provide a set of value(=
) functions, but no cast operators:</div><div><br></div><div>Tentatively I =
call the "states" of the return value:</div><div><br></div><div>strict - no=
 spaces skipped. All of the string could be converted.</div><div>complete -=
 space skipping ok, but no trailing junk.</div><div>ok -&nbsp;<span style=
=3D"font-size: 13px;">space skipping ok, and trailing junk.</span></div><di=
v>bad - no parsing was possible, even after skipping spaces.</div><div><br>=
</div><div>// The actual parsing is always the same:</div><div>const auto r=
 =3D parse&lt;int&gt;(<span style=3D"background-color: rgb(250, 250, 250); =
color: rgb(0, 0, 0); font-family: monospace; font-size: 13px;">char_ptr_ran=
ge(</span><span style=3D"font-size: 13px;">"123"));</span></div><div><span =
style=3D"font-size: 13px;"><br></span></div><div>int x =3D r.strict_value()=
; // throws if r is not strict.</div><div>int y =3D r.complete_value(); // t=
hrows if r is not complete</div><div>int z =3D r.value(); // throws if r is =
not ok</div><div>int w =3D r.value_or(17); // never throws</div><div><br></=
div><div>r would also have is_complete(), is_strict(), is_ok() for those wh=
o want to handle errors without throwing.</div><div>An iterator to the next=
 char can be returned by r.next()</div><div>The error code can be returned =
by r.error().</div><div><br></div><div>Something like that, what do you thi=
nk about that?</div><div><br></div><div>We need to take a look at the parse=
 use case. For instance to parse a comma separated list of ints stored as a=
 nul terminated string:</div><div><br></div><div>const char* p =3D "123, 45=
6 ,789";&nbsp;<br></div><div><span class=3D"styled-by-prettify" style=3D"fo=
nt-family: monospace; background-color: rgb(250, 250, 250); color: rgb(0, 0=
, 0);">auto range =3D char_ptr_range</span><font color=3D"#008800" style=3D=
"font-family: monospace; background-color: rgb(250, 250, 250);"><span class=
=3D"styled-by-prettify" style=3D"color: rgb(102, 102, 0);">(</span><span cl=
ass=3D"styled-by-prettify" style=3D"color: rgb(0, 0, 0);">p</span><span cla=
ss=3D"styled-by-prettify" style=3D"color: rgb(102, 102, 0);">);</span></fon=
t></div><div><font color=3D"#008800" style=3D"font-family: monospace; backg=
round-color: rgb(250, 250, 250);"><span class=3D"styled-by-prettify" style=
=3D"color: rgb(102, 102, 0);">vector&lt;int&gt; numbers;</span></font></div=
><div><font color=3D"#008800" style=3D"font-family: monospace; background-c=
olor: rgb(250, 250, 250);"><span class=3D"styled-by-prettify" style=3D"colo=
r: rgb(102, 102, 0);"><br></span></font></div><div><font color=3D"#008800" =
style=3D"font-family: monospace; background-color: rgb(250, 250, 250);"><sp=
an class=3D"styled-by-prettify" style=3D"color: rgb(102, 102, 0);">while (!=
range.empty()) {</span></font></div><div><font color=3D"#008800" style=3D"f=
ont-family: monospace; background-color: rgb(250, 250, 250);"><span class=
=3D"styled-by-prettify" style=3D"color: rgb(102, 102, 0);">&nbsp; &nbsp; co=
nst auto r =3D from_string&lt;int&gt;(range);</span></font></div><div><font=
 color=3D"#008800" style=3D"font-family: monospace; background-color: rgb(2=
50, 250, 250);"><span class=3D"styled-by-prettify" style=3D"color: rgb(102,=
 102, 0);">&nbsp; &nbsp; numbers.emplace_back(r.value()); &nbsp; // require=
s at least one parsable char</span></font></div><div><font color=3D"#008800=
" style=3D"font-family: monospace; background-color: rgb(250, 250, 250);"><=
span class=3D"styled-by-prettify" style=3D"color: rgb(102, 102, 0);">&nbsp;=
 &nbsp; range.first =3D r.last();</span></font></div><div><font color=3D"#0=
08800" style=3D"font-family: monospace; background-color: rgb(250, 250, 250=
);"><span class=3D"styled-by-prettify" style=3D"color: rgb(102, 102, 0);">&=
nbsp; &nbsp; skipspace(range); &nbsp; &nbsp; // Assume we have a skip funct=
ion which works on the range in situ</span></font></div><div><font color=3D=
"#008800" style=3D"font-family: monospace; background-color: rgb(250, 250, =
250);"><span class=3D"styled-by-prettify" style=3D"color: rgb(102, 102, 0);=
">&nbsp; &nbsp; if (*range.first !=3D ',')</span></font></div><div><font co=
lor=3D"#008800" style=3D"font-family: monospace; background-color: rgb(250,=
 250, 250);"><span class=3D"styled-by-prettify" style=3D"color: rgb(102, 10=
2, 0);">&nbsp; &nbsp; &nbsp; &nbsp;throw "no comma";</span></font></div><di=
v><font color=3D"#008800" style=3D"font-family: monospace; background-color=
: rgb(250, 250, 250);"><span class=3D"styled-by-prettify" style=3D"color: r=
gb(102, 102, 0);">&nbsp; &nbsp; range.first++;</span></font></div><div><fon=
t color=3D"#008800" style=3D"font-family: monospace; background-color: rgb(=
250, 250, 250);"><span class=3D"styled-by-prettify" style=3D"color: rgb(102=
, 102, 0);">}</span></font></div><div><font color=3D"#008800" style=3D"font=
-family: monospace; background-color: rgb(250, 250, 250);"><span class=3D"s=
tyled-by-prettify" style=3D"color: rgb(102, 102, 0);"><br></span></font></d=
iv><div>This assumes that range&lt;T&gt; has an empty method and that the s=
tarting iterator is called first.</div><div><br></div><div>The only main wa=
rt would be that we must manually update the range's first member. It would=
 be possible to store the end iterator to be able to return a RANGE.</div><=
div><br></div><div>Then I could throw in a must_be(RANGE, token) helper whi=
ch performs the last three rows, for a new appearance like this:</div><div>=
<br></div><div>const char* p =3D "123, 456 ,789";&nbsp;</div><div><div><spa=
n class=3D"styled-by-prettify" style=3D"font-family: monospace; background-=
color: rgb(250, 250, 250); color: rgb(0, 0, 0);">auto range =3D char_ptr_ra=
nge</span><font color=3D"#008800" style=3D"font-family: monospace; backgrou=
nd-color: rgb(250, 250, 250);"><span class=3D"styled-by-prettify" style=3D"=
color: rgb(102, 102, 0);">(</span><span class=3D"styled-by-prettify" style=
=3D"color: rgb(0, 0, 0);">p</span><span class=3D"styled-by-prettify" style=
=3D"color: rgb(102, 102, 0);">);</span></font></div><div><font color=3D"#00=
8800" style=3D"font-family: monospace; background-color: rgb(250, 250, 250)=
;"><span class=3D"styled-by-prettify" style=3D"color: rgb(102, 102, 0);">ve=
ctor&lt;int&gt; numbers;</span></font></div><div><font color=3D"#008800" st=
yle=3D"font-family: monospace; background-color: rgb(250, 250, 250);"><span=
 class=3D"styled-by-prettify" style=3D"color: rgb(102, 102, 0);"><br></span=
></font></div><div><font color=3D"#008800" style=3D"font-family: monospace;=
 background-color: rgb(250, 250, 250);"><span class=3D"styled-by-prettify" =
style=3D"color: rgb(102, 102, 0);">while (!range.empty()) {</span></font></=
div><div><font color=3D"#008800" style=3D"font-family: monospace; backgroun=
d-color: rgb(250, 250, 250);"><span class=3D"styled-by-prettify" style=3D"c=
olor: rgb(102, 102, 0);">&nbsp; &nbsp; const auto r =3D from_string&lt;int&=
gt;(range);</span></font></div><div><font color=3D"#008800" style=3D"font-f=
amily: monospace; background-color: rgb(250, 250, 250);"><span class=3D"sty=
led-by-prettify" style=3D"color: rgb(102, 102, 0);">&nbsp; &nbsp; numbers.e=
mplace_back(r.value()); &nbsp; // requires at least one parsable char, but =
allows leading white space</span></font></div><div><font color=3D"#008800" =
style=3D"font-family: monospace; background-color: rgb(250, 250, 250);"><sp=
an class=3D"styled-by-prettify" style=3D"color: rgb(102, 102, 0);">&nbsp; &=
nbsp; range =3D r.rest();</span></font></div><div><font color=3D"#008800" s=
tyle=3D"font-family: monospace; background-color: rgb(250, 250, 250);"><spa=
n class=3D"styled-by-prettify" style=3D"color: rgb(102, 102, 0);">&nbsp; &n=
bsp; skipspace(range); &nbsp; &nbsp; // skip spaces between number and comm=
a</span></font></div><div><font color=3D"#008800" style=3D"font-family: mon=
ospace; background-color: rgb(250, 250, 250);"><span class=3D"styled-by-pre=
ttify" style=3D"color: rgb(102, 102, 0);">&nbsp; &nbsp; must_be(range, ',')=
;</span></font></div><div><font color=3D"#008800" style=3D"font-family: mon=
ospace; background-color: rgb(250, 250, 250);"><span class=3D"styled-by-pre=
ttify" style=3D"color: rgb(102, 102, 0);">}</span></font></div></div><div><=
font color=3D"#008800" style=3D"font-family: monospace; background-color: r=
gb(250, 250, 250);"><span class=3D"styled-by-prettify" style=3D"color: rgb(=
102, 102, 0);"><br></span></font></div><div><font color=3D"#008800" style=
=3D"font-family: monospace; background-color: rgb(250, 250, 250);"><span cl=
ass=3D"styled-by-prettify" style=3D"color: rgb(102, 102, 0);">With a lazy s=
plit this could also be written, without losing performance:</span></font>=
</div><div><font color=3D"#008800" style=3D"font-family: monospace; backgro=
und-color: rgb(250, 250, 250);"><span class=3D"styled-by-prettify" style=3D=
"color: rgb(102, 102, 0);"><br></span></font></div><div><font color=3D"#008=
800" style=3D"font-family: monospace; background-color: rgb(250, 250, 250);=
"><span class=3D"styled-by-prettify" style=3D"color: rgb(102, 102, 0);">for=
 (auto str : lazy_split(</span></font><span class=3D"styled-by-prettify" st=
yle=3D"font-size: 13px; font-family: monospace; background-color: rgb(250, =
250, 250); color: rgb(0, 0, 0);">char_ptr_range</span><font color=3D"#00880=
0" style=3D"font-size: 13px; font-family: monospace; background-color: rgb(=
250, 250, 250);"><span class=3D"styled-by-prettify" style=3D"color: rgb(102=
, 102, 0);">(</span><span class=3D"styled-by-prettify" style=3D"color: rgb(=
0, 0, 0);">p</span><span class=3D"styled-by-prettify" style=3D"color: rgb(1=
02, 102, 0);">), ',')) &nbsp;// str is probably a string_view now, but we r=
eally don't need to know that.</span></font></div><div><font color=3D"#0088=
00" style=3D"font-size: 13px; font-family: monospace; background-color: rgb=
(250, 250, 250);"><span class=3D"styled-by-prettify" style=3D"color: rgb(10=
2, 102, 0);">&nbsp; &nbsp; numbers.emplace_back(</span></font><span style=
=3D"background-color: rgb(250, 250, 250); color: rgb(102, 102, 0); font-fam=
ily: monospace; font-size: 13px;">from_string&lt;int&gt;(str).complete_valu=
e());</span></div><div><span style=3D"background-color: rgb(250, 250, 250);=
 color: rgb(102, 102, 0); font-family: monospace; font-size: 13px;"><br></s=
pan></div><div><span style=3D"background-color: rgb(250, 250, 250); color: =
rgb(102, 102, 0); font-family: monospace; font-size: 13px;">No, not really =
the same, now we don't allow space between number and comma. But we can int=
roduce a new level which allows leading and trailing space. However, this c=
an't be a const method of 'r' or we&nbsp;</span><span style=3D"background-c=
olor: rgb(250, 250, 250); color: rgb(102, 102, 0); font-family: monospace; =
font-size: 13px;">can't defer skipping of trailing spaces until we ask whet=
her there are any. BTW this mandates the entire range to be stored in 'r' o=
r the skipping could run wild...</span></div><div><span style=3D"background=
-color: rgb(250, 250, 250); color: rgb(102, 102, 0); font-family: monospace=
; font-size: 13px;"><br></span></div><div><span style=3D"background-color: =
rgb(250, 250, 250); color: rgb(102, 102, 0); font-family: monospace; font-s=
ize: 13px;">Note that decltype(r) could be dependent on the type being conv=
erted (and of course the RANGE type). It is a concept rather than a class. =
This means, I guess, that the return type of error() could differ depending=
 on T, but using that feature of course limits usability in template code.<=
/span></div><div><br></div><div><br></div><div><br></div><div><br></div><di=
v>Den tisdagen den 4:e februari 2014 kl. 01:20:03 UTC+1 skrev Matthew Woehl=
ke:<blockquote class=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8ex;=
border-left: 1px #ccc solid;padding-left: 1ex;">On 2014-02-03 18:02, Olaf v=
an der Spek wrote:
<br>&gt; On Mon, Feb 3, 2014 at 11:01 PM, Matthew Woehlke
<br>&gt; &lt;<a href=3D"javascript:" target=3D"_blank" gdf-obfuscated-mailt=
o=3D"BV6J-bkBAb4J" onmousedown=3D"this.href=3D'javascript:';return true;" o=
nclick=3D"this.href=3D'javascript:';return true;">mw_t...@users.sourceforge=
..<wbr>net</a>&gt; wrote:
<br>&gt;&gt; If you do care about more than exactly one of the three possib=
le output
<br>&gt;&gt; information parts, I don't see any way to avoid having at leas=
t one local
<br>&gt;&gt; variable. So what is wrong with:
<br>&gt;
<br>&gt; user_t&amp; u =3D ...
<br>&gt; if (parse(u.age, input)) // or !parse, depending on return type
<br>&gt; &nbsp; &nbsp;return / throw
<br>&gt;
<br>&gt; No local var required, no type duplication
<br>
<br>Okay. This is the first decent argument I've seen in favor of using an=
=20
<br>out param. However I still think the out param is sub-optimal except in=
=20
<br>this case (which I suspect is not the more common case).
<br>
<br>Maybe we should just provide both...
<br>
<br>--=20
<br>Matthew
<br>
<br></blockquote></div></div>

<p></p>


------=_Part_1839_14251007.1391479032744--

.


Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Tue, 04 Feb 2014 13:39:54 -0500
Raw View
On 2014-02-03 17:35, gmisocpp@gmail.com wrote:
> Hi everyone
>
> Thanks for the interesting read. My ten cents on this subject:
>
> It seems parsing/conversion means looking for numbers/values which
> sometimes might not be present, and even if they are, they may be
> expectedly or unexpectedly followed by other stuff; and any value that
> is present may still be outside of the expected range. Theoretically,
> something might even happen that's so unexpected that we might not even
> parse anything even to be sure if a value is present or correct at all.
>
> Input from a file, command line or configuration file is often like that:
>
> "100"      - a value
> "100;"     - a value followed by something/anything here it's a ;
> "hello"    - no value, but something e.g. the "h" from "hello"
> "257"      - a failed value (for a given type), e.g. this is too big for
>              an 8 bit unsigned char
> "-1;2"     - failed value (say for unsigned int) followed by something,
>              in this case the ";" from ";2"
> ""         - nothing  - eof / empty range passed in.
>
> Considering all of this, it suggests the set of all possibilities might be
> this (represented here as an enum):
>
> enum parse_status // Ultimate outcome of converting a string to a type.
> {
>      got_value,                      // success, a value, nothing more
>      got_something,                  // something, but not a value,
>                                      // probable failure
>      got_value_and_something,        // got a value and something else.
>                                      // Success or failure likely
>                                      // determined by caller.
>      got_failed_value,               // got an unusable value, out of
>                                      // range or whatever.
>      got_failed_value_and_something, // got an unusable value and
>                                      // something else too.
>      got_nothing,                    // nothing, empty input range etc.
>      got_error                       // worse than anything above.
>                                      // Invalid argument etc.
>      // anything else?
> };

This seems like overkill. Either the text is well-formed or it isn't.
I'd say that should be the first check. (Probably considering "-1" for
unsigned as 'not well formed'.)

If the text is well-formed, then and only then would I get into other
reasons the parse might have failed, e.g. because the value would overflow.

This corresponds loosely with the output iterator and whether or not all
possible text was consumed.

Users that really care about the empty input case can check that
themselves easily enough.

> enum conversion_options // bit mask, probably can be simplified, but
> conveys the idea
> {
>      allow_value,
>      allow_value_and_something,
>      allow_failed_value,
>      allow_failed_value_and_something,
>      allow_errors,
>      allow_nothing
> };

While I like the idea, I don't think this set of options is all that
useful. Instead I would suggest:

enum class option // magic syntax to make bits? ;-)
{
   accept_whitespace,
   accept_trailing_characters,
   accept_overflow,
   // others?
};
STD_FLAGS(options, option) // ;-)

The accept_overflow flag would only be honored when the numeric type is
real (float, double, etc.; "numeric type" here also covers e.g. a
std::complex version, if we had one). The effect would be to return ±inf if
overflow occurs.
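
A minimal sketch of how such flags could be spelled today with a scoped
enum and hand-written operators. STD_FLAGS above is wishful syntax, and
every name below (including the helper has()) is hypothetical:

```cpp
#include <cstdint>

// Hypothetical option bits for a parse function; one bit per option.
enum class option : std::uint8_t {
    none                       = 0,        // strict parsing, same as option{}
    accept_whitespace          = 1u << 0,
    accept_trailing_characters = 1u << 1,
    accept_overflow            = 1u << 2,  // only meaningful for real types
};

// Combine two option sets.
constexpr option operator|(option a, option b) {
    return static_cast<option>(static_cast<std::uint8_t>(a) |
                               static_cast<std::uint8_t>(b));
}

// True if `flag` is present in `set`.
constexpr bool has(option set, option flag) {
    return (static_cast<std::uint8_t>(set) &
            static_cast<std::uint8_t>(flag)) != 0;
}
```

With this, a value-initialized option{} has no bits set, which matches
the strict default discussed below.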

(Vaguely related note: the real flavors should also accept e.g. "-inf"
and related NaN forms.)

> By default checking is strict, it only allows an exact value and nothing
> more without an exception.

...which should/would be the behavior of 'options opts = {}'.

> Accepting a value followed by something else or a value or nothing or an
> out of range value can be done through the options.

I don't think an empty string should ever parse successfully... what
would be the resulting value? I suspect anyone inclined to use that
feature would do better to use a default-value on any invalid input.

> * Are we really saying C can't even do this task sufficiently well? Kind of
> sad! Won't "they" revisit this, and won't we get that later too?

Inasmuch as we'd like to use iterators and string_view and such, I think
that might be hard :-). Maybe C will implement new APIs based on these,
resulting in C++ standards being adopted into C for a change :-).

> * should .00 be accepted as a float of 0.00?

IMHO yes; it's valid in source code after all. (What does strtod do?)
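
For what it's worth, the C grammar for strtod lets the digit sequence
start with the decimal point, so ".00" should be accepted, while a lone
"." has no digits and should not. A quick way to check, using a
hypothetical helper name:

```cpp
#include <cstdlib>

// Returns true if std::strtod converted something and consumed all of `s`.
// Note: strtod itself skips leading whitespace and is locale dependent.
bool strtod_consumes_all(const char* s, double& out) {
    char* end = nullptr;
    out = std::strtod(s, &end);
    return end != s && *end == '\0';
}
```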

> * Locales: is 1,100 a 1 followed by something (a comma), or 1100 in some
> locale?

I would implement two versions: one locale-aware and one not (equivalent
to using the "C" locale). The answer then depends on the locale; if ","
is the group separator, then "1,100" is equivalent to "1100".

> Can a user override that to pick either interpretation.

Yes, by passing a specific locale to the locale-aware version. (Note
that in some locales, "1,100" == 11e-1.)

> Design choices I think make sense:
> * Don't skip leading whitespace as skipping tabs, lfs, crs etc. can be
> surprising, unwanted and slow, and can be explicitly done easily.

Yes, if accept_whitespace is not used.

> * Return value should aid in diagnosing where to continue to parse next.

I'm guessing you're talking about providing the position where parsing
"quit"?

> * Use overloaded names hopefully organised for more usability in generic
> code and more memorable than guessing is it strtol/strtoul etc.

Speaking of "why can't C do it first"... :-)

> * Leaves Hex/octal conversion operations to other interfaces so as to
> reduce interface complexity.

Not sure about this. Which reminds me, your proposed APIs don't take a
base parameter. IMO that is mandatory; there *will* be users that need
to parse a number in e.g. base 16. (Parsing bases other than 2, 8, 10
and 16 is probably unusual, however.)
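
Existing practice, for comparison: strtol already takes the base as a
parameter. A small sketch with a hypothetical wrapper, parse_long, that
insists on consuming the whole string (strtol itself still skips leading
whitespace, so this is not a strict parse):

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: parse all of `s` as a long in the given base.
// Returns false on no digits, trailing characters, or overflow.
bool parse_long(const char* s, int base, long& out) {
    char* end = nullptr;
    errno = 0;
    const long v = std::strtol(s, &end, base);
    if (end == s || *end != '\0' || errno == ERANGE)
        return false;
    out = v;
    return true;
}
```

For example, parse_long("ff", 16, v) yields 255, while
parse_long("12", 2, v) fails because "2" is not a binary digit.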

--
Matthew


.


Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Tue, 04 Feb 2014 17:08:31 -0500
Raw View
On 2014-02-03 20:57, Bengt Gustafsson wrote:
> Tentatively I call the "states" of the return value:
>
> strict - no spaces skipped. All of the string could be converted.
> complete - space skipping ok, but no trailing junk.
> ok - space skipping ok, and trailing junk.
> bad - no parsing was possible, even after skipping spaces.
>
> // The actual parsing is always the same:
> const auto r = parse<int>(char_ptr_range("123"));
>
> int x = r.strict_value(); // throws if r is not strict.
> int y = r.complete_value(); // throws if r is not complete
> int z = r.value(); // throws if r is not ok
> int w = r.value_or(17); // never throws

Unfortunately, that makes it much more awkward to say 'use the parse
value if a strict parse succeeded, else use a default value'. Sure, I
can write 'r.is_strict() ? r.value() : default', but that's much more
awkward than if I could always write 'r.value_or(default)'.

I still think it makes more sense to tell the parser up front how
tolerant it should be. Having said that, I could possibly see still
using your 'strict' and 'complete' terminology. I would probably then
name the third option something like 'relaxed'.

> An iterator to the next char can be returned by r.next()
> The error code can be returned by r.error().

Except for the above comments, I like.

Another reason to move the strict/relaxed/etc. to input parameters is
that it would allow the return type to more trivially subclass
std::optional. Basically, you would be adding error() and next() to
std::optional, without having to redefine/overload value() and related bits.
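
To make that concrete, here is a rough sketch under several assumptions:
C++17 (std::optional, std::string_view, std::from_chars), integer types
only, and made-up names (mode, parse_result, string_to, skip_spaces).
next() returns an offset rather than an iterator to keep it short, and
'complete' is assumed to eat trailing spaces as well. It is not a
proposal for the final API:

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>

enum class mode { strict, complete, relaxed };  // hypothetical names

// Adds error() and next() on top of std::optional; value(), value_or()
// and operator bool are inherited unchanged.
template <typename T>
struct parse_result : std::optional<T> {
    std::errc   err{};    // std::errc{} means "no error"
    std::size_t pos = 0;  // offset of the first unconsumed character

    std::errc   error() const { return err; }
    std::size_t next()  const { return pos; }
};

// Skips spaces and tabs only; deliberately not locale dependent.
inline std::size_t skip_spaces(std::string_view s, std::size_t i) {
    while (i < s.size() && (s[i] == ' ' || s[i] == '\t')) ++i;
    return i;
}

// The tolerance is chosen up front, so value_or() always does the
// right thing for the requested mode.
template <typename T>
parse_result<T> string_to(std::string_view s, mode m = mode::strict) {
    parse_result<T> r;
    const std::size_t first = (m == mode::strict) ? 0 : skip_spaces(s, 0);
    T value{};
    const auto [ptr, ec] =
        std::from_chars(s.data() + first, s.data() + s.size(), value);
    r.pos = static_cast<std::size_t>(ptr - s.data());
    if (ec != std::errc{}) {  // no digits, or value out of range
        r.err = ec;
        return r;
    }
    if (m == mode::complete)  // assumption: complete eats trailing spaces
        r.pos = skip_spaces(s, r.pos);
    if (m != mode::relaxed && r.pos != s.size()) {  // trailing junk
        r.err = std::errc::invalid_argument;
        return r;
    }
    r.emplace(value);
    return r;
}
```

With this shape, 'string_to<int>(text).value_or(17)' means "strict parse
or default" without any extra test at the call site.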

> We need to take a look at the parse use case. For instance to parse a comma
> separated list of ints stored as a nul terminated string:
> [example snipped]

With my above suggestions, I believe the code in your example is the
same except that you would also pass 'relaxed' to from_string.

(Loosely related: I realize we haven't been trying to pin down the name
of the function itself, but I would prefer something more like
'string_to', so that the name plus template type makes a natural phrase.)

> The only main wart would be that we must manually update the range's first
> member. It would be possible to store the end iterator to be able to return
> a RANGE.

Agreed, we should probably (also?) have that. (I first thought to call
it "remaining" / "remainder", but "rest" is okay too.)

>      must_be(range, ',');

Not really relevant, but I would name this something else to indicate
that it modifies the range... I usually lean toward "consume" or "chomp".
(You might also consider returning the modified range rather than
modifying it in-place, just for the sake of clarity.)

> With a lazy split this could also be written, without losing performance:
>
> for (auto str : lazy_split(char_ptr_range(p), ','))  // str is probably a
> string_view now, but we really don't need to know that.
>      numbers.emplace_back(from_string<int>(str).complete_value());
>
> No, not really the same, now we don't allow space between number and comma.

I would argue that 'complete' should also eat trailing space :-). (This
is another reason to pass the mode as an input parameter; in relaxed
mode we don't care what follows the number, but in complete mode spaces
should be consumed so that we can tag the parse as successful. This
wouldn't need to be a different mode.)

> Note that decltype(r) could be dependent on the type being converted (and
> of course the RANGE type). It is a concept rather than a class. This means,
> I guess, that the return type of error() could differ depending on T, but
> using that feature of course limits usability in template code.

Um. That could work. Or you could specify that it returns int and that
the allowed values depend on the value type. (I'm not convinced that's
an issue... it's still extensible; any one type can use millions+ of
possible results, and the values can be declared to be dependent on the
value type, i.e. 5 for parsing an int may mean something different than
5 for parsing a user_t. Reasonable implementations would stick to some
common subset of well-known values. Template code, regardless, must also
stick to that subset or else be conditional on the value type.)

You might be able to do both; have error() return an int but allow
specializations for user types to return a subclass / specialization of
the "common" result type that adds additional members.

--
Matthew

--

---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.

.


Author: Bengt Gustafsson <bengt.gustafsson@beamways.com>
Date: Tue, 4 Feb 2014 15:30:39 -0800 (PST)
Raw View
------=_Part_5985_26026723.1391556639942
Content-Type: text/plain; charset=UTF-8

I think you are right, Matthew. The most logical choice is probably to send
the parsing options in as flags. The big bike shed regards their names,
polarities and defaults of course...

Here's a sketch:


// expect holds a value or an exception_ptr. I think this is basically
// the same as boost::expect, which in turn was inspired by Andrei
// Alexandrescu's idea.
// All ways of getting at the value except value_or() throw any
// pending exception inside.
// This should be refined as optional to avoid relying on a default
// ctor in the error case.
template <typename T> class expect {
public:
    expect(exception_ptr ex) : m_exception(ex) {}
    expect(const T& val) : m_value(val) {}

    operator bool() { return !m_exception; }
    exception_ptr exception() { return m_exception; }

    // A specialization removes this for bool (or follow
    // optional<bool>'s example).
    operator T() { return value(); }
    T value() {
        if (m_exception)
            rethrow_exception(m_exception);

        return m_value;
    }
    T value_or(const T& defval) {
        if (m_exception)
            return defval;

        return m_value;
    }

private:
    T m_value;
    exception_ptr m_exception;
};


enum StrToFlags {
    noleading = 1,
    notrailing = 2,
    complete = 4,
    strict = 7
};


// Maybe better to get flags as a template parameter? Or offer both
// versions? Not having to test flags on each call saves time and they
// are going to be fixed per call site 99% of the time.
// RANGE&& lets the function accept both lvalue ranges and temporaries.
template<typename T, typename RANGE>
expect<T> str_to(RANGE&& range, StrToFlags flags)
{
    if (flags & noleading) {
        // Check for an empty range before dereferencing begin().
        if (begin(range) != end(range) && isspace(*begin(range)))
            return make_exception_ptr(
                runtime_error("No leading space allowed"));
    }
    else
        skipspace(range);

    T tmp;
    // ... Do the conversion into tmp and return on errors ...

    if (flags & notrailing) {
        if (begin(range) != end(range) && isspace(*begin(range)))
            return make_exception_ptr(
                runtime_error("No trailing space allowed"));
    }
    else
        skipspace(range);

    if ((flags & complete) && begin(range) != end(range))
        return make_exception_ptr(runtime_error("Junk after value"));

    return tmp;
}


// Use cases

// Throw on any error:
int val = str_to<int>(char_ptr_range("123"), strict);

// Accept any error
int v2 = str_to<int>(char_ptr_range("123")).value_or(17);

// Parse comma separated using lazy split. Allow leading and trailing
// space, but nothing else trailing except the comma:
for (auto str : lazy_split(char_ptr_range("123, 234 ,345,34"), ','))
    numbers.push_back(str_to<int>(str, complete));

// Use templated flag version to save time (same rules as above):
for (auto str : lazy_split(char_ptr_range("123, 234 ,345,34"), ','))
    numbers.push_back(str_to<int, complete>(str));
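
For anyone who wants to try the flag-based design, here is a cut-down version
that compiles. It is a sketch under assumptions, not the proposed interface:
it is fixed to int (expect_int, str_to_int), char_range is a hypothetical
stand-in for whatever char_ptr_range returns, whitespace means ' ' or '\t'
only, and the range is taken by value so the remaining input is not reported
back to the caller.

```cpp
#include <cassert>
#include <climits>
#include <cstring>
#include <exception>
#include <stdexcept>

// A cut-down expect<int>: holds either a value or a pending exception.
class expect_int {
public:
    expect_int(int v) : m_value(v) {}
    expect_int(std::exception_ptr e) : m_value(0), m_exception(e) {}

    explicit operator bool() const { return !m_exception; }
    int value() const {
        if (m_exception) std::rethrow_exception(m_exception);
        return m_value;
    }
    int value_or(int fallback) const { return m_exception ? fallback : m_value; }

private:
    int m_value;
    std::exception_ptr m_exception;
};

enum str_to_flags : unsigned {
    none = 0, noleading = 1, notrailing = 2, complete = 4,
    strict = noleading | notrailing | complete
};

struct char_range { const char* first; const char* last; };  // hypothetical

inline char_range char_ptr_range(const char* s) {
    return char_range{s, s + std::strlen(s)};
}
inline bool is_blank(char c) { return c == ' ' || c == '\t'; }
inline void skipspace(char_range& r) {
    while (r.first != r.last && is_blank(*r.first)) ++r.first;
}
inline expect_int fail(const char* what) {
    return expect_int(std::make_exception_ptr(std::runtime_error(what)));
}

// Run-time flag version; non-negative decimal ints only.
inline expect_int str_to_int(char_range r, unsigned flags = none) {
    if (flags & noleading) {
        if (r.first != r.last && is_blank(*r.first))
            return fail("No leading space allowed");
    } else {
        skipspace(r);
    }

    if (r.first == r.last || *r.first < '0' || *r.first > '9')
        return fail("No digits found");
    int tmp = 0;
    for (; r.first != r.last && *r.first >= '0' && *r.first <= '9'; ++r.first) {
        const int digit = *r.first - '0';
        if (tmp > (INT_MAX - digit) / 10) return fail("Out of range");
        tmp = tmp * 10 + digit;
    }

    if (flags & notrailing) {
        if (r.first != r.last && is_blank(*r.first))
            return fail("No trailing space allowed");
    } else {
        skipspace(r);
    }

    if ((flags & complete) && r.first != r.last)
        return fail("Junk after value");
    return expect_int(tmp);
}

// Compile-time flag version: the flags are fixed per call site.
template <unsigned Flags>
expect_int str_to_int(char_range r) { return str_to_int(r, Flags); }

inline void usage_example() {
    assert(str_to_int(char_ptr_range("123"), strict).value() == 123);
    assert(str_to_int(char_ptr_range("12 x")).value_or(17) == 12);
    assert(str_to_int<complete>(char_ptr_range(" 34 ")).value() == 34);
}
```

With the templated overload a compiler can fold the flag tests away at each
call site, which is the saving the comment above the sketch is after.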






------=_Part_5985_26026723.1391556639942--

.


Author: Paul Tessier <phernost@gmail.com>
Date: Tue, 4 Feb 2014 15:58:56 -0800 (PST)
Raw View
------=_Part_943_13167470.1391558336174
Content-Type: text/plain; charset=UTF-8



It seems that the function that parses the least parses best.  It is always
possible to compose more complex parse functions from simpler building
blocks, but the reverse is not always possible, or desirable.  A regex of
one's choosing can be used to skip any kind of prefix or suffix surrounding
the section to be parsed.  Adding additional prefix elimination to the
parse routine just complicates the behaviour and makes the composition more
cumbersome.

Locales *must* be taken into consideration as "-9" and "(9)" can both mean
the same thing; similarly, "510,023.34" or "51.0023,34" may be equivalent
depending upon locale chosen.
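
As a small illustration of the locale point, the existing iostream machinery
already parses the same number differently depending on the imbued locale.
The sketch below uses two hand-written std::numpunct facets (us_style and
eu_style, both hypothetical names) so that it does not depend on which named
locales are installed; the second literal is written as "510.023,34", which
is an assumption about the form intended. Accounting-style negatives such as
"(9)" are a monetary-facet matter and are not covered here.

```cpp
#include <cassert>
#include <cmath>
#include <locale>
#include <sstream>
#include <string>

// "1,234.5" style: ',' groups thousands, '.' is the decimal point.
struct us_style : std::numpunct<char> {
protected:
    char do_thousands_sep() const override { return ','; }
    char do_decimal_point() const override { return '.'; }
    std::string do_grouping() const override { return "\3"; }
};

// "1.234,5" style: '.' groups thousands, ',' is the decimal point.
struct eu_style : std::numpunct<char> {
protected:
    char do_thousands_sep() const override { return '.'; }
    char do_decimal_point() const override { return ','; }
    std::string do_grouping() const override { return "\3"; }
};

// Parses the whole of text as a double under loc; out is written only
// on success.
inline bool parse_double(const std::string& text, const std::locale& loc,
                         double& out) {
    std::istringstream in(text);
    in.imbue(loc);
    double tmp = 0.0;
    in >> tmp;
    if (in.fail() || in.peek() != std::char_traits<char>::eof())
        return false;  // nothing parsed, or junk after the number
    out = tmp;
    return true;
}

inline void usage_example() {
    // The locale takes ownership of the facet objects.
    const std::locale us(std::locale::classic(), new us_style);
    const std::locale eu(std::locale::classic(), new eu_style);
    double a = 0.0, b = 0.0;
    assert(parse_double("510,023.34", us, a));
    assert(parse_double("510.023,34", eu, b));
    assert(std::fabs(a - 510023.34) < 1e-6);
    assert(std::fabs(a - b) < 1e-6);
}
```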

To the problems of the interface.  It would seem that out parameters would
be the best base upon which to build.  An interface with out
parameters can be composed into any of the other interfaces discussed so
far.  The same cannot be said for the other interfaces.   Therefore I
propose that at minimum an interface that accepts a range (iterators/actual
range), an out parameter for the parsed result, a locale (most likely
defaulted), and returns an enum/error code would solve all problems listed
to date.

Example Interface:
template <typename T, typename U>
parse_err parse( range<U> r, T& value, locale loc = default_locale );

Value Returning Example:
template <typename T, typename U>
T parse_or_zero( range<U> r, bool skip_white = true,
                 locale loc = default_locale ) {
  if( skip_white ) { r = skip_white_space(r); }
  T retval = 0;
  // Error code ignored; retval is assumed untouched on failure.
  parse( r, retval, loc );
  return retval;
}

Expected Returning Example:
template <typename T, typename U>
expected<T> parse_expected( range<U> r, bool skip_white = true,
                            locale loc = default_locale ) {
  if( skip_white ) { r = skip_white_space(r); }
  expected<T> retval;
  parse_err err = parse( r, retval.value, loc );
  if( err != parse_err::success ) {
    retval.set_exception( some_exception( err ) );
  }
  return retval;
}

It's much easier to compose other interfaces to fit the tastes of the user
if out parameters are available.  Whether the other interfaces should be
supplied in the standard should be debated, but at minimum the out
parameter interface must be in the standard, from what has been discussed so
far, to allow others to have their way too: even if not standardized, their
preferred interfaces are at least available by simple composition.
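
A minimal sketch of the composition described above, assuming a strict,
C-locale, non-negative decimal parse into int. The parse_err enumerators,
the use of std::string_view in place of range<U>, and parse_optional (a
stand-in for the expected<T> variant) are assumptions made for illustration;
the locale parameter is omitted.

```cpp
#include <limits>
#include <optional>
#include <string_view>

// Hypothetical error codes; the post only asks for "an enum/error code".
enum class parse_err { success, empty, invalid, overflow };

// Out-parameter base: strict parse of a non-negative decimal int.
// No whitespace is skipped and `value` is left untouched on failure.
inline parse_err parse(std::string_view r, int& value) {
    if (r.empty()) return parse_err::empty;
    int acc = 0;
    for (char c : r) {
        if (c < '0' || c > '9') return parse_err::invalid;
        const int d = c - '0';
        // acc * 10 + d would exceed INT_MAX exactly when this holds.
        if (acc > (std::numeric_limits<int>::max() - d) / 10)
            return parse_err::overflow;
        acc = acc * 10 + d;
    }
    value = acc;
    return parse_err::success;
}

// Hypothetical helper: skips spaces and tabs only (not locale-aware).
inline std::string_view skip_white_space(std::string_view r) {
    while (!r.empty() && (r.front() == ' ' || r.front() == '\t'))
        r.remove_prefix(1);
    return r;
}

// Value-returning wrapper composed from the out-parameter base.
inline int parse_or_zero(std::string_view r, bool skip_white = true) {
    if (skip_white) r = skip_white_space(r);
    int retval = 0;
    parse(r, retval);  // retval stays 0 on failure
    return retval;
}

// Optional-returning wrapper composed from the same base.
inline std::optional<int> parse_optional(std::string_view r,
                                         bool skip_white = true) {
    if (skip_white) r = skip_white_space(r);
    int retval = 0;
    if (parse(r, retval) != parse_err::success) return std::nullopt;
    return retval;
}
```

With these definitions, parse_or_zero("  42") yields 42 and
parse_optional("12x") yields an empty optional, while the base parse rejects
" 123" because it skips nothing itself.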

--

---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.

------=_Part_943_13167470.1391558336174--

.


Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Tue, 04 Feb 2014 19:09:16 -0500
Raw View
On 2014-02-04 18:30, Bengt Gustafsson wrote:
> I think you are right, Matthew. The most logical choice is probably to send
> the parsing options in as flags. The big bike shed regards their names,
> polarities and defaults of course...

I may be wrong, but my impression is that it's been generally felt that
the default should be strict.

> Here's a sketch:
> [snip definition of expect]

Qt seems to be in the process of implementing a similar API. In that
case, I've suggested resolving the ambiguity issues in case value_t ==
bool by simply not providing implicit conversion to value_t, but relying
entirely on operator* instead. The overhead is only one character ('*')
to access the value, but it avoids all sorts of issues.

(Actually looking at std::optional now, it looks like that does actually
do as above.)

If possible I would encourage subclassing std::optional. This will
provide a bunch of needed functionality 'for free' and allow the result
to be passed to something expecting a std::optional.
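
A rough sketch of what subclassing std::optional could look like, assuming
the C++17 std::optional and std::from_chars (both postdate this thread). The
names parse_result, string_to_int, error() and next() are illustrative only,
and only int is handled.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>
#include <utility>

// Hypothetical result type: an optional<T> plus an error code and the
// unparsed remainder of the input.
template <typename T>
class parse_result : public std::optional<T> {
public:
    parse_result(T value, std::string_view rest)
        : std::optional<T>(std::move(value)), rest_(rest) {}
    parse_result(std::errc error, std::string_view rest)
        : error_(error), rest_(rest) {}

    std::errc error() const { return error_; }       // errc{} on success
    std::string_view next() const { return rest_; }  // remaining input

private:
    std::errc error_{};
    std::string_view rest_;
};

// Hypothetical parse function: parses as much as possible, skips nothing.
inline parse_result<int> string_to_int(std::string_view s) {
    int value = 0;
    const char* const last = s.data() + s.size();
    const auto [ptr, ec] = std::from_chars(s.data(), last, value);
    const std::string_view rest(ptr, static_cast<std::size_t>(last - ptr));
    if (ec != std::errc{}) return {ec, rest};
    return {value, rest};
}
```

value(), value_or() and operator* come from the base class unchanged, and
the result can bind to a std::optional<int> const&. One caveat of this
design: std::optional has no virtual destructor, so copying the result into
a plain std::optional<int> slices off error() and next().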

> enum StrToFlags {
>      noleading = 1,
>      notrailing = 2,
>      complete = 4,
>      strict = 7
> };

Per first comment, 'strict = 0', additional bits relax constraints.

> // Maybe better to get flags as a template parameter? Or offer both
> // versions? Not having to test flags on each call saves time and they
> // are going to be fixed per call site 99% of the time.

Good points. I wouldn't object to that. You might even be able to write
the implementation as if they were a runtime parameter and rely on the
compiler to optimize out irrelevant code.

> template<typename T, typename RANGE> expect<T> str_to(RANGE& range,
> StrToFlags flags)
> {
>      if (flags & noleading) {
>          if (isspace(*begin(range)))
>              return make_exception_ptr("No leading space allowed");

I wonder if this isn't overkill... if leading space not allowed, just
continue to parsing the value and fail when the leading space isn't a
valid character?

>      if (flags & notrailing) {
>          if (isspace(*begin(range)))
>              return make_exception_ptr("No trailing space allowed");

Related to above, I don't think this is quite right. I think first you'd
be checking if the entire input was consumed, and acting accordingly
depending on if the flags allowed trailing "stuff".

Anyway, these are implementation details that aren't critical.
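
For illustration, a sketch of the flag scheme with the polarity suggested
above: strict is 0 and every additional bit relaxes one constraint. The flag
names, the int-only signature, and the use of C++17 std::from_chars are
assumptions. Leading space is not rejected up front; when skipping is not
requested, the number parse simply fails on it. The trailing check looks at
whether the whole input was consumed.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical flag names; strict == 0, extra bits relax constraints.
enum str_to_flags : unsigned {
    strict              = 0,
    skip_leading_space  = 1,
    skip_trailing_space = 2,
    allow_trailing_junk = 4,
    complete = skip_leading_space | skip_trailing_space,
    relaxed  = skip_leading_space | allow_trailing_junk
};

// Runtime-flag version. Only ' ' counts as space here, which avoids the
// locale-dependent isspace().
inline std::optional<int> str_to_int(std::string_view s,
                                     unsigned flags = strict) {
    if (flags & skip_leading_space)
        while (!s.empty() && s.front() == ' ') s.remove_prefix(1);

    int value = 0;
    const char* const last = s.data() + s.size();
    const auto [ptr, ec] = std::from_chars(s.data(), last, value);
    if (ec != std::errc{}) return std::nullopt;  // includes leading junk
    s.remove_prefix(static_cast<std::size_t>(ptr - s.data()));

    if (flags & skip_trailing_space)
        while (!s.empty() && s.front() == ' ') s.remove_prefix(1);

    // Anything left over is an error unless trailing junk is allowed.
    if (!s.empty() && !(flags & allow_trailing_junk)) return std::nullopt;
    return value;
}

// Template-parameter version: forwards a compile-time constant so the
// compiler has the chance to drop the branches that cannot be taken.
template <unsigned Flags>
std::optional<int> str_to_int(std::string_view s) {
    return str_to_int(s, Flags);
}
```

Both spellings are available: str_to_int(" 42", skip_leading_space) and
str_to_int<complete>(" 42 "). Whether the second is actually faster depends
on inlining and is not guaranteed.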

Except that I miss the remaining range in the result type, I don't think
I'm seeing anything in the API I don't like that I haven't commented on
above. (Mostly swapping the meaning of the flags...)

> // Use cases
> [snipped]

...all look good :-)

--
Matthew


.


Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Tue, 04 Feb 2014 19:29:00 -0500
Raw View
On 2014-02-04 18:58, Paul Tessier wrote:
> Locales *must* be taken into consideration as "-9" and "(9)" can both mean
> that same thing, similarly "510,023.34" or "51.0023,34" may be equivalent
> depending upon locale chosen.

Definitely agreed. However I don't think there is much that needs to be
discussed here besides bikeshedding the actual name of the method.

There should be a "C" locale version. We should get that API right
first. We want this for performance reasons, as C locale is going to be
a common use case, and being able to ignore locale issues likely has a
non-trivial impact on performance.

Then there should be an l_foo¹/foo_l version of the same that takes an
optional locale and is locale-aware. What "locale aware" means is I
think mainly an implementation detail that doesn't affect the API.

(¹ While foo_l would be the usual convention, that breaks reading in
case of e.g. string_to<int>, where the _l suffix would awkwardly break
into what is otherwise a natural phrase. Plus in that case, "locale
string" fits the natural phrasing. Other options: lfoo, locale_foo, etc.)
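
A sketch of what a locale-aware variant might look like, assuming it is
built on the existing iostream num_get machinery. The name
locale_string_to_double and the comma_decimal facet are hypothetical; the
facet is constructed by hand so the example does not depend on which named
locales are installed. A separate C-locale fast path (for instance on top of
C++17 std::from_chars) is not shown.

```cpp
#include <locale>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical facet: same as the classic "C" facet except that the
// decimal separator is ','.
struct comma_decimal : std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
};

// Builds a locale that differs from "C" only in its decimal separator.
inline std::locale comma_locale() {
    return std::locale(std::locale::classic(), new comma_decimal);
}

// Hypothetical locale-aware parse: strict (no leading whitespace, no
// trailing characters), with the number format taken from `loc`.
inline std::optional<double> locale_string_to_double(
    const std::string& s, const std::locale& loc = std::locale::classic())
{
    std::istringstream in(s);
    in.imbue(loc);
    double value = 0.0;
    in >> std::noskipws >> value;
    if (in.fail()) return std::nullopt;
    // Reject anything left after the number.
    if (in.peek() != std::char_traits<char>::eof()) return std::nullopt;
    return value;
}
```

Under these assumptions "3,5" parses to 3.5 with comma_locale() but is
rejected with the default classic locale, and "3.5" behaves the other way
round.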

> To the problems of the interface.  It would seem that out parameters would
> be the best for the base upon which to build.  A interface with out
> parameters can be composed into any of the other interfaces discussed so
> far.  The same cannot be said for the other interfaces.

By "composed", do you mean I can write a returns-everything version from
an out-param version? (If yes, I assure you I can do the converse as
well; see below.)

> Therefore I propose that at minimum an interface that accepts a range
> (iterators/actual range), an out parameters for the parsed result, a
> locale (most likely defaulted), and returns an enum/error code would
> solve all problems listed to date.

Does not allow the value to be assigned to a const or passed directly to
a user. And output parameters are just awkward to work with in general;
making the one parameter that will ALWAYS be used an out parameter is
IMHO the worst possible design that's been proposed.

Conversely, there are all sorts of advantages to returning a (subclass
of) std::optional...

I do see you proposed to provide both, which is good. If so, I think it
should be left as an implementation detail which is the 'real'
implementation and which are just wrappers. And I expect the
everything-via-return is going to be most used. (Definitely it's the
only one *I* would use...)

> Example Interface:
> int parse<T,U>( range<U> r, T& value, locale loc = default_locale);

int parse<T, U>(range<U> r, T& value, locale loc = default_locale)
{
   auto const result = parse<T>(r, loc);
   if (result) value = *result;
   return // um... int? from whence do I get an int?
}

That wasn't so hard.

That said... I notice here that there is no way to return the remaining
range. But this is fixed if the return type is parse_result<void>. (Then
we just need a clever way to move-construct the parse_result<T> into
parse_result<void> and we're good... maybe parse_result<void> could have
a specialized ctor to take any other parse_result and just throw out the
value.)
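
One possible shape for that idea, as a sketch only: parse_result<void> gets
a converting constructor that keeps the error code and the remaining range
and drops the value. The member names, the use of std::string_view for the
range, and the reliance on C++17 std::optional and std::from_chars are
assumptions; the locale parameter is omitted.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>
#include <utility>

// Hypothetical everything-via-return result.
template <typename T>
struct parse_result {
    std::optional<T> value;
    std::errc error{};
    std::string_view rest;  // unparsed remainder

    explicit operator bool() const { return value.has_value(); }
    const T& operator*() const { return *value; }
};

// Value-less form: status and remaining range only.
template <>
struct parse_result<void> {
    std::errc error{};
    std::string_view rest;

    // Takes any other parse_result and throws out the value.
    template <typename T>
    parse_result(const parse_result<T>& other)
        : error(other.error), rest(other.rest) {}

    explicit operator bool() const { return error == std::errc{}; }
};

// The 'real' implementation: returns everything.
template <typename T>
parse_result<T> parse(std::string_view r) {
    T v{};
    const auto [ptr, ec] = std::from_chars(r.data(), r.data() + r.size(), v);
    const std::string_view rest =
        r.substr(static_cast<std::size_t>(ptr - r.data()));
    if (ec != std::errc{}) return {std::nullopt, ec, rest};
    return {v, ec, rest};
}

// Out-parameter wrapper composed from the value-returning version.
template <typename T>
parse_result<void> parse(std::string_view r, T& value) {
    auto result = parse<T>(r);
    if (result) value = std::move(*result.value);  // move, not copy
    return result;  // converts to parse_result<void>
}
```

Called as parse("12,34", x), the wrapper sets x to 12 and reports ",34" as
the rest; on failure x is left untouched.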

--
Matthew


.


Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Tue, 04 Feb 2014 19:40:30 -0500
Raw View
On 2014-02-04 19:07, Miro Knejp wrote:
> Am 04.02.2014 16:33, schrieb Bengt Gustafsson:
>> Yes, leading whitespace is always consumed in parse. If you don't
>> allow this you loose some performance as you actually convert the
>> number when you could know that it was an error as soon as you saw the
>> first space. I don't think this is a big problem. The idea is to skip
>> the space and set a flag in the return value if there was some space
>> to skip. The strict_value() function checks this flag and throws if it
>> was set.
>
> Well what if my use case doesn't allow leading whitespaces? Wasn't that
> one of the initial concerns that started this whole thing? If it's a
> single pass input iterator this is a big deal and the parser must not
> consume any invalid characters I didn't tell it to as they are forever
> lost to the caller.

Do you mean you actually have a real example of an iterator that cannot
be dereferenced more than once? (What on earth would create such a thing?)

> The actual whitespace
> content is lost. What if I needed this information to increment a line
> counter after parsing the number?

...then don't tell the parser to eat whitespace. (Note: the parser
*must* have a strict mode... so I agree with you there. A mode that eats
everything possible and then tells you how far it got may also be
required. Anything else is probably in the 'nice to have' category.)

--
Matthew


.


Author: Paul Tessier <phernost@gmail.com>
Date: Tue, 4 Feb 2014 16:55:57 -0800 (PST)
Raw View
------=_Part_3666_26799527.1391561757114
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable



On Tuesday, February 4, 2014 7:29:00 PM UTC-5, Matthew Woehlke wrote:
>
> On 2014-02-04 18:58, Paul Tessier wrote:
> > Locales *must* be taken into consideration as "-9" and "(9)" can both mean
> > that same thing, similarly "510,023.34" or "51.0023,34" may be equivalent
> > depending upon locale chosen.
>
> Definitely agreed. However I don't think there is much that needs to be
> discussed here besides bikeshedding the actual name of the method.
>
> There should be a "C" locale version. We should get that API right
> first. We want this for performance reasons, as C locale is going to be
> a common use case, and being able to ignore locale issues likely has a
> non-trivial impact on performance.
>
> Then there should be an l_foo¹/foo_l version of the same that takes an
> optional locale and is locale-aware. What "locale aware" means is I
> think mainly an implementation detail that doesn't affect the API.
>
> (¹ While foo_l would be the usual convention, that breaks reading in
> case of e.g. string_to<int>, where the _l suffix would awkwardly break
> into what is otherwise a natural phrase. Plus in that case, "locale
> string" fits the natural phrasing. Other options: lfoo, locale_foo, etc.)
>
> > To the problems of the interface.  It would seem that out parameters would
> > be the best for the base upon which to build.  A interface with out
> > parameters can be composed into any of the other interfaces discussed so
> > far.  The same cannot be said for the other interfaces.
>
> By "composed", do you mean I can write a returns-everything version from
> an out-param version? (If yes, I assure you I can do the converse as
> well; see below.)
>
> > Therefore I propose that at minimum an interface that accepts a range
> > (iterators/actual range), an out parameters for the parsed result, a
> > locale (most likely defaulted), and returns an enum/error code would
> > solve all problems listed to date.
>
> Does not allow the value to be assigned to a const or passed directly to
> a user. And output parameters are just awkward to work with in general;
> making the one parameter that will ALWAYS be used an out parameter is
> IMHO the worst possible design that's been proposed.
>
> Conversely, there are all sorts of advantages to returning a (subclass
> of) std::optional...
>
> I do see you proposed to provide both, which is good. If so, I think it
> should be left as an implementation detail which is the 'real'
> implementation and which are just wrappers. And I expect the
> everything-via-return is going to be most used. (Definitely it's the
> only one *I* would use...)
>
> > Example Interface:
> > int parse<T,U>( range<U> r, T& value, locale loc = default_locale);
>
> int parse<T, U>(range<U> r, T& value, locale loc = default_locale)
> {
>    auto const result = parse<T>(r, loc);
>    if (result) value = *result;
>    return // um... int? from whence do I get an int?
> }
>
> That wasn't so hard.
>
> That said... I notice here that there is no way to return the remaining
> range. But this is fixed if the return type is parse_result<void>. (Then
> we just need a clever way to move-construct the parse_result<T> into
> parse_result<void> and we're good... maybe parse_result<void> could have
> a specialized ctor to take any other parse_result and just throw out the
> value.)
>
> --
> Matthew
>
Except that with an out parameter no copies need be made, which, depending
on the cost of copying said type, may be a bottleneck.  Your version of
an out parameter composed from a value-returning version forces a copy
regardless of the need for one.  The same cannot be said of the
reverse composition.

For const, initializing a const object from another object always requires
a copy.

big_int cache;
parse<big_int>( r, cache ); // avoids copying

big_int const val = parse_expected<big_int>( r ).value(); // copies regardless

These two versions do the least amount of work.  Including both is
perfectly fine, but the assumption of equal composition is false.  In the
above, supplying parse<T> for any custom type could be seen as the base for
the other versions, and therefore the only requirement for supporting the
other versions.  This is helpful if constructing and copying T is expensive,
as this expense can be avoided by only using the out parameter version.  If
this is not the case, then there is no way to avoid the cost and still use
the structure provided by the standard, and extra implementations of the
other versions will be required to achieve performance.

------=_Part_3666_26799527.1391561757114--

.


Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Tue, 04 Feb 2014 20:41:13 -0500
Raw View
On 2014-02-04 19:55, Paul Tessier wrote:
>> int parse<T, U>(range<U> r, T& value, locale loc = default_locale)
>> {
>>     auto const result = parse<T>(r, loc);
>>     if (result) value = *result;
>>     return // um... int? from whence do I get an int?
>> }
>
> Except that with an out parameter no copies need be made, which depending
> on cost of copying said type, this may be a bottle neck.  Your version of
> an out parameter composed of a value returning version forces a copy
> regardless of the need for one.

Where?

A "good" implementation would emplace in the return value. And it should
be possible to tweak the assignment to be a move-assignment (hmm, should
add a take() to std::optional). No copies there?

> The opposite cannot be said for the reverse composition.

Yours enforces a default construction and, at best, some less-expensive
form of changing 'value'. Conversely, if I want everything-via-return, a
good implementation will do either something similar or an initialized
construction, which will be at least as cheap.

If anything, it seems to me that the 'real' implementation being
everything-via-return is going to be least expensive.

> For const, initializing a const object from another object always requires
> a copy.
>
> big_int const val = parse_expected<big_int>( r ).value(); // copies
> regardless

Eh? Your return type is 'parse_result<T> const'? Why? (In fact, why
would you *ever* use 'const' on a return type?)

(Even if it is, a '&' will avoid the copy. Actually, 'const&' is
probably preferred anyway.)
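
The claim can be checked with a small experiment, sketched here under
assumptions: big_int is a hypothetical type that counts its own copies and
moves, and C++17 std::optional stands in for the thread's parse_result /
expected<T>. With this setup neither the const initialization nor the
out-parameter wrapper performs a copy; each does one move.

```cpp
#include <optional>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical big integer that counts copies and moves of itself.
struct big_int {
    std::string digits;
    static inline int copies = 0;
    static inline int moves = 0;

    big_int() = default;
    explicit big_int(std::string d) : digits(std::move(d)) {}
    big_int(const big_int& o) : digits(o.digits) { ++copies; }
    big_int(big_int&& o) noexcept : digits(std::move(o.digits)) { ++moves; }
    big_int& operator=(const big_int& o) {
        digits = o.digits; ++copies; return *this;
    }
    big_int& operator=(big_int&& o) noexcept {
        digits = std::move(o.digits); ++moves; return *this;
    }
};

// Value-returning parse: the big_int is constructed in place inside the
// returned optional, so no big_int copy or move happens here.
inline std::optional<big_int> parse_value(std::string_view r) {
    if (r.empty()) return std::nullopt;
    for (char c : r)
        if (c < '0' || c > '9') return std::nullopt;
    return std::optional<big_int>(std::in_place, std::string(r));
}

// Out-parameter wrapper composed from the value-returning version.
inline bool parse(std::string_view r, big_int& value) {
    auto result = parse_value(r);
    if (!result) return false;
    value = std::move(*result);  // move-assignment, not a copy
    return true;
}
```

Writing big_int const val = parse_value("12345").value(); move-constructs
val, because value() on a temporary optional returns an rvalue reference.
This says nothing about allocation reuse, which is a separate question.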

--
Matthew


.


Author: Paul Tessier <phernost@gmail.com>
Date: Tue, 4 Feb 2014 18:16:53 -0800 (PST)
Raw View
------=_Part_4612_2108710.1391566613493
Content-Type: text/plain; charset=UTF-8



On Tuesday, February 4, 2014 8:41:13 PM UTC-5, Matthew Woehlke wrote:
>
> On 2014-02-04 19:55, Paul Tessier wrote:
> >> int parse<T, U>(range<U> r, T& value, locale loc = default_locale)
> >> {
> >>     auto const result = parse<T>(r, loc);
> >>     if (result) value = *result;
> >>     return // um... int? from whence do I get an int?
> >> }
> >
> > Except that with an out parameter no copies need be made, which
> > depending on cost of copying said type, this may be a bottle neck.
> > Your version of an out parameter composed of a value returning
> > version forces a copy regardless of the need for one.
>
> Where?
>
> A "good" implementation would emplace in the return value. And it should
> be possible to tweak the assignment to be a move-assignment (hmm, should
> add a take() to std::optional). No copies there?
>
> > The opposite cannot be said for the reverse composition.
>
> Yours enforces a default construction and, at best, some less-expensive
> form of changing 'value'. Conversely, if I want everything-via-return, a
> good implementation will do either something similar or an initialized
> construction, which will be at least as cheap.
>
> If anything, it seems to me that the 'real' implementation being
> everything-via-return is going to be least expensive.
>
> > For const, initializing a const object from another object always
> > requires a copy.
> >
> > big_int const val = parse_expected<big_int>( r ).value(); // copies regardless
>
> Eh? Your return type is 'parse_result<T> const'? Why? (In fact, why
> would you *ever* use 'const' on a return type?)
>
> (Even if it is, a '&' will avoid the copy. Actually, 'const&' is
> probably preferred anyway.)
>
> --
> Matthew
>

Assume that big_int requires the heap to allow for very big ints, say 10
to 2000 digits. A value-returning version has no way to avoid allocating at
each parse, regardless of move-assignment or RVO. An out-parameter version
can reuse the same big_int and therefore potentially avoid the cost of new
allocations at each parse.
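
A minimal sketch of the allocation argument above. Everything here is an
assumption made for illustration: big_int is modelled as a vector holding
one decimal digit per byte, and parse_into / parse_value are hypothetical
names, not part of any proposal. Whether a real big_int could reuse its
storage this way depends on its implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical stand-in for a heap-backed big integer: one decimal digit per byte.
struct big_int {
    std::vector<std::uint8_t> digits;
};

// Out-parameter form: refills the caller's object. clear() keeps the vector's
// capacity, so a later parse that fits in it needs no new allocation.
// Note: on failure 'out' holds a partial result; it is NOT left unmodified.
inline bool parse_into(std::string_view text, big_int& out) {
    out.digits.clear();
    if (text.empty()) return false;
    for (char c : text) {
        if (c < '0' || c > '9') return false;
        out.digits.push_back(static_cast<std::uint8_t>(c - '0'));
    }
    return true;
}

// Value-returning form: every call starts from a fresh, empty big_int, so the
// digit storage has to be allocated again each time.
inline std::optional<big_int> parse_value(std::string_view text) {
    big_int result;
    if (!parse_into(text, result)) return std::nullopt;
    return result;
}
```

Calling parse_into repeatedly on the same object keeps the buffer from the
largest value seen so far, which is the reuse being argued for; parse_value
is the more readable form but cannot do that.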

It is always possible to take *any* snippet of code and replace it with a
function that takes in and out parameters; the reverse cannot be said for
value-returning functions. If your arguments are stylistic, I have no
objections to that. I also find a version that returns some kind of
expected<T> more readable. I only stand by my position that, for all
points of contention to be solved, the out-parameter version is required,
and as such should be the base for the other versions. Whether all
versions will be supplied, I cannot say. I would prefer that, in addition
to the out-parameter version, at least parse_or_zero or something similar
be included. I find it doubtful that consensus can be reached for a
version returning an expected<T> or optional<T> until those things are
already part of the standard.

I'm sorry if I wasn't clearer about const. It had been mentioned that
out parameters could not initialize a const value, so I used the
parse_expected version from earlier to show that that version solved the
problem. I had not meant to imply that the return type itself was const.

I would also propose that the range provided to be parsed should be
exactly what should be parsed and no more, thereby eliminating the need
to modify or return the range used. Providing a correct range should be
the responsibility of a regex or similar facility before parsing to a
type occurs. This relieves parse of doing more work than is necessary.

------=_Part_4612_2108710.1391566613493--

.


Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Wed, 05 Feb 2014 12:17:37 -0500
On 2014-02-05 07:34, Miro Knejp wrote:
>> Do you mean you actually have a real example of an iterator that
>> cannot be dereferenced more than once? (What on earth would create
>> such a thing?)
>
> That's not what I described. If I pass a single-pass InputIterator, for
> example istreambuf_iterator, to parse() and it chews away N whitespace
> characters and then fails to recognize a number, any information on the
> whitespace is gone, as I cannot go back and re-iterate the range.

This iterator is non-copyable? And/or incrementing it is destructive to
copies of the iterator? (I would hope not the latter, as that is
terrible API.)

If not, there should not be a problem. (Okay, given it is
istreambuf_iterator, I suppose I can imagine one or both of the above
being true. It's not obvious to me from either cplusplus.com or
cppreference.com if istreambuf_iterator is or is not copyable...)

> And I am on the same track as Matthew F. in that parse() should have one
> responsibility and one only: convert the textual representation of a
> value to a value, and nothing else.

I can live with that. (I'm not sure I ever felt handling whitespace was
*necessary*, just that I don't object to it as strongly as you.)

I do think we need at least one parsing option; whether or not to allow
trailing characters.

> What happens with the out parameter when parsing fails? Is it in an
> undefined state? Or left unmodified? If the latter then parse() had to
> create a temporary and the entire allocation prevention and no-copy
> argument is down the drain.

IMO it shall leave it unmodified. One of the arguments for an output
parameter was to implement defaults like:

int value = default_value;
parse(in, value);

That said, in defense of that argument, I could conceive of a
specialization that stores parts of the value in cheap-to-create values
and doesn't have to build the expensive type until it knows the parse is
okay. But I agree that that's tenuous, and more likely your point will
be true.
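
For what it is worth, the "unmodified on failure" default pattern can be
sketched on top of std::from_chars, which was standardized in C++17 (after
this discussion) and is specified to leave its output argument untouched
when parsing fails. The names parse and parse_or below are hypothetical
wrappers for illustration, not a proposed interface.

```cpp
#include <cassert>
#include <charconv>
#include <string_view>
#include <system_error>

// Returns true on success. On failure 'value' keeps whatever the caller put
// there, because std::from_chars does not write to it on error.
inline bool parse(std::string_view in, int& value) {
    const char* first = in.data();
    const char* last = first + in.size();
    const std::from_chars_result r = std::from_chars(first, last, value);
    return r.ec == std::errc{};
}

// The default pattern from the message above, wrapped in a function.
inline int parse_or(std::string_view in, int default_value) {
    int value = default_value;
    parse(in, value);  // result deliberately ignored: the default stays on failure
    return value;
}
```

This is cheap for int because the parser can accumulate into a local and
assign once at the end; the open question in the thread is whether the same
guarantee is affordable for expensive types.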

--
Matthew


.


Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Wed, 5 Feb 2014 09:43:27 -0800 (PST)
------=_Part_4517_12851344.1391622207378
Content-Type: text/plain; charset=UTF-8



On Wednesday, February 5, 2014 12:17:37 PM UTC-5, Matthew Woehlke wrote:
>
> On 2014-02-05 07:34, Miro Knejp wrote:
> >> Do you mean you actually have a real example of an iterator that
> >> cannot be dereferenced more than once? (What on earth would create
> >> such a thing?)
> >
> > That's not what I described. If I pass a single-pass InputIterator, for
> > example istreambuf_iterator, to parse() and it chews away N whitespace
> > characters and then fails to recognize a number, any information on the
> > whitespace is gone, as I cannot go back and re-iterate the range.
>
> This iterator is non-copyable? And/or incrementing it is destructive to
> copies of the iterator? (I would hope not the latter, as that is
> terrible API.)
>

It's the latter. Each copy of an input iterator may point to the same state
(current character), but as soon as you increment one, it invalidates all
of the others. It seems terrible, but that is the consequence of abstracting
one-pass things like file I/O with iterators. This interface, while
dangerous, allows them to be efficient.
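
A small sketch of the single-pass problem, assuming a std::istringstream as
the source. The helper names skip_space and first_char_after_skip are made
up for this example. It avoids touching stale iterator copies (which would
not be valid) and instead shows that a fresh iterator on the same stream
can no longer see the characters an earlier iterator consumed.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Skips leading whitespace through a single-pass iterator and returns how
// many characters were consumed from the underlying stream buffer.
inline std::size_t skip_space(std::istreambuf_iterator<char>& it) {
    const std::istreambuf_iterator<char> end;
    std::size_t n = 0;
    while (it != end && std::isspace(static_cast<unsigned char>(*it))) {
        ++it;  // advances the shared stream buffer; there is no way back
        ++n;
    }
    return n;
}

// Returns what a *new* iterator on the same stream sees after the skip.
// Assumes 'text' contains at least one non-whitespace character.
inline char first_char_after_skip(const std::string& text) {
    std::istringstream in(text);
    std::istreambuf_iterator<char> it(in);
    skip_space(it);
    std::istreambuf_iterator<char> fresh(in);  // starts at the current position
    return *fresh;
}
```

If a parse attempt fails after such a skip, the caller has no way to
recover the skipped whitespace from the stream.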

I've used input iterators for other IO abstractions such as asynchronous
IO. The operator++() blocks until the background thread has data available
and then keeps returning the next data point until it has to block again.
There is no way to "rewind" in this situation or maintain a copy of the
previous state without a lot of expensive reference counting or some other
complexity.

It's a useful abstraction, but requiring operator++(int) for input iterators
is ridiculous and dangerous.

>
> If not, there should not be a problem. (Okay, given it is
> istreambuf_iterator, I suppose I can imagine one or both of the above
> being true. It's not obvious to me from either cplusplus.com or
> cppreference.com if istreambuf_iterator is or is not copyable...)
>

Input iterators must be copyable because they implement operator++(int).
Maybe the standard should be amended to relax these restrictions?


> > And I am on the same track as Matthew F. in that parse() should have one
> > responsibility and one only: convert the textual representation of a
> > value to a value, and nothing else.
>
> I can live with that. (I'm not sure I ever felt handling whitespace was
> *necessary*, just that I don't object to it as strongly as you.)
>
> I do think we need at least one parsing option; whether or not to allow
> trailing characters.
>

Trailing characters must be supported for efficiency, along with the .next()
method returning an iterator/string view in the return object. Often you
have a number in your string followed by other stuff. You must parse the
number to know where the number characters end. The efficient way is to
parse the number and advance the character iterator at the same time.
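
A sketch of "parse and advance at the same time", written against
std::from_chars (C++17, so not available when this was posted). The
parse_result type with its 'next' member and the parse_prefix function are
hypothetical names standing in for the return object described above.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical result type: 'ok' reports success, 'next' is the unparsed tail.
struct parse_result {
    bool ok;
    std::string_view next;
};

// Parses as many leading characters as form an int and reports where it
// stopped. Trailing characters are not an error; they are handed back.
inline parse_result parse_prefix(std::string_view in, int& value) {
    const char* first = in.data();
    const char* last = first + in.size();
    const std::from_chars_result r = std::from_chars(first, last, value);
    if (r.ec != std::errc{}) return {false, in};
    return {true, in.substr(static_cast<std::size_t>(r.ptr - first))};
}
```

With this shape, an input such as "640x480" can be read in one pass: parse
the width, step over the 'x' in 'next', then parse the height from the rest.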



>
> > What happens with the out parameter when parsing fails? Is it in an
> > undefined state? Or left unmodified? If the latter then parse() had to
> > create a temporary and the entire allocation prevention and no-copy
> > argument is down the drain.
>
> IMO it shall leave it unmodified. One of the arguments for an output
> parameter was to implement defaults like:
>
> int value = default_value;
> parse(in, value);
>
> That said, in defense of that argument, I could conceive of a
> specialization that stores parts of the value in cheap-to-create values
> and doesn't have to build the expensive type until it knows the parse is
> okay. But I agree that that's tenuous, and more likely your point will
> be true.
>

Leaving an expensive type unmodified requires either creating a copy or
parsing twice. I'm not convinced we want to pay either performance cost for
the relatively minor convenience of the "unmodified on failure" behavior.
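
To make the cost concrete, here is a sketch of the "create a copy" option,
under the assumption that big_int is a vector of decimal digits (a
hypothetical stand-in, as is the name parse_keep_on_failure). The caller's
object is only replaced once the whole input is known to be valid, which
means a scratch buffer has to be built on every call.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical stand-in for an expensive, heap-backed type.
struct big_int {
    std::vector<std::uint8_t> digits;
};

// 'out' changes only if the whole text parses. The price is a scratch
// buffer (one allocation per non-empty call) that is swapped in on success.
inline bool parse_keep_on_failure(std::string_view text, big_int& out) {
    if (text.empty()) return false;
    std::vector<std::uint8_t> scratch;
    scratch.reserve(text.size());
    for (char c : text) {
        if (c < '0' || c > '9') return false;  // 'out' has not been touched
        scratch.push_back(static_cast<std::uint8_t>(c - '0'));
    }
    out.digits.swap(scratch);  // commit; the old storage is freed with 'scratch'
    return true;
}
```

The alternative, validating the text in a first pass and filling 'out' in a
second, avoids the scratch buffer but reads the input twice.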

------=_Part_4517_12851344.1391622207378--

.


Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Wed, 05 Feb 2014 15:49:49 -0500
On 2014-02-05 15:20, Miro Knejp wrote:
>> I do think we need at least one parsing option; whether or not to
>> allow trailing characters.
>
> I don't think that is required. The parser can just stop when it reaches
> an invalid character and signal success if the input to that point was
> sufficient to create a value. You can inspect the returned iterator to
> see whether the end of the input was reached or not and act accordingly.

My impression is that this is a sufficiently common use case (and it is)
that users should not have to endlessly rewrite that code.

In fact, I expect it is more common to consider any unparsable
characters to be an error than otherwise. The latter case only happens
when you're implementing your own parsing of an input stream that is
expected to contain multiple logical values. The former is the case any
time your input has already been value delimited (or represents a single
value, as in e.g. an input widget).

> This way parse() is very flexible and can work at the core of more
> advanced interfaces. There was already mention of a match_integer
> method requiring the entire source to represent the value, and once you
> have parse() with its one well-defined responsibility it is trivial to
> implement such a match_X method on top of it.

I suppose you could have the "real implementation" version always accept
extra characters and return the end position, and provide a wrapper that
implements the aforementioned check. But that wrapper is important, as
that's what is going to be used more often than not. I would consider
the proposal incomplete if it does not provide that API.
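
The layering described here could look roughly like the following sketch.
It assumes std::from_chars (C++17) as the underlying engine; parse_some and
parse_all are invented names for the "real implementation" and the strict
wrapper respectively.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Core routine: parses a leading int, accepts extra characters, and returns
// the position where parsing stopped. 'out' is empty if no value was parsed.
inline const char* parse_some(const char* first, const char* last,
                              std::optional<int>& out) {
    int v = 0;
    const std::from_chars_result r = std::from_chars(first, last, v);
    if (r.ec == std::errc{}) {
        out = v;
    } else {
        out = std::nullopt;
    }
    return r.ptr;
}

// Wrapper: succeeds only if every character of the input belonged to the value.
inline std::optional<int> parse_all(std::string_view in) {
    std::optional<int> v;
    const char* last = in.data() + in.size();
    const char* end = parse_some(in.data(), last, v);
    if (end != last) return std::nullopt;  // trailing characters are an error
    return v;
}
```

The wrapper is only a few lines, but providing it once saves every caller
from rewriting the end-of-input check.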

--
Matthew


.


Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Wed, 5 Feb 2014 12:56:18 -0800 (PST)
Raw View
------=_Part_2341_13168357.1391633778169
Content-Type: text/plain; charset=UTF-8



On Wednesday, February 5, 2014 3:49:49 PM UTC-5, Matthew Woehlke wrote:
>
> I suppose you could have the "real implementation" version always accept
> extra characters and return the end position, and provide a wrapper that
> implements the aforementioned check. But that wrapper is important, as
> that's what is going to be used more often than not. I would consider
> the proposal incomplete if it does not provide that API.
>
>
I agree completely, both versions must be provided. One which parses as
much as it can and returns an end iterator, and another (simple wrapper)
which requires that all characters are part of the value.

I would have a lot of use cases for both of these.

------=_Part_2341_13168357.1391633778169--

.


Author: gmisocpp@gmail.com
Date: Wed, 5 Feb 2014 14:23:15 -0800 (PST)
------=_Part_6972_2576633.1391638995027
Content-Type: text/plain; charset=UTF-8



On Thursday, February 6, 2014 9:56:18 AM UTC+13, Matthew Fioravante wrote:
>
>
>
> On Wednesday, February 5, 2014 3:49:49 PM UTC-5, Matthew Woehlke wrote:
>>
>> I suppose you could have the "real implementation" version always accept
>> extra characters and return the end position, and provide a wrapper that
>> implements the aforementioned check. But that wrapper is important, as
>> that's what is going to be used more often than not. I would consider
>> the proposal incomplete if it does not provide that API.
>>
>>
> I agree completely, both versions must be provided. One which parses as
> much as it can and returns an end iterator, and another (simple wrapper)
> which requires that all characters are part of the value.
>
> I would have a lot of use cases for both of these.
>


This sounds like what I was suggesting earlier with the two routines

conversion_result parse(int& value, range r) noexcept;
and
conversion_result parse_checked(int& v, range r, check_options o);

parse_checked, which throws, calls parse(), which never throws, and just
checks the result. All the heavy work is done by parse(); parse_checked
just looks at the parse_status to decide what to throw.

That is also why parse must check for empty ranges etc.: crashing isn't
reasonable for such a condition, so I think it has to test for it anyway,
and parse_checked relies on that to throw.

But I don't think they should have to call each other; that's just an
implementation detail. parse_checked might be able to give better error
messages etc. if it didn't.

I am now thinking that the interface should be something like

conversion_result parse(int& value, range r, parse_options o) noexcept;
and
conversion_result parse_checked(int& v, range r, parse_options o);

where parse_options contains things like the format(s)/radix expected and
options such as whether to accept leading whitespace.

I'm still not convinced that options for leading whitespace, trailing
space, etc. should be supported; I think they complicate the interface and
hurt performance, but I'm still thinking about that.

If they should be, parse_options can handle that. There could be a
default_parse_options() function or something similar that defaults to
strict, or maybe accepts leading and trailing spaces, but I'm not fond of
accepting leading and trailing whitespace, as that basically includes
tabs, CRs, LFs, etc., which I think is not a good idea.
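
One possible reading of this interface, as a sketch only. The member names
of parse_options, the parse_status values, the 'next' field and the choice
of exceptions are all assumptions made for illustration; std::string_view
stands in for 'range' and std::from_chars (C++17) does the digit work.

```cpp
#include <cassert>
#include <charconv>
#include <stdexcept>
#include <string_view>
#include <system_error>

// All names below are hypothetical, loosely following the signatures above.
enum class parse_status { ok, empty, invalid, out_of_range, trailing_characters };

struct parse_options {
    int radix = 10;                     // must be in [2, 36] for std::from_chars
    bool allow_leading_spaces = false;  // plain ' ' only, not tabs/CR/LF
};

struct conversion_result {
    parse_status status;
    const char* next;  // first unconsumed character
};

// Never throws: every problem, including an empty range, becomes a status.
inline conversion_result parse(int& value, std::string_view range,
                               parse_options o = {}) noexcept {
    const char* first = range.data();
    const char* last = first + range.size();
    if (o.allow_leading_spaces) {
        while (first != last && *first == ' ') ++first;
    }
    if (first == last) return {parse_status::empty, first};
    const std::from_chars_result r = std::from_chars(first, last, value, o.radix);
    if (r.ec == std::errc::invalid_argument) return {parse_status::invalid, first};
    if (r.ec == std::errc::result_out_of_range) return {parse_status::out_of_range, r.ptr};
    if (r.ptr != last) return {parse_status::trailing_characters, r.ptr};
    return {parse_status::ok, r.ptr};
}

// Throwing front end: all the work is done by parse(); this only maps the
// status to an exception.
inline conversion_result parse_checked(int& v, std::string_view range,
                                       parse_options o = {}) {
    const conversion_result r = parse(v, range, o);
    if (r.status == parse_status::ok) return r;
    if (r.status == parse_status::out_of_range) {
        throw std::out_of_range("parse_checked: value out of range");
    }
    throw std::invalid_argument("parse_checked: input is not a valid number");
}
```

In this sketch the default options are strict, and the only whitespace
option accepts the space character alone, which sidesteps the concern about
tabs and line endings.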

------=_Part_6972_2576633.1391638995027--

.


Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Fri, 07 Feb 2014 17:14:10 -0500
Raw View
On 2014-02-07 15:18, Miro Knejp wrote:
> Speaking floats, which part is the more complex/bloated one:
> Extracting digits and symbols from the input or assembling them into a
> floating point value with minimal rounding errors, etc? The latter can
> easily be separated into a stateful object that is fed the numerical
> values (i.e. digit values) and semantics (i.e. sign, comma, exponent
> indicators) of the input at which point input encodings, character types
> or locales are already translated to a neutral subset. Some part of the
> numeric parsers certainly needs to be inline but some can be implemented
> out-of-line.

While that may be true (and in fact, probably quite valuable in terms of
simplicity of implementation), how would you store the intermediate
state without said storage killing performance?

I suppose you could do something like:

parse<float>(/*params elided*/)
{
   fp_impl impl;
   ...
   impl.set_sign(fp_impl::sign_positive);
   ...
   while /*digits*/
   {
     impl.consume_digit(int_value_of_digit);
   }
   ...etc.
   return impl.value(); // grotesquely simplified
}

...where fp_impl is a class/struct that contains whatever internal state
it needs to operate.
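
A minimal sketch of what such an fp_impl could look like, assuming a
64-bit decimal mantissa plus a power-of-ten exponent is enough state. The
names begin_fraction() and parse_float_sketch() are hypothetical and not
part of the outline above, exponent notation is left out, and value() is
deliberately naive: it is not correctly rounded in general.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical accumulator: the character-level parser feeds it digit
// values and semantics; it never sees encodings or locales.
class fp_impl {
public:
    enum sign_type { sign_positive, sign_negative };

    void set_sign(sign_type s) { negative_ = (s == sign_negative); }
    void begin_fraction() { in_fraction_ = true; }  // decimal point seen

    void consume_digit(int d) {
        assert(d >= 0 && d <= 9);
        if (mantissa_ > (UINT64_MAX - 9) / 10) {
            // Mantissa is full: drop the digit but keep the magnitude.
            if (!in_fraction_) ++exp10_;
            return;
        }
        mantissa_ = mantissa_ * 10 + static_cast<std::uint64_t>(d);
        if (in_fraction_) --exp10_;
    }

    // Naive scaling by a power of ten. NOT correctly rounded in general;
    // a real implementation needs a proper decimal-to-binary algorithm.
    double value() const {
        double scale = 1.0;
        for (int e = (exp10_ < 0 ? -exp10_ : exp10_); e > 0; --e) scale *= 10.0;
        const double m = static_cast<double>(mantissa_);
        const double v = (exp10_ < 0) ? m / scale : m * scale;
        return negative_ ? -v : v;
    }

private:
    std::uint64_t mantissa_ = 0;
    int exp10_ = 0;
    bool negative_ = false;
    bool in_fraction_ = false;
};

// Hypothetical front end: accepts [+-]digits[.digits] and nothing else.
inline bool parse_float_sketch(std::string_view s, double& out) {
    fp_impl impl;
    std::size_t i = 0;
    if (i < s.size() && (s[i] == '+' || s[i] == '-')) {
        impl.set_sign(s[i] == '-' ? fp_impl::sign_negative
                                  : fp_impl::sign_positive);
        ++i;
    }
    bool any_digit = false;
    bool seen_point = false;
    for (; i < s.size(); ++i) {
        const char c = s[i];
        if (c >= '0' && c <= '9') {
            impl.consume_digit(c - '0');
            any_digit = true;
        } else if (c == '.' && !seen_point) {
            impl.begin_fraction();
            seen_point = true;
        } else {
            return false;  // unexpected character
        }
    }
    if (!any_digit) return false;
    out = impl.value();
    return true;
}
```

In this form the whole state is three small members, so the object can
live in registers when the calls are inlined; whether that still holds
once value() is moved out of line is exactly the open question.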

Hmm... actually now I sort-of like that, although trying to turn that
into something the C library could also use is more "interesting".

--
Matthew


.


Author: Bengt Gustafsson <bengt.gustafsson@beamways.com>
Date: Fri, 7 Feb 2014 15:22:19 -0800 (PST)
Raw View
------=_Part_744_3457269.1391815339300
Content-Type: text/plain; charset=UTF-8

Well, with such a helper object implemented in a cpp file somewhere we are
not going to get extraordinary performance, but maybe good enough? Even a
non-virtual call per character seems like a little too much overhead to
me.

So we need more than that visible in the header files. Or maybe we can
rely on link-time optimization nowadays?

Note, however, that not everything is inlined just because it is a
template function. There will be a handful of instantiations of the
parse<T> template per executable, for string, char*, string_view and maybe
some more. For big template functions it is unlikely that the compiler will
actually inline. Instead the compiler stores the implementation as a weak
symbol in each object file (or better, has a backend which keeps track of
which instantiations have already been code-generated for this exe). I know
that an ancient DEC compiler did the latter, but it was dead slow anyway...
Does gcc or clang remember instantiations? I know MS doesn't...

So the big problem is probably not code bloat but compile times. I don't
know how this is affected by precompiled headers, probably not much, I
guess code generation happens every time anyway...?

As for putting the skipspace handling outside I think it would be a real
turnoff:

- The simplest code "works" until users happen to type a leading or
trailing space on that important demo.

- When parsing something more complex than a number, say a point x, y you
would have to explicitly call skipspace between each member, and testing is
again a real problem.

- In what situation is it important to give an error message if there is
whitespace? What can go wrong in a real application if the whitespace is
ignored? I fail to see such cases as anything other than very marginal. I
mean, even if you have specified a file format to forbid spaces (for some
reason) you can be quite certain that the other guy interfacing to you will
send you spaces anyway. What good does it do anyone to fail in this case?

And no one has said that we should not provide a "strict" mode/return
flag/input flag or something to cater for these cases.
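
As a rough sketch of that combination, assuming an input flag selects the
mode and a return flag reports whether whitespace was eaten: the names
space_policy, int_parse_result, is_ascii_space and parse_int below are
all hypothetical, and the whitespace test is a fixed ASCII set rather than
the locale-dependent std::isspace().

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string_view>

// Hypothetical input flag: reject or skip leading whitespace.
enum class space_policy { strict, skip_leading };

// Hypothetical result: success, characters consumed, and a return flag
// telling the caller whether any leading whitespace was skipped.
struct int_parse_result {
    bool ok;
    std::size_t consumed;
    bool skipped_space;
};

// Fixed ASCII whitespace set; no locale lookup, no indirect call.
inline bool is_ascii_space(char c) {
    return c == ' ' || c == '\t' || c == '\n' ||
           c == '\r' || c == '\f' || c == '\v';
}

// Parses a decimal int from the front of s. Trailing characters are left
// for the caller; out is only written on success.
inline int_parse_result parse_int(
        std::string_view s, int& out,
        space_policy policy = space_policy::skip_leading) {
    static_assert(sizeof(long long) > sizeof(int),
                  "sketch assumes long long is wider than int");
    std::size_t i = 0;
    if (policy == space_policy::skip_leading) {
        while (i < s.size() && is_ascii_space(s[i])) ++i;
    }
    const bool skipped = (i > 0);

    bool negative = false;
    if (i < s.size() && (s[i] == '+' || s[i] == '-')) {
        negative = (s[i] == '-');
        ++i;
    }

    // Largest magnitude that fits: INT_MAX, or INT_MAX + 1 when negative.
    const long long limit =
        static_cast<long long>(std::numeric_limits<int>::max()) +
        (negative ? 1 : 0);
    long long acc = 0;
    const std::size_t first_digit = i;
    for (; i < s.size() && s[i] >= '0' && s[i] <= '9'; ++i) {
        acc = acc * 10 + (s[i] - '0');
        if (acc > limit) return {false, 0, skipped};  // out of range
    }
    if (i == first_digit) return {false, 0, skipped};  // no digits at all
    assert(i <= s.size());
    out = static_cast<int>(negative ? -acc : acc);
    return {true, i, skipped};
}
```

With the default policy " 42" parses as 42 and reports skipped_space, so
a caller that cares can still reject it; with space_policy::strict the
same input fails outright.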

I mean, "getting it right" must mean that it is easy to use and works as
expected. All other number converters in all languages I know of eat
leading spaces. Most of them can't even tell you if there were any!

On Friday, February 7, 2014 at 23:14:10 UTC+1, Matthew Woehlke wrote:
>
> On 2014-02-07 15:18, Miro Knejp wrote:
> > Speaking floats, which part is the more complex/bloated one:
> > Extracting digits and symbols from the input or assembling them into a
> > floating point value with minimal rounding errors, etc? The latter can
> > easily be separated into a stateful object that is fed the numerical
> > values (i.e. digit values) and semantics (i.e. sign, comma, exponent
> > indicators) of the input at which point input encodings, character types
> > or locales are already translated to a neutral subset. Some part of the
> > numeric parsers certainly needs to be inline but some can be implemented
> > out-of-line.
>
> While that may be true (and in fact, probably quite valuable in terms of
> simplicity of implementation), how would you store the intermediate
> state without said storage killing performance?
>
> I suppose you could do something like:
>
> parse<float>(/*params elided*/)
> {
>    fp_impl impl;
>    ...
>    impl.set_sign(fp_impl::sign_positive);
>    ...
>    while /*digits*/
>    {
>      impl.consume_digit(int_value_of_digit)
>    }
>    ...etc.
>    return impl.value(); // grotesquely simplified
> }
>
> ...where fp_impl is a class/struct that contains whatever internal state
> it needs to operate.
>
> Hmm... actually now I sort-of like that, although trying to turn that
> into something the C library could also use is more "interesting".
>
> --
> Matthew
>
>


------=_Part_744_3457269.1391815339300--

.


Author: Csaba Csoma <csabacsoma@gmail.com>
Date: Sat, 8 Feb 2014 20:37:08 -0800 (PST)
Raw View
------=_Part_62_24287923.1391920628646
Content-Type: text/plain; charset=UTF-8

Related Stack Overflow discussion:
http://stackoverflow.com/questions/194465/how-to-parse-a-string-to-an-int-in-c/11354496

Csaba

On Sunday, January 26, 2014 8:25:02 AM UTC-8, Matthew Fioravante wrote:
>
> string to T (int, float, etc..) conversions seem like a rather easy task
> (aside from floating point round trip issues), and yet for the life of C
> and C++ the standard library has consistently failed to provide a decent
> interface.
>
> Let's review:
>
> int atoi(const char* s); //and atol, atoll, atof etc.
>
> What's wrong with this?
>
>    - Returns 0 on parsing failure, making it impossible to parse 0
>    strings. This already renders this function effectively useless and we can
>    skip the rest of the bullet points right here.
>    - It discards leading whitespace, this has several problems of its own:
>       - If we want to check whether the string is strictly a numeric
>       string, we have to add our own check that the first character is a digit.
>       This makes the interface clumsy to use and easy to screw up.
>       - std::isspace() is locale dependent and requires an indirect
>       function call (try it on gcc.godbolt.org). This makes what could be
>       a very simple and inlinable conversion potentially expensive. It also
>       prevents constexpr.
>       - From a design standpoint, this whitespace handling is a very
>       narrow use case. It does too many things and in my opinion is a bad design.
>       I often do not have whitespace delimited input in my projects.
>    - No atod() for doubles or atold() for long doubles.
>    - No support for unsigned types, although this may not actually be a
>    problem.
>    - Uses horrible C interface (type suffixes in names) with no
>    overloading or template arguments. What function do we use if we want to
>    parse an int32_t?
>
> long strtol(const char* str, char **str_end, int base);
>
> What's wrong with this one?
>
>    - Again it has this silly leading whitespace behavior (see above).
>    - It's not obvious how to correctly determine whether or not parsing
>    failed. Every time I use this function I have to look it up again to make
>    sure I get it exactly right and have covered all of the corner cases.
>    - Uses 0/T_MAX/T_MIN to denote errors, when these could be validly
>    parsed from strings. Checking whether or not these values were parsed or
>    are representing errors is clumsy.
>    - Again C interface issues (see above).
>
>
> At this point, I think we are ready to define a new set of int/float
> parsing routines.
>
> Design goals:
>
>    - Easy to use, usage is obvious.
>    - No assumptions about use cases, we just want to parse strings. This
>    means none of this automatic whitespace handling.
>    - Efficient and inline
>    - constexpr
>
> Here is a first attempt for an integer parsing routine.
>
> //Attempts to parse s as an integer. The valid integer string consists of
> //the following:
> //* '+' or '-' sign as the first character (- only acceptable for signed
> //  integral types)
> //* prefix (0) indicating octal base (applies only when base is 0 or 8)
> //* prefix (0x or 0X) indicating hexadecimal base (applies only when base
> //  is 16 or 0).
> //* All of the rest of the characters MUST be digits.
> //Returns true if an integral value was successfully parsed and stores the
> //value in val,
> //otherwise returns false and leaves val unmodified.
> //Sets errno to ERANGE if the string was an integer but would overflow
> //type integral.
> template <typename integral>
> constexpr bool strto(string_view s, integral& val, int base);
>
> //Same as the previous, except that instead of trying to parse the entire
> //string, we only parse the integral part.
> //The beginning of the string must be an integer as specified above. Will
> //set tail to point to the end of the string after the integral part.
> template <typename integral>
> constexpr bool strto(string_view s, integral& val, int base,
>                      string_view& tail);
>
>
> First off, all of these return bool which makes it very easy to check
> whether or not parsing failed.
>
> While the interface does not allow this idiom:
>
> int x = atoi(s);
>
> It works with this idiom which in all of my use cases is much more common:
> int val;
> if(!strto(s, val, 10)) {
>   throw some_error();
> }
> printf("We parsed %d!\n", val);
>
> Some examples:
>
> int val;
> string_view sv= "12345";
> assert(strto(sv, val, 10));
> assert(val == 12345);
> sv = "123 456";
> val = -2;
> assert(!strto(sv, val, 10));
> assert(val == -2);
> assert(strto(sv, val, 10, sv));
> assert(val == 123);
> assert(sv == " 456");
> sv.remove_prefix(1); //chop off the " ";
> assert(sv == "456");
> assert(strto(sv, val, 10));
> assert(val == 456);
> val = 0;
> assert(strto(sv, val, 10, sv));
> assert(val == 456);
> assert(sv == "");
>
>
> Similarly we can define this for floating point types. We may also want
> null terminated const char* versions as converting a const char* to
> string_view requires a call to strlen().
>
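
For reference, the two strto() overloads quoted above could be sketched
roughly as follows. This is an illustration under stated assumptions, not
the proposed implementation: it uses std::string_view, is not constexpr
(it writes errno), handles only explicit bases 2..36 with no 0/0x prefix
or base 0 detection, and assumes an ASCII-compatible character set and
two's complement integers. The helper digit_value() is hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <limits>
#include <string_view>
#include <type_traits>

// Hypothetical helper: digit value of c, or -1 if c is not a digit.
inline int digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'z') return c - 'a' + 10;
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
    return -1;
}

// Parses the integer at the front of s and sets tail to the remainder.
// On failure, val and tail are left unmodified.
template <typename Integral>
bool strto(std::string_view s, Integral& val, int base,
           std::string_view& tail) {
    static_assert(std::is_integral_v<Integral>, "integral types only");
    assert(base >= 2 && base <= 36);  // base 0 / prefixes not handled here

    std::size_t i = 0;
    bool negative = false;
    if (i < s.size() && (s[i] == '+' || s[i] == '-')) {
        negative = (s[i] == '-');
        if (negative && !std::is_signed_v<Integral>) return false;
        ++i;
    }

    // Largest magnitude that fits: max(), or max() + 1 for negatives.
    const unsigned long long limit =
        static_cast<unsigned long long>(std::numeric_limits<Integral>::max()) +
        (negative ? 1u : 0u);

    unsigned long long acc = 0;
    const std::size_t first_digit = i;
    for (; i < s.size(); ++i) {
        const int d = digit_value(s[i]);
        if (d < 0 || d >= base) break;
        const unsigned ud = static_cast<unsigned>(d);
        const unsigned ub = static_cast<unsigned>(base);
        if (acc > (limit - ud) / ub) {  // acc * base + d would exceed limit
            errno = ERANGE;
            return false;
        }
        acc = acc * ub + ud;
    }
    if (i == first_digit) return false;  // no digits

    if (negative && acc != 0) {
        // Negate without overflowing when acc == max() + 1.
        val = static_cast<Integral>(-static_cast<long long>(acc - 1) - 1);
    } else {
        val = static_cast<Integral>(acc);
    }
    tail = s.substr(i);
    return true;
}

// Strict version: the entire string must be the integer.
template <typename Integral>
bool strto(std::string_view s, Integral& val, int base) {
    Integral tmp{};
    std::string_view tail;
    if (!strto(s, tmp, base, tail) || !tail.empty()) return false;
    val = tmp;
    return true;
}
```

Because s is passed by value, the call strto(sv, val, 10, sv) from the
quoted examples works even though sv is both the input and the tail.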


------=_Part_62_24287923.1391920628646--

.