Thread

Topic: String to T conversions - getting it right

Author: Nevin Liber <nevin@eviloverlord.com>
Date: Wed, 29 Jan 2014 10:23:59 -0600


On 29 January 2014 10:18, Bengt Gustafsson <bengt.gustafsson@beamways.com> wrote:

> Regarding default values produced when the conversion fails this is
> another argument for this style:
>
> <error return type> from_string(T& dest, string_view& src)
>
> Now the standard can specify that the function shall not touch dest if
> conversion fails.
>

How exactly do you guarantee that if an exception is thrown while dest is
being mutated?
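
To make the concern concrete, here is a minimal sketch of an out-parameter from_string that parses into a local and then commits. It is not taken from any proposal: the type Fragile, its fail_on_assign test hook, and the name from_string_sketch are all hypothetical. It shows that a parse failure can leave dest alone, while a throwing assignment can still leave dest half-updated.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical destination type whose assignment can throw midway.
struct Fragile {
    std::string text;
    int length = 0;
    bool fail_on_assign = false;  // illustrative test hook only

    Fragile() = default;
    Fragile(const Fragile&) = default;

    Fragile& operator=(const Fragile& other) {
        text = other.text;  // first member is already overwritten here
        if (fail_on_assign) throw std::runtime_error("assignment failed");
        length = other.length;
        return *this;
    }
};

// Hypothetical out-parameter conversion: parse into a local, then commit.
bool from_string_sketch(Fragile& dest, std::string_view src) {
    if (src.empty()) return false;  // conversion failed: dest is untouched
    Fragile parsed;
    parsed.text = std::string(src);
    parsed.length = static_cast<int>(src.size());
    dest = parsed;  // may throw after partially mutating dest
    return true;
}
```

The "shall not touch dest" promise therefore only holds if committing the parsed value cannot throw, for example through a non-throwing move or swap.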
--
 Nevin ":-)" Liber  <mailto:nevin@eviloverlord.com>  (847) 691-1404

--

---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.


.

Author: Miro Knejp <miro@knejp.de>
Date: Wed, 29 Jan 2014 20:56:14 +0100

On 29.01.2014 17:49, Matthew Woehlke wrote:
> On 2014-01-29 11:18, Bengt Gustafsson wrote:
>> Regarding default values produced when the conversion fails this is
>> another
>> argument for this style:
>>
>> <error return type> from_string(T& dest, string_view& src)
>>
>> Now the standard can specify that the function shall not touch dest if
>> conversion fails. The default value is the previous value of the
>> variable!
>
> So... not only can I still not assign the result to a const local, now
> 'dest' potentially contains uninitialized memory? I don't see how
> that's an improvement.
>
> If it is really necessary to have a description of the failure type
> (and errno is not suitable; personally I find nothing wrong with using
> errno), then maybe a return type that is similar to std::optional with
> an additional 'why it is disengaged' could be created. (Maybe even
> subclass std::optional and call it e.g. std::result?)
>
>> // No check required
>> from_string(dest, "123").ignore();
>
> You omitted the declaration and initialization of 'dest'. IOW:
>
> // your proposal
> auto dest = int{12};
> from_string(dest, "34").ignore();
> foo(dest);
>
> - vs. -
>
> // std::optional as return type
> foo(from_string<int>("34").value_or(12));
>
> Using std::optional, I (in the above example, anyway) avoided even
> having a named variable to receive the value. And if I wanted one, I
> could make it const, which I couldn't do with your version.
>
I am now using the following interface in the format parser implementation:

pair<optional<T>, Iter> parse_integer<T>(Iter first, Iter last, int radix = 10)

and the convenience overload

optional<T> parse_integer<T>(string_view s, int radix = 10);

which could, using internal tag dispatching, be reduced to

parse<T>(...)

The signatures are very easy to use and give me all I need. Both
greedily consume as many valid characters as possible (even on
over/underflow) so parsing can continue past the (in)valid input. Sure,
you don't get a detailed error report, but how important is it really?
All I care about is whether the number was valid or not. If the optional
is disengaged I can *guess* what happened using the iterator overload.
If the returned iterator equals first, then there was no number to begin
with; otherwise the number format was wrong or over/underflow occurred.
In any case the returned iterator points to the first character past the
number. I'm not sure how important it really is to distinguish
over/underflow from an invalid pattern, though. I had no use for that
information, so I would be interested to hear about scenarios where it
really does matter.
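
The behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual implementation: it handles only non-negative ASCII digits, assumes a radix between 2 and 36, and the helper digit_value is a hypothetical name. It keeps consuming digits after an overflow, so the returned iterator always points past the number.

```cpp
#include <cassert>
#include <limits>
#include <optional>
#include <type_traits>
#include <utility>

// Hypothetical helper: ASCII digit or letter to its numeric value, -1 otherwise.
inline int digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'z') return c - 'a' + 10;
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
    return -1;
}

// Sketch of the iterator overload: no sign, no prefix, no whitespace skipping.
template <class T, class Iter>
std::pair<std::optional<T>, Iter> parse_integer(Iter first, Iter last, int radix = 10) {
    static_assert(std::is_integral<T>::value, "T must be an integer type");
    assert(radix >= 2 && radix <= 36);
    const T max = std::numeric_limits<T>::max();
    const T r = static_cast<T>(radix);
    T value = 0;
    bool any = false;
    bool overflow = false;
    for (; first != last; ++first) {
        const int digit = digit_value(*first);
        if (digit < 0 || digit >= radix) break;
        any = true;
        const T d = static_cast<T>(digit);
        if (overflow || value > (max - d) / r) {
            overflow = true;  // keep consuming digits, but stop accumulating
        } else {
            value = static_cast<T>(value * r + d);
        }
    }
    if (!any || overflow) return {std::nullopt, first};
    return {value, first};
}
```

With this shape, a disengaged optional plus an iterator equal to first means "no number at all", while a disengaged optional with an advanced iterator means the digits were there but the value did not fit.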

This version does not skip whitespace or parse the radix prefix; for
that I have separate

// Consume "0x", "0X", "0b", "0B" or "0" and return 16, 2, 8 or 0
// and an iterator to the next character
pair<int, Iter> parse_radix_prefix(Iter first, Iter last)

// Accepts 0x123, 0b111, -0123, -0b111 and uses "radix" if no prefix
// was found
pair<optional<T>, Iter> parse_prefixed_integer<T>(Iter first, Iter last, int radix = 10)
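
A possible reading of parse_radix_prefix, as a sketch only: the body is an assumption reconstructed from the one-line description above, not the author's code. It returns 0 when no prefix is present, so the caller can fall back to its own radix argument.

```cpp
#include <cassert>
#include <utility>

// Sketch: recognise "0x"/"0X" (16), "0b"/"0B" (2) or a bare leading "0" (8).
// Returns radix 0 and the unchanged iterator when there is no prefix.
template <class Iter>
std::pair<int, Iter> parse_radix_prefix(Iter first, Iter last) {
    if (first == last || *first != '0') return {0, first};
    Iter next = first;
    ++next;  // step past the leading '0'
    if (next != last) {
        if (*next == 'x' || *next == 'X') return {16, ++next};
        if (*next == 'b' || *next == 'B') return {2, ++next};
    }
    return {8, next};  // a lone "0" is reported as an octal prefix
}
```

One design question this exposes: the input "0" on its own is consumed entirely as a prefix, so a caller that wants it to parse as the number zero has to special-case an empty digit sequence after an octal prefix.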

The negative sign is only accepted if T is signed. The functions currently
ignore anything locale-specific and are ASCII only. In my opinion it's
better to have separate overloads to control whether parsing is
locale-agnostic or not. In my current use case language-neutral parsing
had higher priority, but that's just a matter of bikeshedding; overloads
doing locale-aware parsing to detect culture-based sign, grouping,
decimal separator, etc. should be available, too.

Instead of using "pair" one might have a custom utility struct where the
members aren't named "first" and "second" for clarity.


.

Author: "Vicente J. Botet Escriba" <vicente.botet@wanadoo.fr>
Date: Thu, 30 Jan 2014 08:10:29 +0100


On 29/01/14 20:56, Miro Knejp wrote:
> On 29.01.2014 17:49, Matthew Woehlke wrote:
>> On 2014-01-29 11:18, Bengt Gustafsson wrote:
>>> Regarding default values produced when the conversion fails this is
>>> another
>>> argument for this style:
>>>
>>> <error return type> from_string(T& dest, string_view& src)
>>>
>>> Now the standard can specify that the function shall not touch dest if
>>> conversion fails. The default value is the previous value of the
>>> variable!
>>
>> So... not only can I still not assign the result to a const local,
>> now 'dest' potentially contains uninitialized memory? I don't see how
>> that's an improvement.
>>
>> If it is really necessary to have a description of the failure type
>> (and errno is not suitable; personally I find nothing wrong with
>> using errno), then maybe a return type that is similar to
>> std::optional with an additional 'why it is disengaged' could be
>> created. (Maybe even subclass std::optional and call it e.g.
>> std::result?)
>>
>>> // No check required
>>> from_string(dest, "123").ignore();
>>
>> You omitted the declaration and initialization of 'dest'. IOW:
>>
>> // your proposal
>> auto dest = int{12};
>> from_string(dest, "34").ignore();
>> foo(dest);
>>
>> - vs. -
>>
>> // std::optional as return type
>> foo(from_string<int>("34").value_or(12));
>>
>> Using std::optional, I (in the above example, anyway) avoided even
>> having a named variable to receive the value. And if I wanted one, I
>> could make it const, which I couldn't do with your version.
>>
> I am now using the following interface in the format parser
> implementation:
>
> pair<optional<T>, Iter> parse_integer<T>(Iter first, Iter last, int
> radix = 10)
>


Given a function

In Boost.Expected we have an example with something like

pair< Iter, expected<T, std::ios_base::iostate>> parse_integer<T>(Iter
first, Iter last);

or

expected< pair< Iter, T>, pair<Iter, std::ios_base::iostate>>
parse_integer<T>(Iter first, Iter last);

A parse_integer_range could be implemented as

expected< pair< Iter, pair<T,T>>, pair<Iter, std::ios_base::iostate>>
parse_integer_range<T>(Iter s, Iter e) {
     auto f = parse_integer<T>(s, e); RETURN_IF_UNEXPECTED(f);
     auto m = parse_string("..", f.first, e); RETURN_IF_UNEXPECTED(m);
     auto l = parse_integer<T>(m.first, e); RETURN_IF_UNEXPECTED(l);
     return make_expected(make_pair(l.first, make_pair(f.second, l.second)));
}

where

#define RETURN_IF_UNEXPECTED(f) if (! f) return f.get_exceptional();

Note that pair< Iter, expected<T, std::ios_base::iostate>> can also be
seen as equivalent to expected< pair< Iter, T>,
pair<Iter, std::ios_base::iostate>>, and so it can be made a monad as well.

I would like to be able to write it just as

expected< pair< Iter, pair<T,T>>, pair<Iter, std::ios_base::iostate>>
parse_integer_range<T>(Iter s, Iter e) {
     auto f = await parse_integer<T>(s, e);
     auto m = await parse_string("..", f.first, e);
     auto l = await parse_integer<T>(m.first, e);
     return make_pair(l.first, make_pair(f.second, l.second));
}

The keyword await could be subject to discussion.
The advantage here is that we write the code as if parse_integer
threw an exception in case of errors.
The await operator would make parse_integer_range return if the
expression on its right has an error stored.
The returned value would have the return type of parse_integer_range,
holding the stored error.
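
For readers who want to try the early-return style today, here is a self-contained sketch. It does not use Boost.Expected: expected_sketch and unexpected_sketch are hypothetical stand-ins built on std::variant, and parse_integer and parse_string are simplified to int and decimal digits. Only the control flow is meant to mirror the RETURN_IF_UNEXPECTED version.

```cpp
#include <cassert>
#include <ios>
#include <limits>
#include <string_view>
#include <utility>
#include <variant>

// Hypothetical stand-ins for the proposed expected<T, E>; NOT Boost.Expected's API.
template <class E>
struct unexpected_sketch { E error; };

template <class T, class E>
class expected_sketch {
    std::variant<T, E> storage_;
public:
    expected_sketch(T value) : storage_(std::in_place_index<0>, std::move(value)) {}
    expected_sketch(unexpected_sketch<E> u) : storage_(std::in_place_index<1>, std::move(u.error)) {}
    explicit operator bool() const { return storage_.index() == 0; }
    const T* operator->() const { assert(storage_.index() == 0); return &std::get<0>(storage_); }
    const E& error() const { return std::get<1>(storage_); }
};

using Iter = std::string_view::const_iterator;
using parse_error = std::pair<Iter, std::ios_base::iostate>;
template <class T>
using parsed = expected_sketch<std::pair<Iter, T>, parse_error>;

// Propagate the stored error to the caller, whatever its value type is.
#define RETURN_IF_UNEXPECTED(x) \
    do { if (!(x)) return unexpected_sketch<parse_error>{(x).error()}; } while (0)

// Simplified: non-negative decimal int only.
parsed<int> parse_integer(Iter first, Iter last) {
    int value = 0;
    Iter it = first;
    for (; it != last && *it >= '0' && *it <= '9'; ++it) {
        const int digit = *it - '0';
        if (value > (std::numeric_limits<int>::max() - digit) / 10)
            return unexpected_sketch<parse_error>{{it, std::ios_base::failbit}};
        value = value * 10 + digit;
    }
    if (it == first)
        return unexpected_sketch<parse_error>{{it, std::ios_base::failbit}};
    return std::make_pair(it, value);
}

// Match a literal such as ".." at the current position.
parsed<std::string_view> parse_string(std::string_view literal, Iter first, Iter last) {
    Iter it = first;
    for (char c : literal) {
        if (it == last || *it != c)
            return unexpected_sketch<parse_error>{{it, std::ios_base::failbit}};
        ++it;
    }
    return std::make_pair(it, literal);
}

parsed<std::pair<int, int>> parse_integer_range(Iter s, Iter e) {
    auto f = parse_integer(s, e);              RETURN_IF_UNEXPECTED(f);
    auto m = parse_string("..", f->first, e);  RETURN_IF_UNEXPECTED(m);
    auto l = parse_integer(m->first, e);       RETURN_IF_UNEXPECTED(l);
    return std::make_pair(l->first, std::make_pair(f->second, l->second));
}
```

The macro does by hand what the hypothetical await operator would do implicitly: unwrap on success, return the stored error re-wrapped in the enclosing function's return type on failure.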
> and the convenience overload
>
> optional<T> parse_integer<T>(string_view s, int radix = 10);
How would you use this overload? Could you define a parse_integer_range
with it?
Or is the intent to match the whole string, in which case the name should
be match_integer?
>
> which could, using internal tag dispatching, be reduced to
>
> parse<T>(...)
>
> The signatures are very easy to use and give me all I need. Both
> greedily consume as many valid characters as possible (even on
> over/underflow) so parsing can continue past the (in)valid input.
> Sure, you don't get a detailed error report but how important is it
> really? All I care about is whether the number was valid or not. If
> the optional is disengaged I can *guess* what happened using the
> iterator overload. If the returned iterator equals first, then there
> was no number to begin with, otherwise the number format was wrong or
> over/underflow occurred. In any case the returned iterator points to
> the first character past the number. Not sure how important it really
> is to distinguish over/underflow from an invalid pattern though. I had
> no use for that information so I would be interested to hear about
> scenarios where it really does matter.
>
Maybe your application doesn't care about the detailed error, but when
designing a library it is better to provide as much information as has
been obtained so that the user can do whatever she needs.


Best,
Vicente


.

Author: Miro Knejp <miro@knejp.de>
Date: Thu, 30 Jan 2014 10:21:25 +0100



>
> Given a function
>
> In Boost.Expected we have an example with something like
>
> pair< Iter, expected<T, std::ios_base::iostate>> parse_integer<T>(Iter
> first, Iter last);
iostate is probably not very helpful since there might not be any
streams involved so all you'd get is failbit or goodbit.
>
> or
>
> expected< pair< Iter, T>, pair<Iter, std::ios_base::iostate>>
> parse_integer<T>(Iter first, Iter last);
>
> A parse_integer_range could be implemented as
>
> expected< pair< Iter, pair<T,T>>, pair<Iter, std::ios_base::iostate>>
> parse_integer_range<T>(Iter s, Iter e) {
>     auto f = parse_integer<T>(s, e); RETURN_IF_UNEXPECTED(f);
>     auto m = parse_string("..", f.first, e); RETURN_IF_UNEXPECTED(m);
>     auto l = parse_integer<T>(m.first, e); RETURN_IF_UNEXPECTED(l);
>     return make_expected(make_pair(l.first, make_pair(f.second, l.second)));
> }
>
> where
>
> #define RETURN_IF_UNEXPECTED(f) if (! f) return f.get_exceptional();
>
> Note that we can also see pair< Iter, expected<T,
> std::ios_base::iostate>> as equivalent to expected< pair< Iter, T>,
> pair<Iter, std::ios_base::iostate>> and so make it a monad also.
>
> I would like to be able to write it just as
>
> expected< pair< Iter, pair<T,T>>, pair<Iter, std::ios_base::iostate>>
> parse_integer_range<T>(Iter s, Iter e) {
>     auto f = await parse_integer<T>(s, e);
>     auto m = await parse_string("..", f.first, e);
>     auto l = await parse_integer<T>(m.first, e);
>     return make_pair(l.first, make_pair(f.second, l.second));
> }
>
> The keyword await could be subject to discussion.
> The advantage here is that we write the code as if parse_integer
> threw an exception in case of errors.
> The await operator would make parse_integer_range return if the
> expression on its right has an error stored.
> The returned value would have the return type of parse_integer_range,
> holding the stored error.
>> and the convenience overload
>>
>> optional<T> parse_integer<T>(string_view s, int radix = 10);
> How would you use this overload? Could you define a parse_integer_range
> with it?
No. That's what the iterator overloads are for. Also, what makes
parse_integer_range so special?
> Or is the intent to match the whole string and so the name should be
> match_integer?
It's called a *convenience* overload for a reason. The idea was to
provide a simple (novice-friendly) interface for the very basic and
introductory/trivial use cases. All it does is call parse_xxx(begin(s),
end(s), ...) and discard the returned iterator. Alternatively one might
return a new string_view with the remainder, but that doesn't really add
to its simplicity. If the intention was to match the string exactly I
would have called it match_xxx. That of course doesn't mean there's
no use for a match_xxx-like interface, but its implementation is trivial
once you have parse_xxx.
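
As a rough illustration of how thin both wrappers are, here is a sketch layered on a simplified iterator overload. The parse_integer shown is a minimal stand-in (decimal digits, no sign, no radix parameter), and match_integer is a hypothetical name taken from the discussion, not an existing function.

```cpp
#include <cassert>
#include <limits>
#include <optional>
#include <string_view>
#include <utility>

// Minimal stand-in for the iterator overload: decimal digits only, no sign.
template <class T, class Iter>
std::pair<std::optional<T>, Iter> parse_integer(Iter first, Iter last) {
    T value = 0;
    bool any = false;
    bool overflow = false;
    for (; first != last && *first >= '0' && *first <= '9'; ++first) {
        const T digit = static_cast<T>(*first - '0');
        any = true;
        if (overflow || value > (std::numeric_limits<T>::max() - digit) / 10) {
            overflow = true;
        } else {
            value = static_cast<T>(value * 10 + digit);
        }
    }
    if (!any || overflow) return {std::nullopt, first};
    return {value, first};
}

// Convenience overload: parse a leading number and discard the iterator.
template <class T>
std::optional<T> parse_integer(std::string_view s) {
    return parse_integer<T>(s.begin(), s.end()).first;
}

// match_integer: succeed only if the whole string was consumed.
template <class T>
std::optional<T> match_integer(std::string_view s) {
    auto result = parse_integer<T>(s.begin(), s.end());
    if (result.second != s.end()) return std::nullopt;
    return result.first;
}
```

With these definitions, "12ab" parses to 12 through the convenience overload but is rejected by match_integer, which is the distinction the two names are meant to carry.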
>>
>> which could, using internal tag dispatching, be reduced to
>>
>> parse<T>(...)
>>
>> The signatures are very easy to use and give me all I need. Both
>> greedily consume as many valid characters as possible (even on
>> over/underflow) so parsing can continue past the (in)valid input.
>> Sure, you don't get a detailed error report but how important is it
>> really? All I care about is whether the number was valid or not. If
>> the optional is disengaged I can *guess* what happened using the
>> iterator overload. If the returned iterator equals first, then there
>> was no number to begin with, otherwise the number format was wrong or
>> over/underflow occurred. In any case the returned iterator points to
>> the first character past the number. Not sure how important it really
>> is to distinguish over/underflow from an invalid pattern though. I
>> had no use for that information so I would be interested to hear
>> about scenarios where it really does matter.
>>
> Maybe your application doesn't care about the detailed error, but when
> designing a library it is better to provide as much information as has
> been obtained so that the user can do whatever she needs.
I didn't mean to imply nobody had a use for it. If the error type used
in expected<> is not a convoluted object like an exception (which
usually requires allocation of an error message) I'm all for adding it.
Maybe errc with values like errc::value_too_large or
errc::invalid_argument. We have this enum now, so why not make use of it?
As long as there are no interactions/side effects with errno. Whatever
the return type is, I think it is beneficial for all if the syntax
x = parse_something(...)
if(x) // or x.first depending on the overload used
     ...

is well formed and intuitive.
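
One way this could look, as a sketch under assumptions: parse_result is a hypothetical type, and parse_something here is a toy decimal parser invented for the example. The error is a plain std::errc, so nothing is allocated and errno is never read or written.

```cpp
#include <cassert>
#include <limits>
#include <string_view>
#include <system_error>

// Hypothetical return type: a value or a std::errc, testable with if (x).
template <class T>
class parse_result {
    T value_{};
    std::errc error_{};  // value-initialized errc (0) means "no error"
public:
    parse_result(T value) : value_(value) {}
    parse_result(std::errc error) : error_(error) {}
    explicit operator bool() const { return error_ == std::errc{}; }
    const T& operator*() const { assert(error_ == std::errc{}); return value_; }
    std::errc error() const { return error_; }
};

// Toy parser for illustration: leading decimal digits, non-negative int.
parse_result<int> parse_something(std::string_view s) {
    if (s.empty() || s.front() < '0' || s.front() > '9')
        return std::errc::invalid_argument;
    int value = 0;
    for (char c : s) {
        if (c < '0' || c > '9') break;
        const int digit = c - '0';
        if (value > (std::numeric_limits<int>::max() - digit) / 10)
            return std::errc::value_too_large;
        value = value * 10 + digit;
    }
    return value;
}
```

Because operator bool is explicit, `if (x)` works as intended while accidental conversions such as `int n = x;` are rejected at compile time.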

--

---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.

--------------000605010506040904060601
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<html>
  <head>
    <meta content=3D"text/html; charset=3DUTF-8" http-equiv=3D"Content-Type=
">
  </head>
  <body text=3D"#000000" bgcolor=3D"#FFFFFF">
    <br>
    <blockquote cite=3D"mid:52E9FAE5.9090906@wanadoo.fr" type=3D"cite"> <br=
>
      Given a function <br>
      <br>
      In Boost.Expected we have an example with something like <br>
      <br>
      pair&lt; Iter, expected&lt;T, std::ios_base::iostate&gt;&gt;
      parse_integer&lt;T&gt;(Iter first, Iter last);<br>
    </blockquote>
    iostate is probably not very helpful since there might not be any
    streams involved so all you'd get is failbit or goodbit.<br>
    <blockquote cite=3D"mid:52E9FAE5.9090906@wanadoo.fr" type=3D"cite"> <br=
>
      or=C2=A0 <br>
      <br>
      expected&lt; pair&lt; Iter, T&gt;, pair&lt;Iter,
      std::ios_base::iostate&gt;&gt; parse_integer&lt;T&gt;(Iter first,
      Iter last);<br>
      <br>
      A parse interger range could be implemented as<br>
      =C2=A0<br>
      expected&lt; pair&lt; Iter, pair&lt;T,T&gt;&gt;, pair&lt;Iter,
      std::ios_base::iostate&gt;&gt; parse_integer_range&lt;T&gt;(Iter
      s, Iter e) {<br>
      =C2=A0=C2=A0=C2=A0 auto f =3D parse_integer&lt;T&gt;(s, e);
      RETURN_IF_UNEXPECTED(f);<br>
      =C2=A0=C2=A0=C2=A0 auto m =3D parse_string("..", f.first, e);
      RETURN_IF_UNEXPECTED(m);<br>
      =C2=A0=C2=A0=C2=A0 auto l =3D parse_integer&lt;T&gt;(m, e);
      RETURN_IF_UNEXPECTED(l);<br>
      =C2=A0=C2=A0=C2=A0 return make_expected(make_pair(l.first, make_pair(=
f.second,
      l.second))));<br>
      }<br>
      <br>
      where <br>
      <br>
      #define RETURN_IF_UNEXPECTED(f) if (! f) return
      f.get_exceptional();<br>
      <br>
      Note that we can also see pair&lt; Iter, expected&lt;T,
      std::ios_base::iostate&gt;&gt; as equivalent to expected&lt;
      pair&lt; Iter, T&gt;, pair&lt;Iter, std::ios_base::iostate&gt;&gt;
      and so make it a monad also.<br>
      <br>
      I would like to be able to write it just as<br>
      <br>
      expected&lt; pair&lt; Iter, pair&lt;T,T&gt;&gt;, pair&lt;Iter,
      std::ios_base::iostate&gt;&gt; parse_integer_range&lt;T&gt;(Iter
      s, Iter e) {<br>
      =C2=A0=C2=A0=C2=A0 auto f =3D <b>await</b> parse_integer&lt;T&gt;(s, =
e); <br>
      =C2=A0=C2=A0=C2=A0 auto m =3D <b>await</b> parse_string("..", f.first=
, e); <br>
      =C2=A0=C2=A0=C2=A0 auto l =3D <b>await</b> parse_integer&lt;T&gt;(m, =
e); <br>
      =C2=A0=C2=A0=C2=A0 return make_pair(l.first, make_pair(f.second, l.se=
cond)));<br>
      }<br>
      <br>
      The keyword await could be subject to discussion. <br>
      The advantage here is that we are writing the code as if the
      functions parse_integer thrown an exception in case of errors.<br>
      The await operator would make return the parse_integer_range if
      the expression on the right has an error stored. <br>
      The returned value would have the type of the parse_integer_range
      with the stored error.<br>
      <blockquote cite=3D"mid:52E95CDE.40007@knejp.de" type=3D"cite">and th=
e
        convenience overload <br>
        <br>
        optional&lt;T&gt; parse_integer&lt;T&gt;(string_view s, int
        radix =3D 10); <br>
      </blockquote>
      How do will use this overload? Could you define a
      parse_interger_range with?<br>
    </blockquote>
    No. That's what the iterator overloads are for. Also, what makes
    parse_integer_range so special?<br>
    <blockquote cite=3D"mid:52E9FAE5.9090906@wanadoo.fr" type=3D"cite"> Or
      is the intent to match the whole string and so the name should be
      match_integer?<br>
    </blockquote>
    It's called a *convenience* overload for a reason. The idea was to
    provide a simple (novice friendly) interface for the very basic and
    introductory/trivial use cases. All it does is call
    parse_xxx(begin(s), end(s), ...) and discards the returned iterator.
    Alternatively one might return a new string_view with the remainder
    but that doesn't really add to its simplicity. If the intention was
    to match the string exactly I would have called it match_xxx. Which
    of course shouldn't mean there's no use for a match_xxx-like
    interface but it's implementation is trivial once you have
    parse_xxx.<br>
    <blockquote cite=3D"mid:52E9FAE5.9090906@wanadoo.fr" type=3D"cite">
      <blockquote cite=3D"mid:52E95CDE.40007@knejp.de" type=3D"cite"> <br>
        which could, using internal tag dispatching, be reduced to <br>
        <br>
        parse&lt;T&gt;(...) <br>
        <br>
        The signatures are very easy to use and give me all I need. Both
        greedily consume as many valid characters as possible (even on
        over/underflow) so parsing can continue past the (in)valid
        input. Sure, you don't get a detailed error report but how
        important is it really? All I care about is whether the number
        was valid or not. If the optional is disengaged I can *guess*
        what happened using the iterator overload. If the returned
        iterator equals first, then there was no number to begin with,
        otherwise the number format was wrong or over/underflowed
        occured. In any case the returned iterator points to the first
        character past the number. Not sure how important it really is
        to distinguish over/underflow from an invalid pattern though. I
        had no use for that information so I would be interested to hear
        about scenarios where it really does matter. <br>
        <br>
      </blockquote>
      Maybe your application don't care of the detailed error, but when
      designing a library it is better to provide as much information as
      has been obtained so that the user can do whatever she needs.<br>
    </blockquote>
    I didn't mean to imply nobody had a use for it. If the error type
    used in expected&lt;&gt; is not a convoluted object like an
    exception (which usually requires allocation of an error message)
    I'm all for adding it. Maybe errc with values like
    errc::value_too_large or errc::invalid_argument. We have this enum
    now so why not make use of it. As long as there are no
    interactions/side effects with errno. Whatever the return type is, I
    think it is beneficial for all if the syntax<br>
    <br>
    x =3D parse_something(...)<br>
    if(x) // or x.first depending on the overload used<br>
    =C2=A0=C2=A0=C2=A0 ...<br>
    <br>
    is well formed and intuitive.<br>
    <br>
  </body>
</html>


--------------000605010506040904060601--


.

Author: Bengt Gustafsson <bengt.gustafsson@beamways.com>
Date: Thu, 30 Jan 2014 04:49:58 -0800 (PST) Raw View

------=_Part_27_4565764.1391086198852
Content-Type: text/plain; charset=UTF-8

Nevin: I was only thinking of simple types when I contemplated the ignore()
paradigm. My bad, we should strive to make this useful even for cases
where copying a T, after making sure the actual parsing went OK, can itself
throw. Thus an API that returns something containing either a T or an error
code seems better. optional<T> is similar to what we need but only
contains a bool, not the code. The value_or method of optional<T> seems
like a good way to say "I'm handling any possible error by using
this default value", which is one important use case.
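
To make the shape concrete, here is a minimal sketch of such a return type.
The names parse_result and parse_digit are made up for illustration; this is
neither Boost.Expected nor any proposed std type, and it assumes T is default
constructible and is not std::errc itself.

```cpp
#include <cassert>
#include <system_error>  // std::errc

// Hypothetical result type: a value plus an error code, where a
// value-initialized std::errc() means "no error".
template <typename T>
class parse_result {
public:
    parse_result(T value) : value_(value), error_(std::errc()) {}
    parse_result(std::errc error) : value_(), error_(error) {}

    explicit operator bool() const { return error_ == std::errc(); }
    std::errc error() const { return error_; }

    // Precondition: the result holds a value.
    const T& value() const {
        assert(error_ == std::errc());
        return value_;
    }

    // "I'm handling any possible error by using this default value."
    T value_or(T fallback) const {
        return error_ == std::errc() ? value_ : fallback;
    }

private:
    T value_;
    std::errc error_;
};

// Toy parser used only to exercise the type: one decimal digit.
inline parse_result<int> parse_digit(char c) {
    if (c < '0' || c > '9') return std::errc::invalid_argument;
    return c - '0';
}
```

A caller that only wants a default writes parse_digit(c).value_or(-1); a
caller that cares about the reason can test the result and read error().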

The Boost::expected template suggested by Miro comes close to what I want,
but does not make sure that the error code was checked in its dtor, which I
think was the main feature of my proposed error_return class. I don't know
whether there is a proposal for a std::expected but if so I suggest that it
should do the destructor error check that error_return does.

On Thursday, 30 January 2014 at 01:21:25 UTC-8, Miro Knejp wrote:
>
>
>
> Given a function
>
> In Boost.Expected we have an example with something like
>
> pair< Iter, expected<T, std::ios_base::iostate>> parse_integer<T>(Iter
> first, Iter last);
>
> iostate is probably not very helpful since there might not be any streams
> involved so all you'd get is failbit or goodbit.
>
>
> or
>
> expected< pair< Iter, T>, pair<Iter, std::ios_base::iostate>>
> parse_integer<T>(Iter first, Iter last);
>
> A parse integer range could be implemented as
>
> expected< pair< Iter, pair<T,T>>, pair<Iter, std::ios_base::iostate>>
> parse_integer_range<T>(Iter s, Iter e) {
>     auto f = parse_integer<T>(s, e); RETURN_IF_UNEXPECTED(f);
>     auto m = parse_string("..", f.first, e); RETURN_IF_UNEXPECTED(m);
>     auto l = parse_integer<T>(m, e); RETURN_IF_UNEXPECTED(l);
>     return make_expected(make_pair(l.first, make_pair(f.second,
> l.second)));
> }
>
> where
>
> #define RETURN_IF_UNEXPECTED(f) if (! f) return f.get_exceptional();
>
> Note that we can also see pair< Iter, expected<T, std::ios_base::iostate>>
> as equivalent to expected< pair< Iter, T>, pair<Iter,
> std::ios_base::iostate>> and so make it a monad also.
>
> I would like to be able to write it just as
>
> expected< pair< Iter, pair<T,T>>, pair<Iter, std::ios_base::iostate>>
> parse_integer_range<T>(Iter s, Iter e) {
>     auto f = *await* parse_integer<T>(s, e);
>     auto m = *await* parse_string("..", f.first, e);
>     auto l = *await* parse_integer<T>(m, e);
>     return make_pair(l.first, make_pair(f.second, l.second));
> }
>
> The keyword await could be subject to discussion.
> The advantage here is that we write the code as if parse_integer threw
> an exception in case of errors.
> The await operator would make parse_integer_range return early if the
> expression on its right has an error stored.
> The returned value would have the return type of parse_integer_range and
> hold the stored error.
>
> and the convenience overload
>
> optional<T> parse_integer<T>(string_view s, int radix = 10);
>
> How would you use this overload? Could you define a parse_integer_range
> with it?
>
> No. That's what the iterator overloads are for. Also, what makes
> parse_integer_range so special?
>
> Or is the intent to match the whole string and so the name should be
> match_integer?
>
> It's called a *convenience* overload for a reason. The idea was to provide
> a simple (novice friendly) interface for the very basic and
> introductory/trivial use cases. All it does is call parse_xxx(begin(s),
> end(s), ...) and discards the returned iterator. Alternatively one might
> return a new string_view with the remainder but that doesn't really add to
> its simplicity. If the intention was to match the string exactly I would
> have called it match_xxx. Which of course shouldn't mean there's no use for
> a match_xxx-like interface but its implementation is trivial once you have
> parse_xxx.
>
>
> which could, using internal tag dispatching, be reduced to
>
> parse<T>(...)
>
> The signatures are very easy to use and give me all I need. Both greedily
> consume as many valid characters as possible (even on over/underflow) so
> parsing can continue past the (in)valid input. Sure, you don't get a
> detailed error report but how important is it really? All I care about is
> whether the number was valid or not. If the optional is disengaged I can
> *guess* what happened using the iterator overload. If the returned iterator
> equals first, then there was no number to begin with, otherwise the number
> format was wrong or an over/underflow occurred. In any case the returned
> iterator points to the first character past the number. Not sure how
> important it really is to distinguish over/underflow from an invalid
> pattern though. I had no use for that information so I would be interested
> to hear about scenarios where it really does matter.
>
>  Maybe your application doesn't care about the detailed error, but when
> designing a library it is better to provide as much information as has been
> obtained so that the user can do whatever she needs.
>
> I didn't mean to imply nobody had a use for it. If the error type used in
> expected<> is not a convoluted object like an exception (which usually
> requires allocation of an error message) I'm all for adding it. Maybe errc
> with values like errc::value_too_large or errc::invalid_argument. We have
> this enum now so why not make use of it. As long as there are no
> interactions/side effects with errno. Whatever the return type is, I think
> it is beneficial for all if the syntax
>
> x = parse_something(...)
> if(x) // or x.first depending on the overload used
>     ...
>
> is well formed and intuitive.
>
>


------=_Part_27_4565764.1391086198852--

.

Author: Nevin Liber <nevin@eviloverlord.com>
Date: Thu, 30 Jan 2014 10:44:54 -0600 Raw View

--001a11c3841a44eadd04f132c9c4
Content-Type: text/plain; charset=ISO-8859-1

On 30 January 2014 06:49, Bengt Gustafsson <bengt.gustafsson@beamways.com> wrote:

>
>
> The Boost::expected template suggested by Miro comes close to what I want,
> but does not make sure that the error code was checked in its dtor, which I
> think was the main feature of my proposed error_return class.
>

You do know that it is extremely unlikely for a class which throws from its
destructor to be approved by the committee anytime soon, right?  First you
would need to address the issues brought up in the article and discussion
at <http://cpp-next.com/archive/2012/08/evil-or-just-misunderstood/>.

Besides, it seems to me it ought to assert, not throw, as I can't imagine
any circumstance where this isn't a programming bug.
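
For concreteness, a rough sketch of the asserting variant. The class name
checked_error and the helper always_overflows are hypothetical; this is not
the error_return class from the earlier mail, only an illustration of a
destructor that asserts instead of throwing.

```cpp
#include <cassert>
#include <system_error>  // std::errc

// Hypothetical sketch: the destructor asserts that the caller looked at
// the error code at least once. Forgetting to check is treated as a
// programming bug, so nothing is thrown.
class checked_error {
public:
    explicit checked_error(std::errc code) : code_(code), checked_(false) {}

    // Moving transfers the obligation to check to the new object.
    checked_error(checked_error&& other)
        : code_(other.code_), checked_(other.checked_) {
        other.checked_ = true;
    }
    checked_error(const checked_error&) = delete;
    checked_error& operator=(const checked_error&) = delete;

    ~checked_error() { assert(checked_ && "error code was never checked"); }

    bool ok() { checked_ = true; return code_ == std::errc(); }
    std::errc code() { checked_ = true; return code_; }

private:
    std::errc code_;
    bool checked_;
};

// Illustration only: a function whose result must be inspected.
inline checked_error always_overflows() {
    return checked_error(std::errc::result_out_of_range);
}
```

With NDEBUG defined the check disappears entirely, which is the usual
trade-off of assert.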
--
 Nevin ":-)" Liber  <mailto:nevin@eviloverlord.com>  (847) 691-1404


--001a11c3841a44eadd04f132c9c4--

.

Author: Bengt Gustafsson <bengt.gustafsson@beamways.com>
Date: Fri, 31 Jan 2014 15:00:38 -0800 (PST) Raw View

------=_Part_701_1058102.1391209238498
Content-Type: text/plain; charset=UTF-8

You are right, it should be an assert of course. In the return code
football thread I just wrote about an outlandish feature that would make
this a static_assert even! This is a check to make sure that the conversion
error code is being checked, so it is really a static feature of the code
at the call site.

I won't repeat myself here. I would rather defer discussions on the error
return method to that thread:

https://groups.google.com/a/isocpp.org/forum/#!topic/std-proposals/260PWIq_7u0

On Thursday, 30 January 2014 at 17:44:54 UTC+1, Nevin ":-)" Liber wrote:
>
> On 30 January 2014 06:49, Bengt Gustafsson <bengt.gu...@beamways.com> wrote:
>
>>
>>
>> The Boost::expected template suggested by Miro comes close to what I
>> want, but does not make sure that the error code was checked in its dtor,
>> which I think was the main feature of my proposed error_return class.
>>
>
> You do know that it is extremely unlikely for a class which throws from
> its destructor to be approved by the committee anytime soon, right?  First
> you would need to address the issues brought up in the article and
> discussion at <
> http://cpp-next.com/archive/2012/08/evil-or-just-misunderstood/>.
>
> Besides, it seems to me it ought to assert, not throw, as I can't imagine
> any circumstance where this isn't a programming bug.
> --
>  Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  (847) 691-1404
>


------=_Part_701_1058102.1391209238498--

.

Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Mon, 3 Feb 2014 03:18:44 -0800 (PST) Raw View

------=_Part_3939_15712483.1391426324606
Content-Type: text/plain; charset=UTF-8

On Wednesday, January 29, 2014 8:56:14 PM UTC+1, Miro Knejp wrote:
>
> I am now using the following interface in the format parser
> implementation:
>
> pair<optional<T>, Iter> parse_integer<T>(Iter first, Iter last, int
> radix = 10)
>
> and the convenience overload
>
> optional<T> parse_integer<T>(string_view s, int radix = 10);
>
> which could, using internal tag dispatching, be reduced to
>
> parse<T>(...)
>
> The signatures are very easy to use and give me all I need. Both

 Do you perhaps have a link to a project using this interface?


------=_Part_3939_15712483.1391426324606--

.

Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Mon, 3 Feb 2014 17:25:46 +0100 Raw View

On Mon, Feb 3, 2014 at 5:20 PM, Matthew Woehlke
<mw_triad@users.sourceforge.net> wrote:
>> I'd prefer 07 to be parsed as 7. Most non-dev people probably expect this
>> as well.
>> Is octal still being used?
>
>
> You can do this by passing base = 10. '0' as a prefix only means base 8 when
> passing base = 0 (i.e. detect from prefix).

What if I want dec and hex but no octal? ;)
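
For what it's worth, with the existing C functions this seems to need a
prefix check by the caller followed by an explicit base. A minimal sketch,
assuming a hypothetical helper named parse_dec_or_hex and ignoring signs and
error reporting:

```cpp
#include <cstdlib>   // std::strtol

// Hypothetical helper: accepts decimal and 0x/0X-prefixed hex, never octal.
// Sign handling and error reporting are left out to keep the sketch short.
inline long parse_dec_or_hex(const char* s) {
    // s[1] is only read when s[0] is '0', so it is at worst the terminator.
    const bool hex = s[0] == '0' && (s[1] == 'x' || s[1] == 'X');
    // With base 16, strtol itself accepts an optional 0x prefix.
    // With base 10, a leading 0 is just a digit, so "07" is 7 and "010" is 10.
    return std::strtol(s, nullptr, hex ? 16 : 10);
}
```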

>> What's the problem with strlen()?
>
>
> It requires additional execution cycles that don't provide any real benefit.
> And yes, that *does* matter; there are definitely cases where string to
> number conversion is a performance bottleneck, e.g. when reading large files
> of data in textual format. (I say this from actual real-world personal
> experience.)

Numbers in text files are not nul-terminated.


.

Author: Jeffrey Yasskin <jyasskin@google.com>
Date: Mon, 3 Feb 2014 09:46:27 -0800 Raw View

On Mon, Feb 3, 2014 at 9:36 AM, Matthew Woehlke
<mw_triad@users.sourceforge.net> wrote:
> On 2014-02-03 11:25, Olaf van der Spek wrote:
>>
>> On Mon, Feb 3, 2014 at 5:20 PM, Matthew Woehlke
>> <mw_triad@users.sourceforge.net> wrote:
>>>>
>>>> I'd prefer 07 to be parsed as 7. Most non-dev people probably expect
>>>> this
>>>> as well.
>>>> Is octal still being used?
>>>
>>>
>>>
>>> You can do this by passing base = 10. '0' as a prefix only means base 8
>>> when
>>> passing base = 0 (i.e. detect from prefix).
>>
>>
>> What if I want dec and hex but no octal? ;)
>
>
> That's a fair question :-). (But so is if we should throw out the 0 prefix
> as indicating octal.)
>
> I suppose you could test if the string starts with '0x' and call with either
> in,base=10 or in+2,base=16. Not saying that's ideal, though. (Even if I
> suspect that performance-wise it would be similar to base=0.)
>
>
>>>> What's the problem with strlen()?
>>>
>>>
>>> It requires additional execution cycles that don't provide any real
>>> benefit.
>>> And yes, that *does* matter; there are definitely cases where string to
>>> number conversion is a performance bottleneck, e.g. when reading large
>>> files
>>> of data in textual format. (I say this from actual real-world personal
>>> experience.)
>>
>>
>> Numbers in text files are not nul-terminated.
>
>
> They are if I'm using a CSV or XML parsing library that yields
> NUL-terminated char*. (And I seem to recall that such do exist, i.e. they
> take a char* buffer and substitute NUL at the end of "values".)

FWIW, your CSV or XML parsing library should be changed to return a
string_view or equivalent. It has the size, and is throwing it out,
forcing your number parser to do redundant checks for '\0', which
slows you down.

Yes, I know it will take time to propagate the new interface through
system libraries.
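
To illustrate the point about sizes, here is a small sketch of a parser over
a sized range. char_range is a hypothetical stand-in for string_view, and
parse_unsigned is an invented name; overflow is not handled.

```cpp
#include <cstddef>

// Hypothetical sized view standing in for string_view.
struct char_range {
    const char* data;
    std::size_t size;
};

// Parses leading decimal digits and returns how many characters were
// consumed. The loop is bounded by the known size, so it never has to
// look for a '\0' and works on input that has no terminator at all.
// Overflow is not handled in this sketch.
inline std::size_t parse_unsigned(char_range in, unsigned long& out) {
    unsigned long value = 0;
    std::size_t i = 0;
    for (; i < in.size && in.data[i] >= '0' && in.data[i] <= '9'; ++i)
        value = value * 10 + static_cast<unsigned long>(in.data[i] - '0');
    if (i > 0) out = value;   // leave out untouched when nothing was parsed
    return i;
}
```

The returned count also tells the caller where the next field starts, so
nothing in the buffer has to be overwritten with a NUL.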

> "What's the problem with strlen()" is that is has potential performance
> implications given char const* input data. I'm not convinced that trying to
> determine if there is any reasonable instance where one has char const* data
> in a situation that is performance sensitive qualifies as reason to
> disregard that.
>
> --
> Matthew

.

Author: Thiago Macieira <thiago@macieira.info>
Date: Mon, 03 Feb 2014 09:58:09 -0800 Raw View

On Mon, 3 Feb 2014, at 12:36:45, Matthew Woehlke wrote:
> > Numbers in text files are not nul-terminated.
>
> They are if I'm using a CSV or XML parsing library that yields
> NUL-terminated char*. (And I seem to recall that such do exist, i.e.
> they take a char* buffer and substitute NUL at the end of "values".)

No, they're not. None of my CSV and XML files on disk have NULs.

If you're getting a NUL, it means your library actually did malloc() to
allocate memory just so it could set a \0 there, which totally offsets the
cost of strlen. If your library is doing that, then strlen() performance is
not the issue.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358


.

Author: Thiago Macieira <thiago@macieira.org>
Date: Mon, 03 Feb 2014 11:45:13 -0800 Raw View

On Mon, 3 Feb 2014, at 13:44:23, Matthew Woehlke wrote:
> > If you're getting a NUL, it means your library actually did malloc() to
> > allocate memory just so it could set a \0 there, which totally offsets the
> > cost of strlen. If your library is doing that, then strlen() performance
> > is not the issue.
>
> I'm talking about libraries that require a mutable input buffer¹ and
> replace ends-of-"values" in that buffer with NUL. (I can't think what it
> was, offhand, but pretty sure I came across a library that did exactly
> this.)
>
> (¹ or is doing the file I/O itself and so has a mutable input buffer.)

Those are rare. Anyway, that's just the cost of strlen(), unless you can
modify the library to return the pointer to where it set the NUL. I imagine
that is a useful feature anyway, since you may want to continue parsing from
where you stopped.
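
A minimal sketch of what "return the pointer to where it set the NUL" could
look like for an in-place tokenizer. The field struct and next_field are
hypothetical, not taken from any real CSV or XML library, and the separator
is assumed to be ','.

```cpp
// Hypothetical result of one tokenizing step over a mutable buffer.
struct field {
    char* begin;  // start of the field
    char* end;    // position of the NUL that next_field wrote (or found)
    char* rest;   // where to resume, or nullptr when the buffer is exhausted
};

// Splits at the next ',' in the NUL-terminated buffer `cursor`, overwriting
// the separator with '\0' and remembering where that happened.
field next_field(char* cursor) {
    char* p = cursor;
    while (*p != '\0' && *p != ',') ++p;
    char* rest = (*p == ',') ? p + 1 : nullptr;
    *p = '\0';
    return {cursor, p, rest};
}
```

A number parser can then be handed [begin, end) directly and never needs to
call strlen(), while rest gives the position to continue from.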


.

Author: Miro Knejp <miro@knejp.de>
Date: Mon, 03 Feb 2014 21:48:45 +0100 Raw View

This is a multi-part message in MIME format.
--------------090702090608060005020109
Content-Type: text/plain; charset=UTF-8; format=flowed


On 03.02.2014 at 12:18, Olaf van der Spek wrote:
> On Wednesday, January 29, 2014 8:56:14 PM UTC+1, Miro Knejp wrote:
>
>     I am now using the following interface in the format parser
>     implementation:
>
>     pair<optional<T>, Iter> parse_integer<T>(Iter first, Iter last, int
>     radix = 10)
>
>     and the convenience overload
>
>     optional<T> parse_integer<T>(string_view s, int radix = 10);
>
>     which could, using internal tag dispatching, be reduced to
>
>     parse<T>(...)
>
>     The signatures are very easy to use and give me all I need. Both
>
>
>  Do you perhaps have a link to a project using this interface?
>
The format implementation can be found here:
https://github.com/mknejp/std-format
The definitions of the parse_xxx functions (currently only integer
versions exist) are in include/std-format/detail/parse_tools.hpp
and they are used in include/std-format/detail/format_parser.hpp.

I'm not sure how serious an example that is: to process the actual
format string I only need to parse integers at two locations in
format_parser.hpp, and there is no error reporting (only
success/failure). However, at some point I also need to start parsing
the format options for all the builtin and std types, and there it will
become clearer how useful the interface really is. The last time I had to
write number parsing myself was some 10 years ago, so please don't mind
if the implementation of the parse functions isn't perfect.

Considering the debate about octal numbers and prefixes, I went ahead and
split the functions up by use case, so I have:

parse_integer(...) <- accepts [+-]?[0-9a-zA-Z]+

parse_radix_prefix(...) <- accepts (0[xXbB]?)? and returns the radix 2, 8,
or 16, or 0 if the pattern doesn't apply

parse_prefixed_integer(...) <- accepts [+-]?(0[xXbB]?)?[0-9]+ and, if the
prefix is not recognized, uses the radix passed as argument

The actual range of valid characters in [0-9a-zA-Z] depends on the radix
and the minus sign is only accepted for signed integer types. None of
them skip any whitespace characters on any end of the string. They
consume all valid characters even if overflow occurs. Feel free to
replace any character with culture specific signs and digits when
locales apply.
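
To make the prefix grammar concrete, here is one possible reading of
parse_radix_prefix as described above. It is a sketch written for
illustration, not the code from the std-format repository, and it assumes
forward iterators over char.

```cpp
#include <utility>

// Returns {radix, next}: radix is 16 for "0x"/"0X", 2 for "0b"/"0B",
// 8 for a bare leading "0", and 0 when no prefix is present.
// `next` points just past whatever prefix was consumed.
template <class Iter>
std::pair<int, Iter> parse_radix_prefix(Iter first, Iter last) {
    if (first == last || *first != '0') return {0, first};
    Iter next = first;
    ++next;  // step over the '0'
    if (next != last) {
        if (*next == 'x' || *next == 'X') return {16, ++next};
        if (*next == 'b' || *next == 'B') return {2, ++next};
    }
    return {8, next};
}
```

Under this reading the input "0" on its own is consumed as an octal prefix,
so a caller such as parse_prefixed_integer would have to treat "prefix 8
followed by no digits" as the value zero.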

Then I thought some more about the return/error dilemma. Inspired
by the mention of match_integer, three use cases come to mind:

 1. Parsing a longer text. At some point you determine that at position
    i should be a number. This is the case where you probably need the
    most information: success/failure, error description if it failed
    and in both cases an iterator to the next position so you can
    continue processing the remaining source. I guess this is what I
    tried to cover with my parse_xxx interface.
 2. You have a string and it has to contain a number. In this case the
    source has to match exactly, with zero tolerance. This would be the
    case for a match_xxx interface where invalid characters at the end
    cause a failed conversion.
 3. You have a string of some description and expect it to begin with a
    number. Invalid characters at the end of the source range do not
    cause an error. This might involve skipping whitespace before
    the number. I see this use case occurring especially in exercises
    and introductory courses when dealing with basic user input for the
    first time. I don't really have a fitting name for such an
    interface. from_string maybe? No idea.

I see number 1 as the most fundamental. The interface is based on
iterators and can thus work with almost any type of input. 2 and 3 can
be implemented in terms of 1 and a string_view overload would probably
be used more often there. As soon as locales are involved all three
should automatically recognize the correct grouping, separating and
decimal characters.
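
The layering described above, with 2 and 3 built on 1, might look roughly
like the following. This is a sketch under assumptions: it follows the
parse_integer signature quoted earlier, std::optional and std::string_view
are the C++17 versions, match_integer and parse_prefix are illustrative
names, digits are assumed to be ASCII, locales are ignored, and on failure
the returned iterator is simply where parsing stopped.

```cpp
#include <cassert>
#include <limits>
#include <optional>
#include <string_view>
#include <type_traits>
#include <utility>

// Digit value for radices up to 36, or -1. Assumes an ASCII-like encoding
// in which the letters are contiguous.
inline int digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'z') return c - 'a' + 10;
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
    return -1;
}

// Use case 1: parse at a position; report the value (if any) and where
// parsing stopped. No whitespace is skipped.
template <class T, class Iter>
std::pair<std::optional<T>, Iter> parse_integer(Iter first, Iter last, int radix = 10) {
    static_assert(std::is_integral_v<T>, "T must be an integer type");
    assert(radix >= 2 && radix <= 36);
    using lim = std::numeric_limits<T>;
    bool negative = false;
    // '-' is only accepted for signed types.
    if (first != last && (*first == '+' || (*first == '-' && std::is_signed_v<T>))) {
        negative = (*first == '-');
        ++first;
    }
    const T r = static_cast<T>(radix);
    T value = 0;  // kept non-positive when negative, so min() is reachable
    bool any_digit = false;
    bool overflow = false;
    for (; first != last; ++first) {
        const int d = digit_value(*first);
        if (d < 0 || d >= radix) break;
        any_digit = true;
        if (overflow) continue;  // keep consuming digits after overflow
        const T digit = static_cast<T>(d);
        if (negative ? value < (lim::min() + digit) / r
                     : value > (lim::max() - digit) / r) {
            overflow = true;
        } else {
            value = static_cast<T>(negative ? value * r - digit : value * r + digit);
        }
    }
    if (!any_digit || overflow) return {std::nullopt, first};
    return {value, first};
}

// Use case 2: the whole string must be a number.
template <class T>
std::optional<T> match_integer(std::string_view s, int radix = 10) {
    auto [value, next] = parse_integer<T>(s.begin(), s.end(), radix);
    if (next != s.end()) return std::nullopt;
    return value;
}

// Use case 3: skip leading blanks, then accept a number followed by anything.
template <class T>
std::optional<T> parse_prefix(std::string_view s, int radix = 10) {
    while (!s.empty() && (s.front() == ' ' || s.front() == '\t')) s.remove_prefix(1);
    return parse_integer<T>(s.begin(), s.end(), radix).first;
}
```

Only parse_integer touches digits; the other two are thin policies about
what may surround the number, which is why the iterator version seems like
the right primitive.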

--------------090702090608060005020109--


.

Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Tue, 4 Feb 2014 00:02:12 +0100 Raw View

On Mon, Feb 3, 2014 at 11:01 PM, Matthew Woehlke
<mw_triad@users.sourceforge.net> wrote:
> If you do care about more than exactly one of the three possible output
> information parts, I don't see any way to avoid having at least one local
> variable. So what is wrong with:

user_t& u = ...
if (parse(u.age, input)) // or !parse, depending on return type
  return / throw

No local var required, no type duplication

if (auto err = parse(u.age, is)) // or !parse, depending on return type
  return err / throw

> unknown_t last_consumed; // um... what's the type of this?

(const) iterator perhaps, though it's also possible to update the
string_view in-place

> And of course, there's the case that we don't care about the status:
>
> type out = default;
> from_string(in, out).ignore();
> use(out);
>
> - versus -
>
> use(from_string<type>(in).value_or(default));

That one liner is nice and would require a wrapper indeed.
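
For what it's worth, such a wrapper is only a few lines. The sketch below
assumes a bool-returning out-parameter parse for int (built here on C++17
std::from_chars purely for illustration) and layers an optional-returning
from_string on top. Both signatures are guesses at the shapes discussed in
this thread, not a proposed API.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Out-parameter style: `out` is written only on success.
inline bool parse(int& out, std::string_view in) {
    int tmp = 0;
    const char* last = in.data() + in.size();
    auto [ptr, ec] = std::from_chars(in.data(), last, tmp);
    if (ec != std::errc{} || ptr != last) return false;
    out = tmp;
    return true;
}

// Value-returning wrapper; works for any T that has a parse(T&, string_view)
// overload (only int in this sketch).
template <class T>
std::optional<T> from_string(std::string_view in) {
    T value{};
    if (!parse(value, in)) return std::nullopt;
    return value;
}
```

With that, both call styles from this message are available:
parse(u.age, input) writes straight into the member, and
from_string<int>(in).value_or(17) gives the one-liner.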


--
Olaf


.

Author: Miro Knejp <miro@knejp.de>
Date: Tue, 04 Feb 2014 03:30:54 +0100 Raw View

This is a multi-part message in MIME format.
--------------090709080505080504040802
Content-Type: text/plain; charset=UTF-8; format=flowed


On 04.02.2014 at 02:57, Bengt Gustafsson wrote:
> @Matthew, regarding strlen avoidance: By RANGE I meant a template type
> which has the same API as required by a range-based for (which is kind
> of a built-in template function). This means that you don't have to
> create a string_view, any type of range will do. Here, for instance,
> is a char_ptr_range for this case:
>
> |
> ...
>
> |
>
When we talk about ranges, is that the same thing SG9 is working on? I'm
not sure it really matters whether the input to the parsing functions
is iterators or ranges: in the end the latter should always be
convertible to the former for interoperability with the rest of the
standard library, and the disagreement seems to be about how to return
the results of the conversion, not about the inputs. If all goes south,
a pair of iterators will always do the job...
> Note that as we are aiming for an extensible set of conversions
> including user-defined types, say WGS84 geospatial coordinates, there
> is also an open set of error codes, so an enum or int value is not
> enough. (The from_string may be wrapped in a template function which
> can't be expected to know the interpretation of an int error code for
> any T it may be instantiated for!)
>
> Now I want to check some common use cases for the solution with a
> return triplet. To complete the use cases we can also add a fourth
> member skipped which is true if we had to skip spaces. I think that
> the smartest way to solve this may be to provide a set of value()
> functions, but no cast operators:
>
> Tentatively I call the "states" of the return value:
>
> strict - no spaces skipped. All of the string could be converted.
> complete - space skipping ok, but no trailing junk.
> ok - space skipping ok, and trailing junk.
> bad - no parsing was possible, even after skipping spaces.
>
> // The actual parsing is always the same:
> const auto r = parse<int>(char_ptr_range("123"));
>
> int x = r.strict_value();   // throws if r is not strict
> int y = r.complete_value(); // throws if r is not complete
> int z = r.value();          // throws if r is not ok
> int w = r.value_or(17);     // never throws
Does the actual parsing happen in the parse() function or in r.*value*()? If
the latter, what happens if the range consists of input iterators
(e.g. istreambuf_iterator, i.e. single-pass) and I try to call an
r.*value*() function multiple times? Will it prevent me from doing that, or
is the sky just going to fall on my head? If the parsing happens
inside parse(), how do I prevent it from consuming leading
whitespace from input iterators when leading whitespace is not allowed
in my use case? It seems that in order to distinguish r.complete_value()
from r.strict_value(), the whitespace must already have been consumed;
otherwise they couldn't provide that information.

--------------090709080505080504040802--


.

Author: Magnus Fromreide <magfr@lysator.liu.se>
Date: Tue, 4 Feb 2014 08:48:14 +0100 Raw View

On Mon, Feb 03, 2014 at 09:48:45PM +0100, Miro Knejp wrote:
>
> As soon as locales are involved all three should automatically recognize
> the correct grouping, separating and decimal characters.

I think the use of locales should be optional and - importantly - follow the
"if you don't use it, you don't pay for it" rule.

/MF


.

Author: Bengt Gustafsson <bengt.gustafsson@beamways.com>
Date: Tue, 4 Feb 2014 07:33:42 -0800 (PST) Raw View

------=_Part_513_16195136.1391528022533
Content-Type: text/plain; charset=UTF-8

Miro:

Yes, leading whitespace is always consumed in parse. If you don't allow
leading whitespace, you lose some performance, as you actually convert the
number when you could have known it was an error as soon as you saw the
first space. I don't think this is a big problem. The idea is to skip the
space and set a flag in the return value if there was some space to skip.
The strict_value() function checks this flag and throws if it was set.

However, trailing space cannot be skipped until we ask for it. I don't
think this is a big deal, as we already have the remaining range in the
result object.
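
The flag-based scheme described above could be sketched as follows.
Assumptions: the input is a std::string_view rather than a generic Range,
only types supported by C++17 std::from_chars are handled, and parse_result
with its data members is hypothetical; only the names strict_value,
complete_value, value and value_or come from this thread.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string_view>
#include <system_error>

inline bool is_space(char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

template <class T>
struct parse_result {
    std::optional<T> parsed;  // empty when the state is "bad"
    bool skipped = false;     // true if leading whitespace was consumed
    std::string_view rest;    // unconsumed input after the number

    T value() const {         // "ok": spaces skipped and trailing junk allowed
        if (!parsed) throw std::invalid_argument("parse: no value");
        return *parsed;
    }
    T complete_value() const {
        // Trailing whitespace is examined only now, when the caller asks.
        for (char c : rest)
            if (!is_space(c)) throw std::invalid_argument("parse: trailing junk");
        return value();
    }
    T strict_value() const {
        if (skipped || !rest.empty()) throw std::invalid_argument("parse: not strict");
        return value();
    }
    T value_or(T fallback) const { return parsed ? *parsed : fallback; }
};

// All parsing happens here; the accessors above only inspect the result.
template <class T>
parse_result<T> parse(std::string_view in) {
    parse_result<T> r;
    while (!in.empty() && is_space(in.front())) {
        in.remove_prefix(1);
        r.skipped = true;
    }
    T tmp{};
    auto [ptr, ec] = std::from_chars(in.data(), in.data() + in.size(), tmp);
    if (ec == std::errc{}) {
        r.parsed = tmp;
        in.remove_prefix(static_cast<std::size_t>(ptr - in.data()));
    }
    r.rest = in;
    return r;
}
```

Because the result keeps the remaining input, the accessors can be called
any number of times without re-reading the source. That sidesteps the
single-pass question for this string_view version, though not for true
input iterators.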

The Range concept we use here should be synchronized with what the range
working group are doing. Apart from the obvious begin() and end() functions
that are anyway required for range based for we only need:

- iterators must be copyable
- A Range must be constructible from two iterators (begin/end).

I feel pretty safe that this will be part of the range concept the working
group comes up with (although the copying part may be debatable with input
iterators). Of course, passing the range in by non-const reference solves
this problem elegantly, but all such "mutating" suggestions seem to be hard
to get approved here. For me it would be the easiest route to take: less
copying of ranges, easier to use, fewer requirements on the Range concept.
(The returned value will have another flag indicating whether the range
ended, but the late trailing space check is of course hard to implement.)

Magnus:

When it comes to locales, how important is this for parsing numbers? Can't
we rely on narrow chars being ASCII and wide chars being Unicode, i.e.
demand that multibyte data is either converted to Unicode strings before
parsing or that the range's iterators do the conversion on the fly, in a
lazy style? For other types this may be more important, but I still think
that the locale is more of a thing for the stream than for the conversion
function.

We will of course need some adaptor so that cin and similar can be used as
the source range. Maybe an overloaded function is needed, as an istream does
not comply with the Range concept and we don't want to have to write
from_string(istream_range(cin)) at every call, do we? BTW: This shows that
a new bytestream type should be Range concept compatible!

On Tuesday, 4 February 2014 at 08:48:14 UTC+1, Magnus Fromreide wrote:
>
> On Mon, Feb 03, 2014 at 09:48:45PM +0100, Miro Knejp wrote:
> >
> > As soon as locales are involved all three should automatically recognize
> > the correct grouping, separating and decimal characters.
>
> I think the use of locales should be optional and - importantly - follow
> the
> if you don't use it you don't pay for it rule.
>
> /MF
>

------=_Part_513_16195136.1391528022533--

.

Author: Magnus Fromreide <magfr@lysator.liu.se>
Date: Wed, 5 Feb 2014 00:46:57 +0100 Raw View

On Tue, Feb 04, 2014 at 01:39:54PM -0500, Matthew Woehlke wrote:
> On 2014-02-03 17:35, gmisocpp@gmail.com wrote:
>
> This seems like overkill. Either the text is well-formed or it
> isn't. I'd say that should be the first check. (Probably considering
> "-1" for unsigned as 'not well formed'.)
>
> If the text is well-formed, then and only then would I get into
> other reasons the parse might have failed, e.g. because the value
> would overflow.
>
> This corresponds loosely with the output iterator and whether or not
> all possible text was consumed.
>
> Users that really care about the empty input case can check that
> themselves easily enough.
>
> >enum conversion_options // bit mask, probably can be simplified, but
> >conveys the idea
> >{
> >     allow_value,
> >     allow_value_and_something,
> >     allow_failed_value,
> >     allow_failed_value_and_something,
> >     allow_errors,
> >     allow_nothing
> >};
>
> While I like the idea, I don't think this set of options are all
> that useful. Instead I would suggest:

I dislike this idea - I would prefer to separate each responsibility in
order to make each function easier to understand and combine as building
blocks.

end_of_input(number(args));
number(args);
end_of_input(optional(number(args)));
optional(number(args));
/* not clear what allow_errors implies */
end_of_input(args);

This is incidentally pretty close to the design used in Boost.Spirit, save that
they overload operators to change the appearance of the method calls and build
a parse tree from them.
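
A minimal sketch of how such building blocks could compose, assuming C++17. The parse_result type, the trailing underscore on optional_, and the choice to pass results (rather than parsers) between the blocks are all assumptions made for illustration; none of them come from the thread.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical result type: success flag, value, and the unconsumed input.
template <typename T>
struct parse_result {
    bool ok;
    T value;
    std::string_view rest;
};

// number: parses decimal digits and nothing else (no sign, no whitespace).
inline parse_result<int> number(std::string_view in) {
    std::size_t i = 0;
    int v = 0;
    while (i < in.size() && std::isdigit(static_cast<unsigned char>(in[i]))) {
        v = v * 10 + (in[i] - '0');  // overflow checking omitted for brevity
        ++i;
    }
    return {i > 0, v, in.substr(i)};  // no digits: failure, nothing consumed
}

// optional_: a failed parse becomes a successful parse of "no value".
// (The trailing underscore only avoids confusion with std::optional.)
template <typename T>
parse_result<std::optional<T>> optional_(parse_result<T> r) {
    std::optional<T> v;
    if (r.ok) v = r.value;
    return {true, v, r.rest};
}

// end_of_input: additionally require that all input was consumed.
template <typename T>
parse_result<T> end_of_input(parse_result<T> r) {
    r.ok = r.ok && r.rest.empty();
    return r;
}
```

With these, end_of_input(number("42x")) fails while number("42x") succeeds and leaves "x" in rest, and end_of_input(optional_(number(""))) succeeds with an empty value. Each block has a single responsibility, so no option enum is needed.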


.

Author: Miro Knejp <miro@knejp.de>
Date: Wed, 05 Feb 2014 01:07:11 +0100 Raw View

On 04.02.2014 16:33, Bengt Gustafsson wrote:
> Miro:
>
> Yes, leading whitespace is always consumed in parse. If you don't
> allow this you lose some performance as you actually convert the
> number when you could know that it was an error as soon as you saw the
> first space. I don't think this is a big problem. The idea is to skip
> the space and set a flag in the return value if there was some space
> to skip. The strict_value() function checks this flag and throws if it
> was set.
Well, what if my use case doesn't allow leading whitespace? Wasn't that
one of the initial concerns that started this whole thing? If it's a
single pass input iterator this is a big deal and the parser must not
consume any invalid characters I didn't tell it to as they are forever
lost to the caller. What if it consumes the whitespace and then
realizes it is not followed by a number? The actual whitespace
content is lost. What if I needed this information to increment a line
counter after parsing the number? Plus, expected behavior should not
differ depending on the iterator category. I further fail to see how
that has any impact on performance. If the first character is not a
valid part of the number then parsing immediately fails. Where's the
lost performance? If the user wants to skip whitespace that operation
can be prepended or composed on top of parse(). Adding the implicit
whitespace skipping only limits its range of applicability. I don't
think parse() should attempt to be too clever.
>
> However, trailing space can not be skipped until we ask for it. I
> don't think this is a big deal as we already have the remaining range
> in the result object.
>
> The Range concept we use here should be synchronized with what the
> range working group are doing. Apart from the obvious begin() and
> end() functions that are anyway required for range based for we only need:
>
> - iterators must be copyable
> - A Range must be constructible from two iterators (begin/end).
>
> I feel pretty safe that this will be part of the range concept the
> working group comes up with (although the copying part may be
> debatable with input iterators). Of course passing the range in by
> non-const reference solves this problem elegantly but all such
> "mutating" suggestions seem to be hard to get approval of here. For me
> it would be the easiest route to take: less copying of ranges, easier
> to use, fewer requirements on the Range concept. (The returned value
> will have another flag value indicating if the range ended, but the
> late trailing space check is of course hard to implement).
So far all standard iterators are copyable. The only concern is too many
increments on input iterators as copies, but that's an implementation
topic. I think this whole range talk isn't really helpful or getting us
anywhere. How the methods get their input is the least of my concerns to
be honest. Where to place the result, error and next_iter (or next
subrange for that matter) is the biggest disagreement we have.
>
> Magnus:
>
> When it comes to locales, how important is this for parsing numbers?
> Can't we rely on narrow chars being ASCII and wide chars being
> Unicode, i.e. demand that multibyte data is either converted to
> Unicode strings before parsing or that the range's iterators do the
> conversion on the fly, in a lazy style. For other types this may be
> more important, but I still think that the locale is more of a thing
> for the stream than the conversion function.
It's not just about codecvt. Take 1,000.00 versus 1.000,00 versus
1'000.00. Or stuff like ௧௨௩௪. If it can't correctly interpret multibyte
numerals in a UTF-8 string (given an appropriate locale) it's not of
much use in an international environment.

For this I was thinking along the lines of:
parse(...) <- use current global locale (or from an istream's locale if
that's passed as source).
parse(..., locale) <- use provided locale object
parse(..., no_locale) <- no_locale_t tag type to do it with the "C"
locale but without actually using the (virtual!) methods of facets but
an optimized language neutral fast-path version instead
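
The three overloads above could be sketched roughly as follows, assuming C++17. Only the names parse, no_locale and no_locale_t come from the message; the std::optional<long> return type, the string_view input and the dot_grouping demo facet are assumptions, and the locale-aware path is deliberately simplified (it skips the thousands separator without validating group positions and only handles ASCII digits).

```cpp
#include <locale>
#include <optional>
#include <string>
#include <string_view>

// Tag type as named in the message; everything else here is an assumption.
struct no_locale_t { explicit no_locale_t() = default; };
inline constexpr no_locale_t no_locale{};

// parse(..., no_locale): ASCII digits only, no facet lookups, no virtual calls.
inline std::optional<long> parse(std::string_view s, no_locale_t) {
    if (s.empty()) return std::nullopt;
    long v = 0;
    for (char c : s) {
        if (c < '0' || c > '9') return std::nullopt;
        v = v * 10 + (c - '0');  // overflow checking omitted for brevity
    }
    return v;
}

// parse(..., locale): skips the locale's thousands separator if it groups digits.
inline std::optional<long> parse(std::string_view s, const std::locale& loc) {
    const auto& punct = std::use_facet<std::numpunct<char>>(loc);
    const bool grouped = !punct.grouping().empty();
    const char sep = punct.thousands_sep();
    std::string digits;
    for (char c : s) {
        if (grouped && c == sep) continue;  // drop separators, keep the rest
        digits.push_back(c);
    }
    return parse(digits, no_locale);
}

// parse(...): falls back to the current global locale.
inline std::optional<long> parse(std::string_view s) {
    return parse(s, std::locale());
}

// Hypothetical facet for a demo: '.' as thousands separator, groups of three.
struct dot_grouping : std::numpunct<char> {
protected:
    char do_thousands_sep() const override { return '.'; }
    std::string do_grouping() const override { return "\3"; }
};
```

With a locale built as std::locale(std::locale::classic(), new dot_grouping), the input "1.000" parses as 1000, while the same input is rejected by the "C" locale and by the no_locale fast path. Callers of the tag overload never touch a facet.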



.

Author: Nevin Liber <nevin@eviloverlord.com>
Date: Tue, 4 Feb 2014 20:27:44 -0600 Raw View

--001a11c266c4d8da4904f19f8294
Content-Type: text/plain; charset=ISO-8859-1

On 4 February 2014 19:41, Matthew Woehlke <mw_triad@users.sourceforge.net> wrote:

> On 2014-02-04 19:55, Paul Tessier wrote:
>
>> int parse<T, U>(range<U> r, T& value, locale loc = default_locale)
>>> {
>>>     auto const result = parse<T>(r, loc);
>>>     if (result) value = *result;
>>>     return // um... int? from whence do I get an int?
>>> }
>>>
>>
>> Except that with an out parameter no copies need be made, which depending
>> on cost of copying said type, this may be a bottle neck.  Your version of
>> an out parameter composed of a value returning version forces a copy
>> regardless of the need for one.
>>
>
> Where?
>
> A "good" implementation would emplace in the return value.


To be fair, if you pass in the parameter and the type has an internal heap
allocation, it can reuse the space that was allocated.  Of course, if it
has an internal heap allocation, parse is unlikely to be the bottleneck,
and assignment is probably expensive as well.

Plus,  it is a horrible, horrible interface.

Let's say you wanted to use the result in a member initializer list.  If
so, you end up having to write something which returns a value, as in:

struct Foo
{
    template<typename U>
    explicit Foo(Range<U> r)
    : theNumber{[&]{ BigInt bi; parse( r, bi ); return bi; }()}
    {}

    BigInt theNumber;
};

Ugh.  This is C++, not C.  You can do better.
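
For comparison, a sketch of the same constructor against a value-returning interface, assuming C++17. BigInt here is a hypothetical stand-in, parse_big_int is an assumed name, and string_view replaces the Range<U> parameter purely to keep the example self-contained.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical stand-in for an arbitrary-precision integer.
struct BigInt {
    std::string digits;
};

// Assumed value-returning interface: empty optional on failure.
inline std::optional<BigInt> parse_big_int(std::string_view s) {
    if (s.empty()) return std::nullopt;
    for (char c : s)
        if (c < '0' || c > '9') return std::nullopt;
    return BigInt{std::string(s)};
}

struct Foo {
    // No lambda and no default-constructed temporary needed;
    // value() throws std::bad_optional_access if the parse failed.
    explicit Foo(std::string_view s) : theNumber(parse_big_int(s).value()) {}

    BigInt theNumber;
};
```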
--
 Nevin ":-)" Liber  <mailto:nevin@eviloverlord.com>  (847) 691-1404


--001a11c266c4d8da4904f19f8294--

.

Author: Nevin Liber <nevin@eviloverlord.com>
Date: Tue, 4 Feb 2014 20:44:43 -0600 Raw View

--089e013d0f48973f5704f19fbfae
Content-Type: text/plain; charset=ISO-8859-1

On 4 February 2014 20:16, Paul Tessier <phernost@gmail.com> wrote:

> Assume that big_int requires the heap to allow for very big int's, say 10
> to 2000 digits, a value returning version has no way to avoid allocating at
> each parse, regardless of move-assignment or RVO.
>

If you are *that* concerned about performance, it is extremely unlikely
you'd be willing to pay the runtime cost for locale support, either.


> A parameter out version can reuse the same big_int and therefore
> potentially avoid the cost of new allocations at each parse.
>
> It is always possible to take *any* snippet of code and replace it with a
> function that takes in and out parameters,
>

Not always.  Sometimes it is impossible to construct the object without
knowing all the parameters first, and they can't necessarily be faked.


> the reverse cannot be said for value returning functions.
>

The problem is that people have to keep writing those functions to make up
for bad interfaces.

--
 Nevin ":-)" Liber  <mailto:nevin@eviloverlord.com>  (847) 691-1404


--089e013d0f48973f5704f19fbfae--

.

Author: Paul Tessier <phernost@gmail.com>
Date: Tue, 4 Feb 2014 19:21:55 -0800 (PST) Raw View

------=_Part_6718_24313720.1391570515121
Content-Type: text/plain; charset=UTF-8



On Tuesday, February 4, 2014 9:44:43 PM UTC-5, Nevin ":-)" Liber wrote:
>
> On 4 February 2014 20:16, Paul Tessier <pher...@gmail.com <javascript:>>wrote:
>
>> Assume that big_int requires the heap to allow for very big int's, say 10
>> to 2000 digits, a value returning version has no way to avoid allocating at
>> each parse, regardless of move-assignment or RVO.
>>
>
> If you are *that* concerned about performance, it is extremely unlikely
> you'd be willing to pay the runtime cost for locale support, either.
>
>
>> A parameter out version can reuse the same big_int and therefore
>> potentially avoid the cost of new allocations at each parse.
>>
>> It is always possible to take *any* snippet of code and replace it with
>> a function that takes in and out parameters,
>>
>
> Not always.  Sometimes it is impossible to construct the object without
> knowing all the parameters first, and they can't necessarily be faked.
>
>
>> the reverse cannot be said for value returning functions.
>>
>
> The problem is that people have to keep writing those functions to make up
> for bad interfaces.
>
> --
>  Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com <javascript:>>  (847)
> 691-1404
>

It is provable that *any* snippet of code can be replaced with a function
that takes in and out parameters.  The number of parameters is equal to the
number of variables in use, otherwise [&]{//some code} would not compile.
There is nothing in the language that is the equivalent for such code cutting.
As such, since parse has such contentious views about its interface, the
simplest solution is to cut out the part that does the actual work, a
parameter out function, and let everyone else do as they see fit.  Nothing
is gained by ignoring opportunities, and a parameter out version allows
*all* opportunities; the reverse cannot be said for the other versions in
*all* cases.  I would not propose that parameter out functions always be
used, but in this case it solves the current conflicting viewpoints by
allowing all implementations with the absolute minimum cost.

I agree that in average coding, the return by value is much easier to read
and reason about, and should be preferred where its use is reasonable.  I'm
all for including a return by value version.  It's just that it is a simple
3 line template that uses the parameter out version.  The reverse is not
possible, keeping in mind *all* side effects.

template <typename T, typename R>
T parse_value( R r /*, other stuff */ ) {
  T rvo; // default constructed
  parse( r, rvo /*, other stuff */ ); // no throw
  return rvo;
}

I'm all for including a parameter out "parse" and a return value
"parse_value" like the above, as this seems to solve the most basic and
most complex use cases.  An optional<T> version is easily constructed in a
similar fashion, as is a throwing version.
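
A sketch of that layering, assuming C++17. The big_int type, the bool return of the core function, and the names parse_optional and parse_or_throw are assumptions for illustration; only parse and parse_value are named above.

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical heap-backed integer; the string stands in for its digit storage.
struct big_int {
    std::string digits;
};

// Core: out-parameter, reports success, does not throw on bad input.
// dest is left untouched on failure; on success its existing buffer
// can typically be reused instead of allocating a new one.
inline bool parse(std::string_view s, big_int& dest) {
    if (s.empty()) return false;
    for (char c : s)
        if (c < '0' || c > '9') return false;
    dest.digits.assign(s.data(), s.size());
    return true;
}

// Return-by-value wrapper, as in the message: default-construct, fill, return.
template <typename T>
T parse_value(std::string_view s) {
    T rvo{};
    parse(s, rvo);
    return rvo;
}

// optional<T> wrapper built the same way.
template <typename T>
std::optional<T> parse_optional(std::string_view s) {
    T tmp{};
    if (!parse(s, tmp)) return std::nullopt;
    return tmp;
}

// Throwing wrapper built the same way.
template <typename T>
T parse_or_throw(std::string_view s) {
    T tmp{};
    if (!parse(s, tmp)) throw std::invalid_argument("parse failed");
    return tmp;
}
```

A loop that parses many values can keep one big_int alive and pass it to the core function each time, while casual callers use whichever wrapper suits them.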


------=_Part_6718_24313720.1391570515121--

.

Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Tue, 4 Feb 2014 22:02:15 -0800 (PST) Raw View

------=_Part_3309_10056467.1391580135384
Content-Type: text/plain; charset=UTF-8

Wow, a lot of activity for this thread. After reading through, here are
some thoughts I had.

Whitespace handling:

I said it before but I will reiterate: I think it is a huge mistake
to include whitespace handling, even as an optional flag.

Whitespace handling is orthogonal to number parsing and should be
completely removed from this interface. Maybe the user only wants to parse
spaces but not tabs, maybe they want locale dependent whitespace checking
and maybe they don't? Why not let them use a modern whitespace handling API
(different thread) to remove prefixes and suffixes while we focus just on
parsing the numbers? All of these parsing routines should be simple and
composable. Also, adding more flags makes the interface more complicated;
let's keep it simple, exposing only the bare minimum of options required.

If we add whitespace handling to this API it will be half baked. That is,
it will have to make a lot of assumptions about how the user wants to
handle whitespace (see the variants I just mentioned). Or if we account for
all of the possibilities of whitespace handling with flags, now we have a
whitespace API and a number parsing API mashed together, which is even
worse.
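
As a rough illustration of keeping the two concerns separate, here is a sketch assuming C++17 and std::from_chars (which did not exist when this was written and is used only as a convenient strict backend). The names skip_spaces, skip_spaces_and_tabs and parse_int are hypothetical.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Whitespace policies live outside the number parser; the caller picks one.
inline std::string_view skip_spaces(std::string_view s) {
    while (!s.empty() && s.front() == ' ') s.remove_prefix(1);
    return s;
}

inline std::string_view skip_spaces_and_tabs(std::string_view s) {
    while (!s.empty() && (s.front() == ' ' || s.front() == '\t')) s.remove_prefix(1);
    return s;
}

// Strict number parser: no whitespace handling at all, whole input must match.
inline std::optional<int> parse_int(std::string_view s) {
    int v = 0;
    const char* last = s.data() + s.size();
    auto [ptr, ec] = std::from_chars(s.data(), last, v);
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    return v;
}
```

parse_int(" 42") fails, parse_int(skip_spaces(" 42")) yields 42, and a caller who also tolerates tabs composes with skip_spaces_and_tabs instead. No whitespace flag ever reaches the parser.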

Out parameters vs return:

After reading Matthew's posts (the other Matthew, not me!), I like his
ideas better. The below is very elegant.

auto result = parse<int>("1234");
if(!result) {
  //handle error
  //Can do switch(result.error()) if we want
}
use(result.value());

We could make value() throw on error like std::expected, giving us a C
style return code interface and C++ style exception interface all in one.
If the user explicitly checks the error status before calling value(), it's
easy for the compiler to optimize out the conditional and the throwing
logic entirely, making this an efficient exceptionless interface. As an
added bonus, if the caller is a noexcept function, calling value() on a
failed result without checking for an error will result in a call to
std::terminate(), giving us a runtime check that we checked the error
status. We don't need extra overloads for exceptions. value_or() also
provides for users who want the
defaulting behavior, without yet another overload.
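
A minimal sketch of such a result type, assuming C++17. The parse_result class, the use of std::errc as the error type, the std::runtime_error thrown by value(), and std::from_chars as the backend are all assumptions; the thread only names value(), error(), next() and value_or().

```cpp
#include <charconv>
#include <cstddef>
#include <stdexcept>
#include <string_view>
#include <system_error>

// Hypothetical result type carrying value, error and the unparsed tail.
template <typename T>
class parse_result {
public:
    parse_result(T v, std::errc e, std::string_view rest)
        : value_(v), error_(e), rest_(rest) {}

    explicit operator bool() const { return error_ == std::errc{}; }
    std::errc error() const { return error_; }
    std::string_view next() const { return rest_; }

    // Throws if the parse failed, so unchecked use cannot silently succeed.
    const T& value() const {
        if (!*this) throw std::runtime_error("parse failed");
        return value_;
    }

    // Defaulting behavior without a separate overload.
    T value_or(T fallback) const { return *this ? value_ : fallback; }

private:
    T value_;
    std::errc error_;
    std::string_view rest_;
};

template <typename T>
parse_result<T> parse(std::string_view s) {
    T v{};
    const char* first = s.data();
    const char* last = first + s.size();
    auto [ptr, ec] = std::from_chars(first, last, v);
    return {v, ec, std::string_view(ptr, static_cast<std::size_t>(last - ptr))};
}
```

With this, parse<int>("12ab") succeeds with value 12 and next() == "ab", while parse<int>("abc") converts to false, reports std::errc::invalid_argument from error(), and throws from value().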

So far the most convincing argument I've seen for the out parameter
interface is the std::getline() example where you're parsing an expensive
to copy type like big_int. Perhaps one compromise to this could be an
overload:

//Ignoring iterator versions for brevity
template <typename T>
return_type parse(string_view s);
template <typename T>
return_ref_type parse(string_view s, T& t);

The second one here allows you to provide the object in which to store
the result. The return type return_ref_type is the same as return_type
with value(), error(), next(), etc., except that value() just returns a
reference to the t passed into the function instead of containing the
value. Alternatively the returned object could omit value() and just have
next() and error().

While the output has been the biggest question (even spawning off the error
code thread), I think most people prefer returning over out parameters. If
we return, we must return all 3 bits of information (value, error, string
tail) in an elegant way.

Ranges:

There was some talk about ranges. I think for now we can stick to
string_view, iterator pairs, and maybe const char* for supporting null
terminated strings. Once ranges come into the standard I don't think it's
too difficult to just add a range overload.

Locales:

Locales absolutely must be optional. Parsing routines such as this can
easily become performance hotspots in applications and must be as fast as
possible. Locales mean indirect function calls (virtuals) which can quickly
ruin performance.

Configuring the Parse (the next big question?):

Ignoring whitespace, there is still a possibility for a lot of configuration
with how the user may want to do the parse:

   - What radix do I use?
   - Do I allow 0x and 0 (octal) prefixes?
   - Do I allow + prefix?
   - Locales?
   - Commas or just digits?

First, I think we would need to decide on a complete set of options. The
next big question is what is the best interface which will allow the user
to specify them easily and what are the most sensible defaults?
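
One possible shape for this, sketched under the assumption of C++17 and std::from_chars as the backend: a plain options struct with conservative defaults. The parse_options name and its fields are hypothetical, and only a subset of the list is covered (radix, + prefix, 0x prefix; no octal prefix, locales or digit grouping).

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical option set; the defaults are the strictest choice for each.
struct parse_options {
    int radix = 10;                 // base used when no prefix applies
    bool allow_plus = false;        // accept a leading '+'
    bool allow_hex_prefix = false;  // accept "0x"/"0X" and switch to base 16
};

inline std::optional<long> parse(std::string_view s, parse_options opt = {}) {
    bool prefixed = false;
    if (opt.allow_plus && !s.empty() && s.front() == '+') {
        s.remove_prefix(1);
        prefixed = true;
    }
    int radix = opt.radix;
    if (opt.allow_hex_prefix && s.size() > 2 && s[0] == '0' &&
        (s[1] == 'x' || s[1] == 'X')) {
        radix = 16;
        s.remove_prefix(2);
        prefixed = true;
    }
    // A '-' is only valid as the very first character of the input.
    if (prefixed && !s.empty() && s.front() == '-') return std::nullopt;

    long v = 0;
    const char* last = s.data() + s.size();
    auto [ptr, ec] = std::from_chars(s.data(), last, v, radix);
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    return v;
}
```

A struct keeps call sites readable (set only the fields that differ from the defaults), but it is only one candidate; tag arguments or separate named functions would be other ways to expose the same choices.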



------=_Part_3309_10056467.1391580135384
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><div><div><span style=3D"font-size: 13px;">Wow, a lot of a=
ctivity for this thread. After reading through, here are some thoughts I ha=
d.</span><br></div><div><br></div><div>Whitespace handling:</div><div><br>I=
 said it before but I will reiterate again. I think it is a huge mistake to=
 include whitespace handling, even as a optional flag.</div><div><br></div>=
</div><div>Whitespace handling is orthogonal to number parsing and should b=
e completely removed from this interface. Maybe the user only wants to pars=
e spaces but not tabs, maybe they want locale dependent whitespace checking=
 and maybe they don't? Why not let them use a modern whitespace handling AP=
I (different thread) to remove prefixes and suffixes and we focus just on p=
arsing the numbers. All of these parsing routines should be simple and comp=
osable. Also adding more flags makes the interface more complicated, lets k=
eep it simple, exposing only the bare minimum of options required.&nbsp;</d=
iv><div><br></div><div>If we add whitespace handling to this API it will be=
 half baked. That is, it will have to make a lot of assumptions about how t=
he user wants to handle whitespace (see the variants I just mentioned). Or =
if we account for all of the possibilities of whitespace handling with flag=
s, now we have a whitespace API and a number parsing API mashed together, w=
hich is even worse.<br></div><div><br></div><div>Out parameters vs return:<=
/div><div><br></div><div>After reading Matthew's posts (the other matthew, =
not me!). I like his ideas better. The below is very elegant.</div><div><br=
></div><div><div>auto result =3D parse&lt;int&gt;("1234");</div><div>if(!re=
sult) {<br>&nbsp; //handle error</div><div>&nbsp; //Can do switch(result.er=
ror()) if we want</div><div>}</div><div>use(result.value());</div></div><di=
v><br></div><div><div>We could make value() throw on error like std::expect=
ed, giving us a C style return code interface and C++ style exception inter=
face all in one. If the user explicitly checks the error status before call=
ing value(), its easy for the compiler to optimize out the conditional and =
the throwing logic entirely, making this an efficient exceptionless interfa=
ce. As an added bonus if the caller is a noexcept function, calling value()=
 without checking for an error will result in a call to std::terminate(), g=
iving us a runtime check that we checked the error status. We don't need ex=
tra overloads for exceptions. value_or() also provides for users who want t=
he defaulting behavior, without yet another overload.</div></div><div><br><=
/div><div>So far the most convincing argument I've seen for the out paramet=
er interface is the std::getline() example where you're parsing an expensiv=
e to copy type like big_int. Perhaps one compromise to this could be an ove=
rload:</div><div><br></div><div>//Ignoring iterator versions for brevity</d=
iv><div>template &lt;typename T&gt;</div><div>return_type parse&lt;T&gt;(st=
ring_view s);</div><div>template &lt;typename T&gt;</div><div>return_ref_ty=
pe parse&lt;T&gt;(string_view s, T&amp; t);</div><div><br></div><div>The se=
cond one here allows you to provide the object with which to store the resu=
lt. The return type of return_ref_type is the same as return_type with valu=
e(), error(), next() etc.. except that value() just returns a reference to =
the t passed into the function instead of containing the value. Alternative=
ly the returned object could omit value() and just have next() and error().=
</div><div><br></div><div>While the output has been the biggest question (e=
ven spawning off the error code thread), I think most people prefer returni=
ng over out parameters. If we return, we must return all 3 bits of informat=
ion (value, error, string tail) in an elegant way.</div><div><br></div><div=
>Ranges:</div><div><br></div><div>There was some talk about ranges. I think=
 for now we can stick to string_view, iterator pairs, and maybe const char*=
 for supporting null terminated strings. Once ranges come into the standard=
 I don't think its too difficult to just add a range overload.</div><div><b=
r></div><div>Locales:</div><div><br></div><div>Locales absolutely must be o=
ptional. Parsing routines such as this can easily become performance hotspo=
ts in applications and must be as fast as possible. Locales mean indirect f=
unction calls (virtuals) which can quickly ruin performance.</div><div><br>=
</div><div>Configuring the Parse (the next big question?):</div><div><br>Ig=
noring whitespace, there is still a possibility for a lot configuration wit=
h how the user may want to do the parse:</div><div><ul><li><span style=3D"l=
ine-height: normal;">What radix do I use?</span></li><li><span style=3D"lin=
e-height: normal;">Do I allow 0x and 0 (octal) prefixes?</span></li><li><sp=
an style=3D"line-height: normal;">Do I allow + prefix?</span></li><li><span=
 style=3D"line-height: normal;">Locales?</span></li><li><span style=3D"line=
-height: normal;">Commas or just digits?</span></li></ul><div>First, I thin=
k we would need to decide on a complete set of options. The next big questi=
on is what is the best interface which will allow the user to specify them =
easily and what are the most sensible defaults?</div></div><div><br></div><=
div><br></div></div>


------=_Part_3309_10056467.1391580135384--

.

Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Wed, 5 Feb 2014 09:32:19 +0100 Raw View

On Wed, Feb 5, 2014 at 3:44 AM, Nevin Liber <nevin@eviloverlord.com> wrote:
> On 4 February 2014 20:16, Paul Tessier <phernost@gmail.com> wrote:
>>
>> Assume that big_int requires the heap to allow for very big int's, say 10
>> to 2000 digits, a value returning version has no way to avoid allocating at
>> each parse, regardless of move-assignment or RVO.
>
>
> If you are *that* concerned about performance, it is extremely unlikely
> you'd be willing to pay the runtime cost for locale support, either.

True, some use cases probably require a version without locale support.

>> the reverse cannot be said for value returning functions.
>
>
> The problem is that people have to keep writing those functions to make up
> for bad interfaces.

Only if one doesn't provide both interfaces in the library.
In some use cases it's exactly the right interface and it seems like a
proper and simple foundation for other interfaces.
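
As a rough illustration of the "foundation" point, here is a minimal sketch in which a value-returning function is a thin layer over an out-parameter primitive. The names parse and parse_int are placeholders, not proposed names, and std::from_chars (C++17) is used only as a convenient locale-free digit parser; it postdates this thread.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical out-parameter primitive: returns an error code and leaves
// dest untouched unless the whole input is a valid decimal int.
inline std::errc parse(int& dest, std::string_view src) {
    const char* first = src.data();
    const char* last = first + src.size();
    int tmp = 0;
    auto [ptr, ec] = std::from_chars(first, last, tmp);
    if (ec != std::errc{}) return ec;
    if (ptr != last) return std::errc::invalid_argument;  // trailing characters
    dest = tmp;
    return std::errc{};
}

// Value-returning convenience built on top of the primitive above.
inline std::optional<int> parse_int(std::string_view src) {
    int value = 0;
    if (parse(value, src) != std::errc{}) return std::nullopt;
    return value;
}
```

For a cheap type like int the wrapper costs nothing; the disagreement in the thread is about types where the extra object is expensive.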

--
Olaf


.

Author: Miro Knejp <miro@knejp.de>
Date: Wed, 05 Feb 2014 13:34:04 +0100 Raw View

This is a multi-part message in MIME format.
--------------020701020404080401030701
Content-Type: text/plain; charset=UTF-8; format=flowed


> Do you mean you actually have a real example of an iterator that
> cannot be dereferenced more than once? (What on earth would create
> such a thing?)
That's not what I described. If I pass a single-pass InputIterator, for
example istreambuf_iterator, to parse() and it chews away N whitespace
characters and then fails to recognize a number, any information about
that whitespace is gone, because I cannot go back and re-iterate the range.
>
>> The actual whitespace
>> content is lost. What if I needed this information to increment a line
>> counter after parsing the number?
>
> ...then don't tell the parser to eat whitespace. (Note: the parser
> *must* have a strict mode... so I agree with you there. A mode that
> eats everything possible and then tells you how far it got may also be
> required. Anything else is probably in the 'nice to have' category.)
>
And I am on the same track as Matthew F. in that parse() should have one
responsibility and one only: convert the textual representation of a
value to a value, and nothing else. Skipping anything before the value
is an entirely separate matter and has nothing to do with processing the
actual value. Do it before invoking parse. That is separation of
concerns. Let's focus on parsing the actual value, not the noise around it.
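
A minimal sketch of that separation, assuming two hypothetical helpers: skip_blanks (which here treats only space and tab as whitespace, one arbitrary policy among many) and parse_strict, which expects the number to start at the first character. std::from_chars (C++17) stands in for the digit parser because it never skips leading whitespace.

```cpp
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical helper: drops leading spaces and tabs and reports how many
// characters were removed, so the caller keeps that information.
inline std::size_t skip_blanks(std::string_view& s) {
    std::size_t n = 0;
    while (n < s.size() && (s[n] == ' ' || s[n] == '\t')) ++n;
    s.remove_prefix(n);
    return n;
}

// Strict parser: the number must start at s[0]. On success it advances s
// past the consumed digits; on failure s and out are left unchanged.
inline bool parse_strict(std::string_view& s, int& out) {
    const char* first = s.data();
    const char* last = first + s.size();
    auto [ptr, ec] = std::from_chars(first, last, out);
    if (ec != std::errc{}) return false;
    s.remove_prefix(static_cast<std::size_t>(ptr - first));
    return true;
}
```

The caller decides whether to call skip_blanks at all, and the count it returns is still available after a failed parse.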

> Except that with an out parameter no copies need be made, which
> depending on cost of copying said type, this may be a bottle neck.
> Your version of an out parameter composed of a value returning version
> forces a copy regardless of the need for one.
What happens with the out parameter when parsing fails? Is it in an
undefined state? Or left unmodified? If the latter then parse() had to
create a temporary and the entire allocation prevention and no-copy
argument is down the drain.

I would prefer the value to be in a well defined state when parsing
fails. Incrementing or accessing an iterator may throw an exception and
if the value is then left in a partial state that's really bad. It also
defeats the purpose of using it as a default fallback value. I'd rather
have a strong exception guarantee for the value I'm passing into the
function.

Without optional<>, expected<> or return codes, a failed parse might as
well be indicated by throwing an exception (as is the case in many
languages and frameworks), so in my opinion a failed parse should give
the same guarantees as if an exception were thrown.

> Ignoring whitespace, there is still a possibility for a lot
> configuration with how the user may want to do the parse:
>
>   * What radix do I use?
>   * Do I allow 0x and 0 (octal) prefixes?
>   * Do I allow + prefix?
>   * Locales?
>   * Commas or just digits?
>
> First, I think we would need to decide on a complete set of options.
> The next big question is what is the best interface which will allow
> the user to specify them easily and what are the most sensible defaults?
I'm not a fan of specifying gazillions of options, hence I proposed
separate overloads for each task and "utility" methods to detect each
part separately (like parse_radix_prefix), to allow composition of the
various parsers and make them available to users other than std
implementers. Options should only exist where the choices are mutually
exclusive: parsing radix 10 versus radix 11 cannot both be done at
the same time, so the radix should be specified as an argument. But
anything that is optional (signs, prefixes, locales) can be provided as
separate overloads or methods for composition.
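
To make the composition idea concrete, here is a minimal sketch. Only the name parse_radix_prefix comes from this message; its signature, the parse_unsigned helper and the use of std::from_chars (C++17) are assumptions made for illustration.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>

// Hypothetical utility: consumes a leading "0x"/"0X" if present and
// returns the radix the remaining digits should be read in.
inline int parse_radix_prefix(std::string_view& s) {
    if (s.size() >= 2 && s[0] == '0' && (s[1] == 'x' || s[1] == 'X')) {
        s.remove_prefix(2);
        return 16;
    }
    return 10;
}

// Core parser: the radix is a mutually exclusive choice, so it is an
// argument. It knows nothing about prefixes and requires that the whole
// input be digits; out is written only on success.
inline bool parse_unsigned(std::string_view s, unsigned& out, int radix = 10) {
    const char* first = s.data();
    const char* last = first + s.size();
    unsigned tmp = 0;
    auto [ptr, ec] = std::from_chars(first, last, tmp, radix);
    if (ec != std::errc{} || ptr != last) return false;
    out = tmp;
    return true;
}
```

A caller that wants prefix support calls parse_radix_prefix first and passes the result on; a caller that does not simply skips that step.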


--------------020701020404080401030701--


.

Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Wed, 5 Feb 2014 05:48:33 -0800 (PST) Raw View

------=_Part_3724_22120524.1391608113606
Content-Type: text/plain; charset=UTF-8



On Wednesday, February 5, 2014 7:34:04 AM UTC-5, Miro Knejp wrote:
>
> Except that with an out parameter no copies need be made, which depending
> on cost of copying said type, this may be a bottle neck.  Your version of
> an out parameter composed of a value returning version forces a copy
> regardless of the need for one.
>
> What happens with the out parameter when parsing fails? Is it in an
> undefined state? Or left unmodified? If the latter then parse() had to
> create a temporary and the entire allocation prevention and no-copy
> argument is down the drain.
>
> I would prefer the value to be in a well defined state when parsing fails.
> Incrementing or accessing an iterator may throw an exception and if the
> value is then left in a partial state that's really bad. It also defeats
> the purpose of using it as a default fallback value. I'd rather have a
> strong exception guarantee for the value I'm passing into the function.
>
>
For simple types like int, I prefer the convention of leaving the out
parameter unmodified on failure. That way you can initialize it yourself if
you want the default behavior. Also I've had situations where I'm using
parse to update some configuration variable. If the parse fails, I want it
to fall back on whatever the previous value was. The unmodified behavior
supports this use case very nicely. I don't have to cache the old
value in a temporary and reassign it on error.

That being said, for expensive types like big_int you're absolutely right
that this requires the creation of temporaries and destroys any performance
benefit of the out parameter design. To avoid the temporary, you have to parse
twice, once to check for correctness and again to actually load the value.

For arbitrary types, the question becomes what state you leave them
in. One option is an indeterminate but valid state. This might be the most
efficient but is a little more error prone. Another option is to create a
default constructed temporary and swap with it (or copy from it). This is
safer but may have performance implications if the default construction
does any real work. It also requires your type to be default constructible
and either copy/move assignable or swappable.

Probably the indeterminate state is best, as it offers the fewest
complications. For ints and floats, we can optimize by giving the
unmodified guarantee and all of the benefits it entails.
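
For comparison, here is a minimal sketch of the other end of the trade-off: parsing into a scratch object and committing with swap only on success, which leaves the destination untouched on failure at the cost of exactly the temporary discussed above. The big_int here is a toy stand-in that just stores decimal digits; it is not a real arbitrary-precision type.

```cpp
#include <string>
#include <string_view>
#include <utility>

// Toy stand-in for big_int: keeps the decimal digits in heap storage.
struct big_int {
    std::string digits;
};

// Parses into a scratch object and swaps it into dest only when the whole
// input is valid, so dest is never left half-written.
inline bool parse(big_int& dest, std::string_view src) {
    if (src.empty()) return false;
    big_int scratch;
    scratch.digits.reserve(src.size());  // the extra allocation in question
    for (char c : src) {
        if (c < '0' || c > '9') return false;  // dest is still untouched
        scratch.digits.push_back(c);
    }
    using std::swap;
    swap(dest.digits, scratch.digits);  // commit step, does not throw
    return true;
}
```

Writing straight into dest.digits would let dest's existing buffer be reused across calls, but then a failure partway through leaves dest in the "indeterminate but valid" state instead.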

Or each type has its own convention (most will probably use the "0"
representation).


> Since without optional<>, expected<> or return codes a failed parse might
> as well be indicated by throwing an exception (and that is the case in many
> languages and frameworks) a failed parse should have the same guarantees as
> if an exception were thrown in my opinion.
>
> Ignoring whitespace, there is still a possibility for a lot configuration
> with how the user may want to do the parse:
>
>    - What radix do I use?
>    - Do I allow 0x and 0 (octal) prefixes?
>    - Do I allow + prefix?
>    - Locales?
>    - Commas or just digits?
>
> First, I think we would need to decide on a complete set of options. The
> next big question is what is the best interface which will allow the user
> to specify them easily and what are the most sensible defaults?
>
> I'm not a fan of specifying gazillions of options hence I proposed
> separate overloads for each task and "utility" methods to detect each part
> separately (like parse_radix_prefix) to allow composition of the various
> parsers and make them available to users other than std implementers.
> Options should only exist where flags are mutually exclusive, like parsing
> radix 10 versus radix 11, both cannot be done at the same time and should
> therefore be specified as an argument. But anything that is optional
> (signs, prefixes, locales) can be provided as separate overloads or methods
> for composition
>
>
I'm not sure I like the option flags idea either, but I am currently at a
loss for the best way to handle the optionality. Maybe overloads, as you
say. I need to think about it more.


------=_Part_3724_22120524.1391608113606--

.

Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Wed, 5 Feb 2014 15:08:04 +0100 Raw View

On Wed, Feb 5, 2014 at 7:02 AM, Matthew Fioravante
<fmatthew5876@gmail.com> wrote:
> After reading Matthew's posts (the other matthew, not me!). I like his ideas
> better. The below is very elegant.
>
> auto result = parse<int>("1234");
> if(!result) {
>   //handle error
>   //Can do switch(result.error()) if we want
> }
> use(result.value());

int result;
if (auto err = parse(result, "1234")) {
  //handle error
  //Can do switch(err) if we want
}
use(result);

;)

> We could make value() throw on error like std::expected, giving us a C style
> return code interface and C++ style exception interface all in one. If the
> user explicitly checks the error status before calling value(), its easy for
> the compiler to optimize out the conditional and the throwing logic

Are you sure it's easy? Got a reference?

> What radix do I use?

Have a base/radix parameter defaulting to 10.

> Do I allow 0x and 0 (octal) prefixes?

I'd allow 0x (when radix = 0), I'd never allow 0.

> Do I allow + prefix?

Sure, why not?

> Locales?
> Commas or just digits?

Both handled by a locale-aware variant.
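
The answers above (radix parameter defaulting to 10, 0x accepted only when radix is 0, never octal, an optional + prefix) could be sketched roughly as follows. The function name parse_int and the std::optional return type are assumptions for illustration, and std::from_chars (C++17) does the digit conversion.

```cpp
#include <charconv>
#include <limits>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical parser: radix defaults to 10; radix 0 means "decimal, or
// hex if a 0x prefix is present". A leading 0 never selects octal.
inline std::optional<int> parse_int(std::string_view s, int radix = 10) {
    bool negative = false;
    if (!s.empty() && (s.front() == '+' || s.front() == '-')) {
        negative = (s.front() == '-');
        s.remove_prefix(1);
    }
    if (radix == 0) {
        radix = 10;
        if (s.size() >= 2 && s[0] == '0' && (s[1] == 'x' || s[1] == 'X')) {
            s.remove_prefix(2);
            radix = 16;
        }
    }
    if (radix < 2 || radix > 36) return std::nullopt;
    // The sign was handled above, so reject a second one here.
    if (s.empty() || s.front() == '-') return std::nullopt;
    const char* first = s.data();
    const char* last = first + s.size();
    long long magnitude = 0;  // wider type so INT_MIN's magnitude fits
    auto [ptr, ec] = std::from_chars(first, last, magnitude, radix);
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    const long long value = negative ? -magnitude : magnitude;
    if (value < std::numeric_limits<int>::min() ||
        value > std::numeric_limits<int>::max()) {
        return std::nullopt;
    }
    return static_cast<int>(value);
}
```

Locale-dependent digit grouping is deliberately left out; under the suggestion above it would live in a separate locale-aware variant.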

> First, I think we would need to decide on a complete set of options. The
> next big question is what is the best interface which will allow the user to
> specify them easily and what are the most sensible defaults?


.

Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Wed, 5 Feb 2014 06:45:58 -0800 (PST) Raw View

------=_Part_7187_12330892.1391611558481
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On Wednesday, February 5, 2014 9:08:04 AM UTC-5, Olaf van der Spek wrote:
>
> > We could make value() throw on error like std::expected, giving us a C style
> > return code interface and C++ style exception interface all in one. If the
> > user explicitly checks the error status before calling value(), its easy for
> > the compiler to optimize out the conditional and the throwing logic
>
> Are you sure it's easy? Got a reference?
>

http://gcc.godbolt.org/#%7B%22version%22%3A3%2C%22filterAsm%22%3A%7B%22labe=
ls%22%3Atrue%2C%22directives%22%3Atrue%2C%22commentOnly%22%3Atrue%7D%2C%22c=
ompilers%22%3A%5B%7B%22source%22%3A%22%23include%20%3Cstdexcept%3E%5Cn%5Cnc=
lass%20expected%20%7B%5Cn%20%20public%3A%5Cn%20%20int%20value()%20const%20%=
7B%20%5Cn%20%20%20%20if(_err)%20%7B%5Cn%20%20%20%20%20%20return%20_v%3B%5Cn=
%20%20%20%20%7D%5Cn%20%20%20%20throw%20std%3A%3Aexception()%3B%5Cn%20%20%7D=
%5Cn%20%20explicit%20operator%20bool()%20const%20%7B%20return%20_err%3B%20%=
7D%5Cn%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%5Cn%2=
0%20private%3A%5Cn%20%20int%20_v%3B%5Cn%20%20bool%20_err%3B%5Cn%20%20%20%20=
%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%5Cn%7D%3B%5Cn%5Cnexpect=
ed%20parse_int(const%20char*%20s)%3B%5Cn%5Cnint%20f1()%20%7B%5Cn%20%20auto%=
20x%20%3D%20parse_int(%5C%221234%5C%22)%3B%5Cn%20%20return%20x.value()%3B%5=
Cn%7D%5Cn%5Cnint%20f2()%20%7B%5Cn%20%20auto%20x%20%3D%20parse_int(%5C%22567=
%5C%22)%3B%5Cn%20%20if(!x)%20%7B%5Cn%20%20%20%20return%200%3B%5Cn%20%20%7D%=
5Cn%20%20return%20x.value()%3B%5Cn%7D%22%2C%22compiler%22%3A%22%2Fopt%2Fcla=
ng-3.3%2Fbin%2Fclang%2B%2B%22%2C%22options%22%3A%22-O3%20-march%3Dnative%20=
-std%3Dc%2B%2B11%22%7D%5D%7D

Here is the code in case that link does not work. Drop the following code
into http://gcc.godbolt.org, turn on optimizations and -std=c++11, and look
at the disassembly for f2(). No exception is created on any code path.

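#include <exception>
#include <stdexcept>

// Minimal stand-in for an expected-like result type.
class expected {
  public:
  // Returns the parsed value, or throws if no value is present.
  int value() const {
    if(_err) {
      return _v;
    }
    throw std::exception();
  }
  // True when a value is present.
  explicit operator bool() const { return _err; }

  private:
  int _v;
  // NOTE: despite its name, _err is true when a value is present.
  bool _err;
};

// Declared only, so the optimizer cannot see what it returns.
expected parse_int(const char* s);

// Unchecked access: value() keeps its throwing branch.
int f1() {
  auto x = parse_int("1234");
  return x.value();
}

// Checked access: the test inside value() repeats the one made here,
// so the compiler can drop the throwing branch.
int f2() {
  auto x = parse_int("567");
  if(!x) {
    return 0;
  }
  return x.value();
}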

>
> > What radix do I use?
>
> Have a base/radix parameter defaulting to 10.
>
> > Do I allow 0x and 0 (octal) prefixes?
>
> I'd allow 0x (when radix = 0), I'd never allow 0.
>

Some people still use octal. I wouldn't remove support for it.

> > Do I allow + prefix?
>
> Sure, why not?
>
> > Locales?
> > Commas or just digits?
>
> Both handled by a locale-aware variant.
>

That's a good point. If you specify a locale, it's probably because you
want comma support along with language support. Maybe comma support should
be bound to whether or not a locale was requested, with the default fast
path just expecting digits.


------=_Part_7187_12330892.1391611558481--

.

Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Wed, 5 Feb 2014 06:59:05 -0800 (PST) Raw View

------=_Part_7023_16974286.1391612345431
Content-Type: text/plain; charset=UTF-8



On Wednesday, February 5, 2014 9:45:58 AM UTC-5, Matthew Fioravante wrote:
>
> > > Do I allow 0x and 0 (octal) prefixes?
> >
> > I'd allow 0x (when radix = 0), I'd never allow 0.
>
> Some people still use octal. I wouldn't remove support for it.
>
Another way to handle the radix is with bit flags; then you can
specify exactly which radices you want to support. If you don't specify
radix_8, then leading zeroes can simply denote a decimal value with
leading zeroes.

parse("1234", radix_8 | radix_16 | radix_10);

or for brevity we can have a shortcut

parse("1234", radix_all);

If we go with the flags approach, all of the flag handling and branching
should be inlined, even if the underlying function call to do the parsing
is not. This will allow the compiler to optimize out the branches since 99%
of the time, users will be specifying constants rather than collecting
flags in a variable and dynamically choosing how to parse at runtime.
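
A rough sketch of how the flags approach could look. The flag names are taken from the message above; their values, the out-parameter signature, and the worker function parse_digits_in_base are assumptions made for illustration. The inline front end only inspects the prefix and the flags, so with constant flags the compiler has a chance to fold the branches away, while the digit loop lives in a separate function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Flag names follow the message; the values are an assumption.
enum radix_flags : unsigned {
    radix_8   = 1u << 0,
    radix_10  = 1u << 1,
    radix_16  = 1u << 2,
    radix_all = radix_8 | radix_10 | radix_16
};

// Hypothetical worker: parses [first, last) as digits of the given base.
// Leaves `out` untouched on failure. Overflow is not checked.
bool parse_digits_in_base(const char* first, const char* last,
                          unsigned base, long& out) {
    if (first == last) return false;
    long v = 0;
    for (; first != last; ++first) {
        const char c = *first;
        unsigned d;
        if (c >= '0' && c <= '9') d = static_cast<unsigned>(c - '0');
        else if (c >= 'a' && c <= 'f') d = static_cast<unsigned>(c - 'a') + 10;
        else if (c >= 'A' && c <= 'F') d = static_cast<unsigned>(c - 'A') + 10;
        else return false;
        if (d >= base) return false;
        v = v * static_cast<long>(base) + static_cast<long>(d);
    }
    out = v;
    return true;
}

// Inline front end: picks the base from the prefix, restricted by the flags.
inline bool parse(const std::string& s, unsigned flags, long& out) {
    const char* first = s.data();
    const char* last = first + s.size();
    const std::size_t n = s.size();
    if ((flags & radix_16) && n > 2 && first[0] == '0' &&
        (first[1] == 'x' || first[1] == 'X'))
        return parse_digits_in_base(first + 2, last, 16, out);
    if ((flags & radix_8) && n > 1 && first[0] == '0')
        return parse_digits_in_base(first + 1, last, 8, out);
    if (flags & radix_10)
        return parse_digits_in_base(first, last, 10, out);  // leading zeroes are decimal
    return false;  // sketch choice: unprefixed input needs radix_10
}

inline void usage_example() {
    long v = 0;
    assert(parse("017", radix_all, v) && v == 15);            // octal enabled
    assert(parse("017", radix_10 | radix_16, v) && v == 17);  // octal disabled
}
```

Rejecting unprefixed input when radix_10 is absent is a choice of this sketch; the thread does not say what should happen in that case.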


------=_Part_7023_16974286.1391612345431--

.

Author: Thiago Macieira <thiago@macieira.org>
Date: Wed, 05 Feb 2014 08:30:12 -0800 Raw View

On Wed, 05 Feb 2014, at 15:08:04, Olaf van der Spek wrote:
> > Do I allow 0x and 0 (octal) prefixes?
>
> I'd allow 0x (when radix = 0), I'd never allow 0.

Don't deviate from strtoll.

radix = 0 implies prefixes 0 and 0x are recognised. If the library is updated
with a 0b prefix for binary, then that too.
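
For reference, a small sketch of the strtoll behaviour being described. The wrapper name parse_auto_base is hypothetical; std::strtoll with base 0 is the standard call. How a 0b prefix is treated depends on the C library version, so it is deliberately not exercised here.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical thin wrapper: base 0 asks strtoll to infer the base
// from the prefix (0x or 0X for hexadecimal, a leading 0 for octal,
// otherwise decimal).
inline long long parse_auto_base(const char* s) {
    return std::strtoll(s, nullptr, 0);
}

inline void usage_example() {
    assert(parse_auto_base("0x1F") == 31);  // hexadecimal prefix
    assert(parse_auto_base("017") == 15);   // leading 0 means octal
    assert(parse_auto_base("17") == 17);    // no prefix means decimal
}
```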

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358


.

Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Wed, 5 Feb 2014 09:00:53 -0800 (PST) Raw View

------=_Part_5455_11655359.1391619653676
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On Wednesday, February 5, 2014 11:30:12 AM UTC-5, Thiago Macieira wrote:
>
> On Wed, 05 Feb 2014, at 15:08:04, Olaf van der Spek wrote:
> > > Do I allow 0x and 0 (octal) prefixes?
> >
> > I'd allow 0x (when radix = 0), I'd never allow 0.
>
> Don't deviate from strtoll.
>
> radix = 0 implies prefixes 0 and 0x are recognised. If the library is
> updated with 0b prefix for binaries, then that too.
>

Why wait for the library? A 0b prefix is useful and should be supported
anyway.


------=_Part_5455_11655359.1391619653676--

.

Author: Thiago Macieira <thiago@macieira.org>
Date: Wed, 05 Feb 2014 09:55:38 -0800 Raw View

On Wed, 05 Feb 2014, at 09:00:53, Matthew Fioravante wrote:
> > Don't deviate from strtoll.
> >
> > radix = 0 implies prefixes 0 and 0x are recognised. If the library is
> > updated with 0b prefix for binaries, then that too.
>
> Why wait for the library, 0b prefix is useful and should be supported
> anyway.

Because you shouldn't deviate from strtoll. The support should be done first in
strtoll and then in whatever you're proposing. Yes, I know I'm asking you to
convince ISO C and the POSIX standard groups.

The reason being that most C++ standard library implementations will delegate
to strtoll or similar functions (like we do in Qt). Asking for functionality
different from strtoll means asking for more complexity from library
developers.

Alternatively, make sure that strtoll could be implemented on top of a plain C
library routine that is the backend for your new function. That would solve
the problem of complexity.

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358


.

Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Wed, 5 Feb 2014 18:58:11 +0100 Raw View

On Wed, Feb 5, 2014 at 6:55 PM, Thiago Macieira <thiago@macieira.org> wrote:
> On Wed, 05 Feb 2014, at 09:00:53, Matthew Fioravante wrote:
>> > Don't deviate from strtoll.
>> >
>> > radix = 0 implies prefixes 0 and 0x are recognised. If the library is
>> > updated with 0b prefix for binaries, then that too.
>>
>> Why wait for the library, 0b prefix is useful and should be supported
>> anyway.
>
> Because you shouldn't deviate from strtoll. The support should be done
> first in strtoll and then in whatever you're proposing. Yes, I know I'm
> asking you to convince ISO C and the POSIX standard groups.
>
> The reason being that most C++ standard library implementations will
> delegate to strtoll or similar functions (like we do in Qt). Asking for
> functionality different from strtoll means asking for more complexity
> from library developers.

strtoll doesn't support non-NUL-terminated input, does it?



-- 
Olaf


.

Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Wed, 5 Feb 2014 10:33:11 -0800 (PST) Raw View

------=_Part_592_26389433.1391625192169
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On Wednesday, February 5, 2014 12:58:11 PM UTC-5, Olaf van der Spek wrote:
>
> On Wed, Feb 5, 2014 at 6:55 PM, Thiago Macieira <thi...@macieira.org>
> wrote:
> > On Wed, 05 Feb 2014, at 09:00:53, Matthew Fioravante wrote:
> >> > Don't deviate from strtoll.
> >> >
> >> > radix = 0 implies prefixes 0 and 0x are recognised. If the library is
> >> > updated with 0b prefix for binaries, then that too.
> >>
> >> Why wait for the library, 0b prefix is useful and should be supported
> >> anyway.
> >
> > Because you shouldn't deviate from strtoll. The support should be done
> > first in strtoll and then in whatever you're proposing. Yes, I know I'm
> > asking you to convince ISO C and the POSIX standard groups.
> >
> > The reason being that most C++ standard library implementations will
> > delegate to strtoll or similar functions (like we do in Qt). Asking for
> > functionality different from strtoll means asking for more complexity
> > from library developers.
>
> strtoll doesn't support non-nul terminated input does it?
>
Input to strtoll and friends must be null-terminated. Therefore, we cannot
use them as an implementation base for this proposal without performance
degradation (copying to a local null-terminated buffer). Parsing ints is
not too hard to do. For floats, we will need to dig in and see what
strtof() uses under the hood, libmpfr or some such.
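
To make the cost concrete, here is a sketch of the copy-then-delegate workaround, assuming a hypothetical helper parse_via_strtoll that takes a (pointer, length) pair. The std::string copy exists only to supply the terminating null that strtoll needs, and it is exactly the overhead in question. Note that strtoll still skips leading whitespace and stops at the first non-digit, so the helper inherits those rules.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: parse the range [data, data + size), which need not
// be null-terminated, by copying it into a terminated buffer first.
// Leaves `out` untouched on failure.
inline bool parse_via_strtoll(const char* data, std::size_t size,
                              long long& out) {
    const std::string buf(data, size);  // the copy (and possible allocation)
    errno = 0;
    char* end = nullptr;
    const long long v = std::strtoll(buf.c_str(), &end, 10);
    if (end == buf.c_str() || errno == ERANGE) return false;  // no digits or overflow
    out = v;
    return true;
}

inline void usage_example() {
    const char text[] = "1234567";  // parse only the first four characters
    long long v = 0;
    assert(parse_via_strtoll(text, 4, v) && v == 1234);
}
```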

strtoll() also doesn't support locales with commas and all of those
possible variants.

Thiago makes a valid point, though, about implementation burden. Any proposal
would need to include advice for how to implement the library using already
existing C components on the most popular platforms, or if none exist, make
a strong case for why we must reimplement everything from scratch. We can
also petition the authors of these components to support (const char*,
size_t) variants of their functions.

For a third-party library like Qt, it seems like a huge amount of work with
little payoff to reimplement the parsing functions. For the standard
library, it is a possibility if there is good reason for it.


------=_Part_592_26389433.1391625192169--

.

Author: Miro Knejp <miro@knejp.de>
Date: Wed, 05 Feb 2014 21:20:52 +0100 Raw View

This is a multi-part message in MIME format.
--------------060601090404010408010909
Content-Type: text/plain; charset=UTF-8; format=flowed

@Matthew Fioravante:
> For simple types like int, I prefer the convention of leaving the out
> parameter unmodified on failure. That way you can initialize it
> yourself if you want the default behavior. Also I've had situations
> where I'm using parse to update some configuration variable. If the
> parse fails, I want it to fall back on whatever the previous value
> was. The unmodified behavior supports this use case very nicely. I
> don't have to cache the old value in a temporary and reassign it
> on error.
>
> That being said, for expensive types like big_int you're absolutely
> right that this requires the creation of temporaries and destroys any
> performance benefit of the out parameter design.
From an implementation standpoint both are identical: create a
temporary, populate it, and assign (or swap) when done.
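
A minimal sketch of that pattern, assuming a hypothetical from_string for unsigned (the name echoes the signature quoted earlier in the thread, but this body is only an illustration). All work happens on a local temporary, and dest is written exactly once, after the whole input has been accepted. For an unsigned the temporary is free; for something like big_int it is the cost being debated.

```cpp
#include <cassert>
#include <string>

// Hypothetical parser giving the "unmodified on failure" guarantee.
// Accepts decimal digits only; unsigned overflow wraps and is not detected.
inline bool from_string(unsigned& dest, const std::string& src) {
    if (src.empty()) return false;
    unsigned tmp = 0;                          // populate a temporary
    for (char c : src) {
        if (c < '0' || c > '9') return false;  // dest has not been written
        tmp = tmp * 10 + static_cast<unsigned>(c - '0');
    }
    dest = tmp;                                // assign only on success
    return true;
}

inline void usage_example() {
    unsigned port = 8080;                      // previous configuration value
    assert(!from_string(port, "80x0") && port == 8080);  // failure keeps it
    assert(from_string(port, "9090") && port == 9090);   // success replaces it
}
```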
> To avoid the temporary, you have to parse twice, once to check for
> correctness and again to actually load the value.
Which is not possible with InputIterator.
>
> For arbitrary types, now the question becomes what state do you leave
> them in? One option is an indeterminate but valid state. This might be
> the most efficient but a little more error prone. Another option is to
> create a default constructed temporary and swap with it (or copy from
> it). This is safer but may have performance implications if the
> default construction does any real work. Also it requires your type to
> be default constructible and (copy/move assignable or swappable).
>
> Probably the indeterminate state is best as it offers the least
> complications. For ints and floats, we can optimize by giving the
> unmodified guarantee and all of the benefits it entails.
Requiring the type to be default constructible when modifying the value
in-place seems like a really puzzling prerequisite, doesn't it?

If the behavior is not consistent across all parsers for both builtin
and composite std types, chaos will ensue and the committee will probably
never let it through. What you do in implementations for user-defined
types is your own business, but the parsers for builtin and std types
should all provide consistent guarantees and behavior. That kind of
inconsistency makes it close to impossible to write reliable template<T>
methods that make use of parse<T>(). How do you know whether the out
parameter is in any usable state after a failure when T is arbitrary?
This needs to be consistent across the board.

@Matthew Woehlke:
> This iterator is non-copyable? And/or incrementing it is destructive
> to copies of the iterator? (I would hope not the latter, as that is
> terrible API.)
>
> If not, there should not be a problem. (Okay, given it is
> istreambuf_iterator, I suppose I can imagine one or both of the above
> being true. It's not obvious to me from either cplusplus.com or
> cppreference.com if istreambuf_iterator is or is not copyable...)
Just imagine the iterator performs an fread() of 1 character every time
it is incremented, thus advancing the file read position. Copying the
iterator doesn't do anything and is perfectly fine, but as soon as you
increment one of them it affects every other. It's a silly example but
shows exactly how the iterator category works. Its design allows it to
read from an unbuffered data source, which you can do only once without
storing the previous value somewhere first.

[input.iterators]/3 "For input iterators, a == b does not imply ++a ==
++b." and "Algorithms on input iterators should never attempt to pass
through the same iterator twice. They should be /single pass/ algorithms."
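
The effect can be observed with std::istreambuf_iterator itself. The sketch below relies on how that particular iterator is specified (dereferencing reads the stream buffer's current character, incrementing consumes one); generic code must not depend on a stale copy being usable at all. The helper name peek_through_copy is made up for the example.

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>

// Returns what a never-incremented copy sees after the original advanced.
inline char peek_through_copy(const std::string& text) {
    std::istringstream in(text);
    std::istreambuf_iterator<char> a(in);
    std::istreambuf_iterator<char> b = a;  // copy shares the same stream buffer
    ++a;                                   // consumes one character from it
    return *b;                             // b now reads the following character
}

inline void usage_example() {
    // b was copied while the stream stood at 'x', yet it observes 'y'.
    assert(peek_through_copy("xyz") == 'y');
}
```

This is why a "parse once to validate, parse again to load" strategy cannot work on input iterators.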

> I do think we need at least one parsing option; whether or not to
> allow trailing characters.
I don't think that is required. The parser can just stop when it reaches
an invalid character and signal success if the input to that point was
sufficient to create a value. You can inspect the returned iterator to see
whether the end of the input was reached and act accordingly.
This way parse() is very flexible and can work at the core of more
advanced interfaces. There was already mention of a match_integer
method requiring the entire source to represent the value, and once you
have parse() with its one well-defined responsibility it is trivial to
implement such a match_X method on top of it.
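
A sketch of that layering, under the assumption that parse() returns the stop position together with a success flag (the pair return type and the unsigned-only value are choices made for this example, not something the thread has settled). match_integer then only adds the "consumed everything" check.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical parse(): consumes leading decimal digits in a single pass and
// returns where it stopped plus whether at least one digit was read.
// `value` is written only on success; overflow is not checked.
template <class InputIt>
std::pair<InputIt, bool> parse(InputIt first, InputIt last, unsigned& value) {
    unsigned tmp = 0;
    bool any = false;
    for (; first != last && *first >= '0' && *first <= '9'; ++first) {
        tmp = tmp * 10 + static_cast<unsigned>(*first - '0');
        any = true;
    }
    if (any) value = tmp;
    return {first, any};
}

// match_integer on top of parse(): succeed only if the whole range was used.
template <class InputIt>
bool match_integer(InputIt first, InputIt last, unsigned& value) {
    unsigned tmp = 0;
    const std::pair<InputIt, bool> result = parse(first, last, tmp);
    if (!result.second || result.first != last) return false;
    value = tmp;
    return true;
}

inline void usage_example() {
    const std::string s = "123abc";
    unsigned v = 0;
    const auto r = parse(s.begin(), s.end(), v);
    assert(r.second && v == 123 && r.first == s.begin() + 3);  // stops at 'a'
    assert(!match_integer(s.begin(), s.end(), v));             // trailing "abc"
}
```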

@Thiago Macieira:
> Because you shouldn't deviate from strtoll. The support should be done
> first in strtoll and then in whatever you're proposing. Yes, I know I'm
> asking you to convince ISO C and the POSIX standard groups.
>
> The reason being that most C++ standard library implementations will
> delegate to strtoll or similar functions (like we do in Qt). Asking for
> functionality different from strtoll means asking for more complexity
> from library developers.
>
> Alternatively, make sure that strtoll could be implemented on top of a
> plain C library routine that is the backend for your new function. That
> would solve the problem of complexity.
While it makes sense and sounds great, you can only implement it in
terms of strtoll if

 1. The iterators point to contiguous memory, and
 2. The value type is char, and
 3. The range is null-terminated.

There is currently no reliable way to detect 1 and 3. The contiguous
iterator category proposal would solve 1, but 3 requires dereferencing
the end iterator and thus UB. That limits our options a lot. If
implementability with strtoxx is a requirement you can just drop the
iterators, templates, locales and mark this thread as closed since the
limitations of strtoxx started it in the first place.


--------------060601090404010408010909
Content-Type: text/html; charset=UTF-8

<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    @Matthew Fioravante:<br>
    <blockquote
      cite="mid:a3d2ae60-fb33-4bdb-afe5-a5a49fcbc856@isocpp.org"
      type="cite">
      <div dir="ltr">For simple types like int, I prefer the convention
        of leaving the out parameter unmodified on failure. That way you
        can initialize it yourself if you want the default behavior.
        Also I've had situations where I'm using parse to update some
        configuration variable. If the parse fails, I want it to fall
        back on whatever the previous value was. The unmodified behavior
        supports this use case very nicely. I don't have to cache
        the old value in a temporary and reassign it on error.
        <div><br>
        </div>
        <div>That being said, for expensive types like big_int you're
          absolutely right that this requires the creation of
          temporaries and destroys any performance benefit of the out
          parameter design. </div>
      </div>
    </blockquote>
    From an implementation standpoint both are identical: create a
    temporary, populate it, and assign (or swap) when done.
    <blockquote
      cite="mid:a3d2ae60-fb33-4bdb-afe5-a5a49fcbc856@isocpp.org"
      type="cite">
      <div dir="ltr">
        <div>To avoid the temporary, you have to parse twice, once to
          check for correctness and again to actually load the value.</div>
      </div>
    </blockquote>
    Which is not possible with InputIterator.<br>
    <blockquote
      cite="mid:a3d2ae60-fb33-4bdb-afe5-a5a49fcbc856@isocpp.org"
      type="cite">
      <div dir="ltr">
        <div><br>
          For arbitrary types, now the question becomes what state do
          you leave them in? One option is an indeterminate but valid
          state. This might be the most efficient but a little more
          error prone. Another option is to create a default constructed
          temporary and swap with it (or copy from it). This is safer
          but may have performance implications if the default
          construction does any real work. Also it requires your type to
          be default constructible and (copy/move assignable or
          swappable).</div>
        <div><br>
        </div>
        <div>Probably the indeterminate state is best as it offers the
          least complications. For ints and floats, we can optimize by
          giving the unmodified guarantee and all of the benefits it
          entails.</div>
      </div>
    </blockquote>
    Requiring the type to be default constructible when modifying the
    value in-place seems like a really puzzling prerequisite, doesn't
    it?<br>
    <br>
    If the behavior is not consistent across all parsers for both
    builtin and composite std types, chaos will ensue and the committee
    will probably never let it through. What you do in implementations
    for user defined types is your own business, but the parsers for
    builtin and std types should all provide consistent guarantees and
    behavior. That kind of inconsistency makes it close to impossible to
    write reliable template&lt;T&gt; methods that make use of
    parse&lt;T&gt;(). How do you know whether the out parameter is in
    any usable state after a failure when T is arbitrary? This needs to
    be consistent across the board.<br>
    <br>
    @Matthew Woehlke:<br>
    <blockquote type="cite">This iterator is non-copyable? And/or
      incrementing it is destructive to copies of the iterator? (I would
      hope not the latter, as that is terrible API.)<br>
      <br>
      If not, there should not be a problem. (Okay, given it is
      istreambuf_iterator, I suppose I can imagine one or both of the
      above being true. It's not obvious to me from either cplusplus.com
      or cppreference.com if istreambuf_iterator is or is not
      copyable...) </blockquote>
    Just imagine the iterator performs an fread() of 1 character every
    time it is incremented, thus incrementing the file read position.
    Copying the iterator doesn't do anything and is perfectly fine but
    as soon as you increment one of them it affects every other. It's a
    silly example but shows exactly how the iterator category works.
    Its design allows it to read from an unbuffered data source, which
    you can do only once without storing the previous value somewhere
    first.<br>
    <br>
    [input.iterators]/3 "For input iterators, a == b does not imply ++a
    == ++b." and "Algorithms on input iterators should never attempt to
    pass through the same iterator twice. They should be <i>single pass</i>
    algorithms."<br>
    <br>
    <blockquote type="cite">I do think we need at least one parsing
      option; whether or not to allow trailing characters. </blockquote>
    I don't think that is required. The parser can just stop when it
    reaches an invalid character and signal success if the input to that
    point was sufficient to create a value. You can inspect the returned
    iterator whether the end of the input was reached or not and act
    accordingly. This way parse() is very flexible and can work at the
    core of more advanced interfaces. There was already mentioning of a
    match_integer method requiring the entire source to represent the
    value and once you have parse() with it's one well defined
    reponsibility it is trivial to implement such a match_X method on
    top of it.<br>
    <br>
    @Thiago Macieira:<br>
    <blockquote type="cite">Because you shouldn't deviate from strtoll.
      The support should be done first in <br>
      strtoll and then in whatever you're proposing. Yes, I know I'm
      asking you to <br>
      convince ISO C and the POSIX standard groups.<br>
      <br>
      The reason being that most C++ standard library implementations
      will delegate <br>
      to strtoll or similar functions (like we do in Qt). Asking for
      functionality <br>
      different from strtoll means asking for more complexity from
      library <br>
      developers.<br>
      <br>
      Alternatively, make sure that strtoll could be implemented on top
      of a plain C <br>
      library routine that is the backend for your new function. That
      would solve <br>
      the problem of complexity.</blockquote>
    While it makes sense and sounds great, you can only implement it in
    terms of strtoll if<br>
    <ol>
      <li>The iterators point to contiguous memory, and</li>
      <li>The value type is char, and<br>
      </li>
      <li>The range is null terminated.</li>
    </ol>
    <p>There is currently no reliable way to detect 1 and 3. The
      contiguous iterator category proposal would solve 1, but 3
      requires dereferencing the end iterator, which is UB. That limits
      our options a lot. If implementability with strtoxx is a
      requirement you can just drop the iterators, templates, locales
      and mark this thread as closed, since the limitations of strtoxx
      started it in the first place.<br>
    </p>
  </body>
</html>



.

Author: Thiago Macieira <thiago@macieira.org>
Date: Wed, 05 Feb 2014 14:30:54 -0800 Raw View

On Wed, 05 Feb 2014, at 21:20:52, Miro Knejp wrote:
> > The reason being that most C++ standard library implementations will
> > delegate
> > to strtoll or similar functions (like we do in Qt). Asking for
> > functionality
> > different from strtoll means asking for more complexity from library
> > developers.
> >
> > Alternatively, make sure that strtoll could be implemented on top of a
> > plain C
> > library routine that is the backend for your new function. That would
> > solve
> > the problem of complexity.
>
> While it makes sense and sounds great, you can only implement it in
> terms of strtoll if
>
>  1. The iterators point to contiguous memory, and
>  2. The value type is char, and
>  3. The range is null terminated.
>
> There is currently no reliable way to detect 1 and 3. The contiguous
> iterator category proposal would solve 1, but 3 requires dereferencing
> the end iterator and thus UB. That limits our options a lot. If
> implementability with strtoxx is a requirement you can just drop the
> iterators, templates, locales and mark this thread as closed since the
> limitations of strtoxx started it in the first place.

If you go for my alternative proposal, then you skip the need for 3. That is,
implementing strtoll on top of a plain C function that operates on contiguous
memory and receives a begin and end pointer.

However, I do think 1 and 2 are *reasonable*. I think the whole discussion
about input iterators that has gone on for the past few days is unnecessary.
Simply force people to read into a contiguous-memory buffer of one of the base
char types.

Specifically: I'd like this discussion to assume that the number parsing code
is implemented out-of-line. Inline parsing is, forgive me for saying so, nuts.
You could maybe do it for integers, but you'd never do it for floating-point
results.

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358


.

Author: Thiago Macieira <thiago@macieira.org>
Date: Wed, 05 Feb 2014 14:33:48 -0800 Raw View

On Wed, 05 Feb 2014, at 10:33:11, Matthew Fioravante wrote:
> For a 3rd party library like QT, it seems like a huge amount of work with
> little payoff to reimplement the parsing functions. For the standard
> library, it is a possibility if there is good reason for it.

We actually have to do it. Qt carries a copy of strtoll, strtoull and strtod
from FreeBSD because the regular C functions are unusable -- they're
locale-dependent. The _l versions of those functions are present from
POSIX.1-2008, but we can't rely on them.

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358


.

Author: Thiago Macieira <thiago@macieira.org>
Date: Wed, 05 Feb 2014 14:40:45 -0800 Raw View

On Wed, 05 Feb 2014, at 18:58:11, Olaf van der Spek wrote:
> strtoll doesn't support non-nul terminated input does it?

Then go for the alternate proposal: strtoll implemented on top of the backend
function:

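/* Proposed backend: like strtoll, but takes an explicit length
   instead of requiring a NUL-terminated string. */
long long strntoll(const char *nptr, size_t len, char **endptr, int base);

long long strtoll(const char *nptr, char **endptr, int base)
{
    /* Delegate to the length-taking backend, not to strtoll itself. */
    return strntoll(nptr, strlen(nptr), endptr, base);
}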

Let me emphasise again: you do not want to have too many copies of those
functions lying around, especially not strtod. It's a highly complex piece of
code. It's definitely not suitable for inlining.

So I'll repeat what I said in the other email: assume that the implementation
contains an out-of-line backend. If you can't do that in the proposal, I'd say
its chances of passing the committee, much less of being implemented properly,
are very slim.
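
A minimal sketch of the layering described above, for illustration only. The
names strntoll_sketch and strtoll_sketch are hypothetical (no such functions
exist in any C library), only base 10 is handled, ASCII is assumed, and the
endptr parameter is simplified to const char**. In a real library the backend
would be compiled once, out of line, in a single .cpp file.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstddef>
#include <cstring>

// Hypothetical length-taking backend (illustrative name and signature).
long long strntoll_sketch(const char* nptr, std::size_t len, const char** endptr)
{
    assert(nptr != nullptr);
    const char* p = nptr;
    const char* const last = nptr + len;

    // Skip leading whitespace as strtoll does (ASCII assumed).
    while (p != last && (*p == ' ' || (*p >= '\t' && *p <= '\r')))
        ++p;

    bool negative = false;
    if (p != last && (*p == '+' || *p == '-')) {
        negative = (*p == '-');
        ++p;
    }

    // Accumulate as a non-positive number so LLONG_MIN stays representable.
    const long long limit = negative ? LLONG_MIN : -LLONG_MAX;
    long long acc = 0;
    bool overflow = false;
    const char* const digits = p;
    for (; p != last && *p >= '0' && *p <= '9'; ++p) {
        const int d = *p - '0';
        if (overflow || acc < (limit + d) / 10)
            overflow = true;          // keep consuming digits, like strtoll
        else
            acc = acc * 10 - d;
    }

    if (p == digits) {                // no digits: nothing was converted
        if (endptr) *endptr = nptr;
        return 0;
    }
    if (endptr) *endptr = p;
    if (overflow) {
        errno = ERANGE;
        return negative ? LLONG_MIN : LLONG_MAX;
    }
    return negative ? acc : -acc;
}

// NUL-terminated front end implemented on top of the backend.
long long strtoll_sketch(const char* nptr, const char** endptr)
{
    return strntoll_sketch(nptr, std::strlen(nptr), endptr);
}
```

With this split, a string_view-style caller passes data() and size() straight
to the backend, and requirement 3 (null termination) only matters for the
thin NUL-terminated wrapper.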
-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358


.

Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Wed, 5 Feb 2014 18:26:50 -0800 (PST) Raw View

On Wednesday, February 5, 2014 5:30:54 PM UTC-5, Thiago Macieira wrote:
>
> On Wed, 05 Feb 2014, at 21:20:52, Miro Knejp wrote:
> > > The reason being that most C++ standard library implementations will
> > > delegate
> > > to strtoll or similar functions (like we do in Qt). Asking for
> > > functionality
> > > different from strtoll means asking for more complexity from library
> > > developers.
> > >
> > > Alternatively, make sure that strtoll could be implemented on top of a
> > > plain C
> > > library routine that is the backend for your new function. That would
> > > solve
> > > the problem of complexity.
> >
> > While it makes sense and sounds great, you can only implement it in
> > terms of strtoll if
> >
> >  1. The iterators point to contiguous memory, and
> >  2. The value type is char, and
> >  3. The range is null terminated.
> >
> > There is currently no reliable way to detect 1 and 3. The contiguous
> > iterator category proposal would solve 1, but 3 requires dereferencing
> > the end iterator and thus UB. That limits our options a lot. If
> > implementability with strtoxx is a requirement you can just drop the
> > iterators, templates, locales and mark this thread as closed since the
> > limitations of strtoxx started it in the first place.
>
> If you go for my alternative proposal, then you skip the need for 3. That
> is,
> implementing strtoll on top of a plain C function that operates on
> contiguous
> memory and receives a begin and end pointer.
>
> However, I do think 1 and 2 are *reasonable*.

I'm fine with 1 and 2 if it turns out genericity is too hard. Possibly with
overloads for the other character types.

> I think the whole discussion
> about input iterators that this discussion has gone on for the past few
> days
> is unnecessary. Simply force people to read into a contiguous-memory
> buffer of
> one of the base char types.
>

In all honesty I only care about string_view. Generic iterators are nice
(maybe so we can support vector<char> if there's a compelling reason to use
one for something??), but I'll be happily using this interface with
string_view all the time and probably little else.

>
> In specific: I'd like this discussion to assume that the number parsing
> code is
> implemented out-of-line. Inline parsing is, forgive me for saying so,
> nuts.
> You could maybe do it for integers, but you'd never do it for
> floating-point
> results.
>

I think at this point we need to do a real study into implementations to
answer these questions accurately, particularly with floating point as
that's the real bear. How is strtof() implemented on different platforms?
What do gcc and other compilers use for floating point literals? Also, you
voted against the binary 0b prefix because it's not supported by strtol(), but
something is going to be/already is written to support C++14 binary
literals. That something can be leveraged here.


.

Author: Thiago Macieira <thiago@macieira.org>
Date: Wed, 05 Feb 2014 18:50:38 -0800 Raw View

On Wed, 05 Feb 2014, at 18:26:50, Matthew Fioravante wrote:
> I think at this point we need to do a real study into implementations to
> answer these questions accurately, particularly with floating point as
> that's the real bear. How is strtof() implemented on different platforms?
> What do gcc and other compilers use for floating point literals? Also you
> voted against binary 0b prefix because its not supported by strtol(), but
> something is going to be/already is written to support C++14 binary
> literals. That something can be leveraged here.

I would prefer that the functionality match strtol, which means convincing the
C guys that they should also parse 0b (maybe also convince them to add it to
their language).

But the important thing is that strtol and this proposal share a backend. It is
possible to disable just the 0b detection to still keep the required POSIX
functionality. Or some library developers may call it "an extension" and go
with it.
-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358


.

Author: Magnus Fromreide <magfr@lysator.liu.se>
Date: Thu, 6 Feb 2014 08:36:10 +0100 Raw View

On Wed, Feb 05, 2014 at 02:30:54PM -0800, Thiago Macieira wrote:
> On Wed, 05 Feb 2014, at 21:20:52, Miro Knejp wrote:
> > > The reason being that most C++ standard library implementations will
> > > delegate
> > > to strtoll or similar functions (like we do in Qt). Asking for
> > > functionality
> > > different from strtoll means asking for more complexity from library
> > > developers.
> > >
> > > Alternatively, make sure that strtoll could be implemented on top of a
> > > plain C
> > > library routine that is the backend for your new function. That would
> > > solve
> > > the problem of complexity.
> >
> > While it makes sense and sounds great, you can only implement it in
> > terms of strtoll if
> >
> >  1. The iterators point to contiguous memory, and
> >  2. The value type is char, and
> >  3. The range is null terminated.
> >
> > There is currently no reliable way to detect 1 and 3. The contiguous
> > iterator category proposal would solve 1, but 3 requires dereferencing
> > the end iterator and thus UB. That limits our options a lot. If
> > implementability with strtoxx is a requirement you can just drop the
> > iterators, templates, locales and mark this thread as closed since the
> > limitations of strtoxx started it in the first place.
>
> If you go for my alternative proposal, then you skip the need for 3. That is,
> implementing strtoll on top of a plain C function that operates on contiguous
> memory and receives a begin and end pointer.
>
> However, I do think 1 and 2 are *reasonable*. I think the whole discussion
> about input iterators that this discussion has gone on for the past few days
> is unnecessary. Simply force people to read into a contiguous-memory buffer of
> one of the base char types.

I would consider it unfortunate if

parse(istream_iterator(cin), istream_iterator())

didn't work.

My use case involves parsing data from part-wise contiguous containers (think
std::deque) where the ability to erase the head of the container is useful.


I further think that forcing the value type to be 'char' is unreasonable,
especially given that there are locales with wide decimal point and
thousands separator (ps_AF).


> In specific: I'd like this discussion to assume that the number parsing code is
> implemented out-of-line. Inline parsing is, forgive me for saying so, nuts.
> You could maybe do it for integers, but you'd never do it for floating-point
> results.

I agree. Maybe something along the lines of

struct parser {
    typedef /* implementation-defined */ code_point;
    status_type input(code_point symbol);
    pair<status_type, code_point*>
    input(code_point* begin, code_point* end);
    long long value() const;
};

Here, code_point is typically the widest character type, but I wouldn't
be opposed to further overloads for the input member; one could e.g. imagine
overloads for char.
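
One way the interface above could be fleshed out, as a rough sketch under
stated assumptions: the status_type values, the choice of char32_t for
code_point, the const-qualified pointers and the class name int_parser are all
illustrative and not part of the outline. Only ASCII digits and a leading sign
are recognised.

```cpp
#include <cassert>
#include <climits>
#include <utility>

// Hypothetical status codes; the outline above leaves status_type unspecified.
enum class status_type { accepted, rejected, overflow };

class int_parser {
public:
    typedef char32_t code_point;  // assumption: the widest character type

    // Feed one symbol. "rejected" means the symbol is not part of the number
    // and was not consumed; the value parsed so far stays available.
    status_type input(code_point symbol)
    {
        if (!started_ && (symbol == U'+' || symbol == U'-')) {
            started_ = true;
            negative_ = (symbol == U'-');
            return status_type::accepted;
        }
        if (symbol < U'0' || symbol > U'9')
            return status_type::rejected;
        started_ = true;
        const int d = static_cast<int>(symbol - U'0');
        // Accumulate as a non-positive number so LLONG_MIN stays representable.
        const long long limit = negative_ ? LLONG_MIN : -LLONG_MAX;
        if (acc_ < (limit + d) / 10)
            return status_type::overflow;
        acc_ = acc_ * 10 - d;
        return status_type::accepted;
    }

    // Feed a contiguous chunk; returns the status and the first unconsumed symbol.
    std::pair<status_type, const code_point*>
    input(const code_point* begin, const code_point* end)
    {
        assert(begin <= end);
        status_type s = status_type::accepted;
        for (; begin != end; ++begin) {
            s = input(*begin);
            if (s != status_type::accepted)
                break;
        }
        return std::make_pair(s, begin);
    }

    long long value() const { return negative_ ? acc_ : -acc_; }

private:
    long long acc_ = 0;       // negated magnitude of the number so far
    bool negative_ = false;
    bool started_ = false;    // true once a sign or digit has been seen
};
```

Because the parser keeps its state between calls, the chunks of a deque-like
container can be fed one after another, and each chunk can be erased once the
call returns accepted with the end pointer.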

/MF


.

Author: Thiago Macieira <thiago@macieira.org>
Date: Thu, 06 Feb 2014 00:06:42 -0800 Raw View

On Thu, 06 Feb 2014, at 08:36:10, Magnus Fromreide wrote:
> I would consider it unlucky if
>
> parse(istream_iterator(cin), istream_iterator())
>
> didn't work.
>
> My use case involves parsing data from part-wise contiguous containers (think
> std::deque) where the ability to erase the head of the container is useful.

If we can support that, by all means.

However, I would consider it a showstopper if

parse("1234");

had to be done all inline.

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358


.

Author: Miro Knejp <miro@knejp.de>
Date: Fri, 07 Feb 2014 21:18:41 +0100 Raw View

On 06.02.2014 09:06, Thiago Macieira wrote:
> On Thu, 06 Feb 2014, at 08:36:10, Magnus Fromreide wrote:
>> I would consider it unlucky if
>>
>> parse(istream_iterator(cin), istream_iterator())
>>
>> didn't work.
>>
>> My use case involves parsing data from part-wise contiguous containers (think
>> std::deque) where the ability to erase the head of the container is useful.
> If we can support that, by all means.
>
> However, I would consider it a showstopper if
>
> parse("1234");
>
> had to be done all inline.
Speaking of floats, which part is the more complex/bloated one:
extracting digits and symbols from the input, or assembling them into a
floating point value with minimal rounding errors, etc.? The latter can
easily be separated into a stateful object that is fed the numerical
values (i.e. digit values) and semantics (i.e. sign, comma, exponent
indicators) of the input, at which point input encodings, character types
or locales are already translated to a neutral subset. Some part of the
numeric parsers certainly needs to be inline but some can be implemented
out-of-line.
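
To make that split concrete, here is a rough sketch. The names
decimal_assembler, parse_decimal and read_decimal are hypothetical, exponents
and locale-specific separators are left out, and the assembly step is
deliberately naive: it is NOT correctly rounded the way a real strtod is. The
point is only the division of labour: the template front end is the part that
has to be inline, while the assembler knows nothing about character types and
could live in a .cpp file.

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <utility>

// Would be out of line in a real library: no templates, no character types.
class decimal_assembler {
public:
    void negate() { negative_ = true; }
    void decimal_point() { in_fraction_ = true; }
    void digit(int d)
    {
        assert(d >= 0 && d <= 9);
        mantissa_ = mantissa_ * 10.0 + d;
        if (in_fraction_)
            scale_ *= 10.0;
        any_digit_ = true;
    }
    bool has_value() const { return any_digit_; }
    // Naive assembly; a real implementation needs correct rounding.
    double result() const
    {
        const double v = mantissa_ / scale_;
        return negative_ ? -v : v;
    }

private:
    double mantissa_ = 0.0;
    double scale_ = 1.0;
    bool negative_ = false;
    bool in_fraction_ = false;
    bool any_digit_ = false;
};

// Inline front end: single pass, any character type, any input iterator.
// Returns the stop position and a success flag; out is untouched on failure.
template <class InputIt>
std::pair<InputIt, bool> parse_decimal(InputIt first, InputIt last, double& out)
{
    typedef typename std::iterator_traits<InputIt>::value_type char_type;
    decimal_assembler assembler;
    bool seen_point = false;
    if (first != last && *first == char_type('-')) {
        assembler.negate();
        ++first;
    }
    for (; first != last; ++first) {
        const char_type c = *first;
        if (c >= char_type('0') && c <= char_type('9')) {
            assembler.digit(static_cast<int>(c - char_type('0')));
        } else if (c == char_type('.') && !seen_point) {
            assembler.decimal_point();
            seen_point = true;
        } else {
            break;  // first symbol that is not part of the number
        }
    }
    if (assembler.has_value())
        out = assembler.result();
    return std::make_pair(first, assembler.has_value());
}

// Usage with a single-pass source such as a stream buffer.
inline bool read_decimal(std::istream& in, double& out)
{
    typedef std::istreambuf_iterator<char> iter;
    return parse_decimal(iter(in), iter(), out).second;
}
```

Since the front end never revisits a position, it works with input iterators;
the cost is that it cannot back out of symbols it has already consumed, such
as a decimal point that turns out to have no digits after it.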


.

Author: Miro Knejp <miro@knejp.de>
Date: Sat, 08 Feb 2014 01:47:11 +0100 Raw View

On 08.02.2014 00:22, Bengt Gustafsson wrote:
> - The simplest code "works" until users happen to type a leading or
> trailing space on that important demo.
Well that's user input handling and could fill its own book. But that's
a question of sensible defaults or using the right flags/overloads.
>
> - In what situation is it important to give an error message if there
> is whitespace? What can go wrong in a real application if the
> whitespace is ignored? I fail to see those cases other than very
> marginal. I mean, even if you have speced a file format to forbid
> spaces (for some reason) you can be quite certain that the other guy
> interfacing to you will send you spaces anyway. What good does it do
> to anyone to fail in this case?
>
<Number>
     12345
</Number>
Your schema tells your validator that the Number element is an integer,
so you use (a whitespace skipping) parse<int>() to get its value. Now
you skipped the newline after the opening tag and your line counters are
wrong which may cause other stuff to break. Same as your example above,
just inverted.

Silly example? Maybe. But specifications and text based communication
protocols do exist for which the certification process strictly requires
to only accept valid input and error otherwise. Leading whitespaces are
not valid input. Not even a leading + sign for positive numbers. "Oh
wait, your application accepts these? No certificate for you, go home."
The aviation industry is full of these but unfortunately I can't tell
you how they look because they are covered by NDAs. My point is that the
use case exists and therefore should be supported. Who has the
omniscience to judge whether it's common enough to be justified or not?
I certainly have uses for both variations in enough projects.

> And no one has said that we should not provide a "strict" mode/return
> flag/input flag or something to cater for these cases.
I'm aware of that. What I don't like is the attempt to squeeze
*everything* into a single method. My fear is it makes the call site
convoluted and hard to read and for some modes a dedicated overload with
a fitting and descriptive name would be more beneficial. What the
sensible defaults for whatever comes out of this discussion are is an
entirely different topic.
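
As an illustration of dedicated overloads with descriptive names, here is a
minimal sketch. All three function names are hypothetical, and unsigned values
over a char range are used only to keep it short. A core parse_uint does one
thing, and a strict and a lenient variant are layered on top of it.

```cpp
#include <cassert>
#include <climits>

// Core routine with one responsibility: parse the digits at the front of
// [first, last). Returns the position after the last digit consumed, or
// first if there were no digits or the value would overflow.
// out is written only when at least one digit was parsed.
const char* parse_uint(const char* first, const char* last, unsigned& out)
{
    assert(first <= last);
    unsigned value = 0;
    const char* p = first;
    for (; p != last && *p >= '0' && *p <= '9'; ++p) {
        const unsigned d = static_cast<unsigned>(*p - '0');
        if (value > (UINT_MAX - d) / 10)
            return first;  // would overflow
        value = value * 10 + d;
    }
    if (p != first)
        out = value;
    return p;
}

// Strict: the whole range must be digits. No whitespace, no '+' sign.
bool match_uint(const char* first, const char* last, unsigned& out)
{
    unsigned tmp = 0;
    const char* end = parse_uint(first, last, tmp);
    if (end == first || end != last)
        return false;  // out stays untouched on failure
    out = tmp;
    return true;
}

// Lenient: spaces around the number are ignored, then the strict rule applies.
bool match_uint_trimmed(const char* first, const char* last, unsigned& out)
{
    while (first != last && *first == ' ')
        ++first;
    while (last != first && *(last - 1) == ' ')
        --last;
    return match_uint(first, last, out);
}
```

The call site then says which behaviour it wants by name, rather than through
a combination of flags on one function.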
>
> I mean, "getting it right" must mean that it is easy to use and works
> as expected. All other number converters in all languages I know of
> eat leading spaces. Most of them can't even tell you if there were any!
That doesn't imply we should therefore apply the same limitations. It
also doesn't mean there must be only one single method for everything.
Always RTFM when using a function you don't know.



.

Author: Thiago Macieira <thiago@macieira.org>
Date: Fri, 07 Feb 2014 18:59:56 -0800 Raw View

On Fri, 07 Feb 2014, at 21:18:41, Miro Knejp wrote:
> Speaking floats, which part is the more complex/bloated one:
> Extracting digits and symbols from the input or assembling them into a
> floating point value with minimal rounding errors, etc? The latter can
> easily be separated into a stateful object that is fed the numerical
> values (i.e. digit values) and semantics (i.e. sign, comma, exponent
> indicators) of the input at which point input encodings, character types
> or locales are already translated to a neutral subset. Some part of the
> numeric parsers certainly needs to be inline but some can be implemented
> out-of-line.

That could be done. As I said, the requirement is that this code is not inline
and not templated. It must exist in a .cpp file not visible to the user.

If you want to pass a traits object that specifies how to recognise digits,
decimals, thousands separators, exponents, plus a function to get the next
digit, by all means.

Here's an implementation of strtod to get you started (freely licensed):

http://code.google.com/p/freebsd/source/browse/contrib/gdtoa/strtod.c
http://code.google.com/p/freebsd/source/browse/contrib/gdtoa/gdtoaimp.h

If you manage to do that, I'll be very interested in the code. Right now, to
parse a UTF-16 number in Qt, we must first convert it to Latin1, which means
allocating memory, which means I can't make those functions noexcept.

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358


.

Author: Miro Knejp <miro@knejp.de>
Date: Wed, 12 Feb 2014 04:08:21 +0100 Raw View

Am 08.02.2014 03:59, schrieb Thiago Macieira:
> Em sex 07 fev 2014, =E0s 21:18:41, Miro Knejp escreveu:
>> Speaking floats, which part is the more complex/bloated one:
>> Extracting digits and symbols from the input or assembling them into a
>> floating point value with minimal rounding errors, etc? The latter can
>> easily be separated into a stateful object that is fed the numerical
>> values (i.e. digit values) and semantics (i.e. sign, comma, exponent
>> indicators) of the input at which point input encodings, character types
>> or locales are already translated to a neutral subset. Some part of the
>> numeric parsers certainly needs to be inline but some can be implemented
>> out-of-line.
> That could be done. As I said, the requirement is that this code is not i=
nline
> and not templated. It must exist in a .cpp file not visible to the user.
>
> If you want to pass a traits object that specifies how to recognise digit=
s,
> decimals, thousands separators, exponents, plus a function to get the nex=
t
> digit, by all means.
>
> Here's an implementation of strtod to get you started (freely licensed):
>
> http://code.google.com/p/freebsd/source/browse/contrib/gdtoa/strtod.c
> http://code.google.com/p/freebsd/source/browse/contrib/gdtoa/gdtoaimp.h
>
> If you manage to do that, I'll be very interested in the code. Right now,=
 to
> parse a UTF-16 number in Qt, we must first convert it to Latin1, which me=
ans
> allocating memory, which means I can't make those functions noexcept.
Well that's a problem of codecvt and friends. There's no iterative
interface to avoid said allocations and if there was, it would most
likely require a virtual call per input character. It would certainly
land in the cache and branch prediction would help, too. But that's just
how cultures/encodings complicate things. A language-neutral fast path
ASCII overload would not suffer from these drawbacks. I see this working
in two steps. First, translate the next X input characters into a
Unicode codepoint and second, translate that character into a
digit/separator/decimal/etc. The former potentially needs allocations
and depends only on the encoding, the latter depends on
language/culture. If codecvt had an iterative interface one could at
least measure what dominates: the virtual calls, the allocation, or
assembling the actual float correctly.

But as long as codecvt does not have a method to consume characters
incrementally I see no way to go without some sort of temporary output
buffer.
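
To make the split between the two parts concrete, here is a rough sketch of the language-neutral ASCII fast path feeding a stateful assembler. All names (Symbol, DecimalAssembler, parse_ascii_decimal) are made up for illustration and are not part of any proposal. The assembler only accepts up to 15 digits with an optional sign and decimal point, so that both the mantissa and the power of ten are exactly representable as double and a single division gives a correctly rounded result; exponents, separators and longer inputs are left out.

```cpp
#include <cassert>
#include <cstdint>

// Encoding-neutral symbols the front end may report (hypothetical names).
enum class Symbol { Plus, Minus, DecimalPoint };

// Stateful assembler. Nothing here depends on the character type, so the
// member definitions could be moved out-of-line into a .cpp file.
class DecimalAssembler {
public:
    void digit(unsigned d) {
        assert(d < 10);
        if (digits_ == 15) { bad_ = true; return; }  // keep mantissa below 2^53
        mantissa_ = mantissa_ * 10 + d;
        ++digits_;
        if (seen_point_) ++frac_digits_;
    }
    void symbol(Symbol s) {
        if (s == Symbol::DecimalPoint) {
            if (seen_point_) bad_ = true;             // second decimal point
            seen_point_ = true;
        } else {
            // A sign is only accepted once, before any digit or point.
            if (digits_ != 0 || seen_point_ || seen_sign_) bad_ = true;
            seen_sign_ = true;
            negative_ = (s == Symbol::Minus);
        }
    }
    bool ok() const { return !bad_ && digits_ > 0; }
    double value() const {
        double scale = 1.0;                           // 10^frac_digits_, exact
        for (int i = 0; i < frac_digits_; ++i) scale *= 10.0;
        const double magnitude = static_cast<double>(mantissa_) / scale;
        return negative_ ? -magnitude : magnitude;
    }
private:
    std::uint64_t mantissa_ = 0;
    int digits_ = 0;
    int frac_digits_ = 0;
    bool negative_ = false;
    bool seen_sign_ = false;
    bool seen_point_ = false;
    bool bad_ = false;
};

// Thin inline front end: the only part templated on the character type.
// dest is written only if the whole range parses successfully.
template <class Char>
bool parse_ascii_decimal(const Char* first, const Char* last, double& dest) {
    DecimalAssembler assembler;
    for (; first != last; ++first) {
        const Char c = *first;
        if (c >= Char('0') && c <= Char('9'))
            assembler.digit(static_cast<unsigned>(c - Char('0')));
        else if (c == Char('+')) assembler.symbol(Symbol::Plus);
        else if (c == Char('-')) assembler.symbol(Symbol::Minus);
        else if (c == Char('.')) assembler.symbol(Symbol::DecimalPoint);
        else return false;                            // not in the neutral subset
    }
    if (!assembler.ok()) return false;
    dest = assembler.value();
    return true;
}
```

The same front end works for char, char16_t and char32_t input as long as the text is in the ASCII subset; a locale-aware variant would replace the character tests, not the assembler.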

.

Author: David Krauss <potswa@gmail.com>
Date: Wed, 12 Feb 2014 11:22:28 +0800

On Feb 12, 2014, at 11:08 AM, Miro Knejp <miro@knejp.de> wrote:

> Well that's a problem of codecvt and friends.
...
> But as long as codecvt does not have a method to consume characters
> incrementally I see no way to go without some sort of temporary output
> buffer.

Codecvt uses a user-supplied buffer of type mbstate_t to allow incremental
processing. However, because mbstate_t is entirely implementation-dependent
(aside from being POD), there's no way for the user to define their own
stateful codecvt. It's likely that Qt uses something else, although I don't
know much about it.
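
For illustration, a minimal sketch of that calling pattern: std::codecvt::in is called repeatedly with the same std::mbstate_t and a small fixed-size output buffer, so no heap allocation is needed on the caller's side. The function name convert_in_chunks and the sink callback are made up for this example, and whether the facet itself allocates internally is up to the implementation.

```cpp
#include <cstddef>
#include <cwchar>
#include <locale>

// Converts n narrow characters to wide characters in several calls,
// passing each converted character to sink. Returns false on a
// conversion error or if no progress can be made (e.g. truncated input).
template <class Sink>
bool convert_in_chunks(const std::locale& loc, const char* src,
                       std::size_t n, Sink sink) {
    using Cvt = std::codecvt<wchar_t, char, std::mbstate_t>;
    const Cvt& cvt = std::use_facet<Cvt>(loc);

    std::mbstate_t state{};          // conversion state carried across calls
    const char* from = src;
    const char* const from_end = src + n;
    wchar_t out[16];                 // fixed-size temporary buffer on the stack

    while (from != from_end) {
        const char* from_next = from;
        wchar_t* to_next = out;
        const std::codecvt_base::result r =
            cvt.in(state, from, from_end, from_next, out, out + 16, to_next);
        if (r == std::codecvt_base::error) return false;
        for (const wchar_t* p = out; p != to_next; ++p) sink(*p);
        // Neither input consumed nor output produced: give up.
        if (from_next == from && to_next == out) return false;
        from = from_next;            // resume where the facet stopped
    }
    return true;
}
```

When the output buffer fills up before the input is exhausted, in() reports partial and the loop simply calls it again from from_next.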

.

Author: Thiago Macieira <thiago@macieira.org>
Date: Tue, 11 Feb 2014 19:40:01 -0800

On Wed, 12 Feb 2014, at 11:22:28, David Krauss wrote:
> It's likely that Qt uses something else, although I don't know much about
> it.

QStrings are always UTF-16 encoded and the conversion from UTF-16 to Latin1 is
highly optimised. char16_t strings and char32_t strings also have a very
well-defined encoding. There's no problem working with them, since conversion
can be done easily. wchar_t strings have an implementation-defined encoding,
but it's the same people who decide that encoding as the people who will write
the conversion function, so it's no problem either.

The problem is only for char strings, which can be multibyte and whose
encoding can vary at runtime. Though often enough such encoding is compatible
with ASCII and, therefore, the implementation can ignore the multibyte data.

Anyway, note that this discussion did not start about encodings. I don't think
that dealing with the four character types is a problem. It's just more work
for the implementation developers.

I was talking about parsing a non-contiguous block of data and depending on
the iterator. Parsing character by character (whichever character type) is the
problem here.
-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358

.

Author: David Krauss <potswa@gmail.com>
Date: Wed, 12 Feb 2014 12:28:19 +0800

On Feb 12, 2014, at 11:40 AM, Thiago Macieira <thiago@macieira.org> wrote:

> I was talking about parsing a non-contiguous block of data and depending on
> the iterator. Parsing character by character (whichever character type) is the
> problem here.

I've not been following the discussion but just saw the short message making
a false assertion about codecvt.

Now that it's been mentioned, codecvt isn't such a bad parsing model, when
the input doesn't fit in memory and you want to make multiple calls. However,
that's not the situation for numeric conversion. I think what Miro intended
to say is std::num_get, which is stateless.

Looking at num_get now, it's templated over a generic input iterator type so
you could define a discontiguous range that way. But, that only gets you the
C locale. You cannot go from a std::locale object to a num_get-compatible
template or arbitrary specialization.
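
A small sketch of that, using std::deque<char> as a stand-in for discontiguous storage. The facet is constructed directly instead of being looked up with std::use_facet, and a std::ios without a stream buffer is used only to carry the format flags and the classic locale. DirectNumGet and parse_double are names invented for this example.

```cpp
#include <deque>
#include <ios>
#include <locale>

// std::num_get has a protected destructor, so a trivially derived class
// is needed in order to create a facet object directly.
template <class InputIt>
struct DirectNumGet : std::num_get<char, InputIt> {};

// Parses a double from a deque without copying it into contiguous memory
// first. dest is written only on success.
inline bool parse_double(const std::deque<char>& text, double& dest) {
    using It = std::deque<char>::const_iterator;
    DirectNumGet<It> facet;

    std::ios format(nullptr);                 // no stream buffer attached
    format.imbue(std::locale::classic());     // "C" locale punctuation

    std::ios_base::iostate err = std::ios_base::goodbit;
    double value = 0.0;
    facet.get(text.begin(), text.end(), format, err, value);
    if (err & std::ios_base::failbit) return false;
    dest = value;
    return true;
}
```

This instantiates the whole num_get machinery for the iterator type in the calling translation unit, which is the inline/templated code that the earlier messages wanted to avoid.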

Alternately, you could define your own streambuf class, like
std::istringstream but non-owning and discontiguous. The overhead would be
one virtual call per storage block, perhaps minus one. (If the statically
accessible pointers in the streambuf cover the text to be parsed, no virtual
call should be needed except the one handling locale indirection, which you
can avoid by statically calling a locale object if you know which one you
want.) This really seems like the way to go for a "rope" class.
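
Roughly, such a streambuf could look like the sketch below. TextBlock, BlockStreambuf and parse_double_from_blocks are hypothetical names; the get area is pointed straight at each storage block, so nothing is copied and underflow() is called once per block.

```cpp
#include <cstddef>
#include <istream>
#include <streambuf>

// One piece of a discontiguous text; the storage is not owned.
struct TextBlock {
    const char* data;
    std::size_t size;
};

class BlockStreambuf : public std::streambuf {
public:
    BlockStreambuf(const TextBlock* blocks, std::size_t count)
        : blocks_(blocks), count_(count) {}

protected:
    // Called when the current get area is exhausted: move on to the next
    // non-empty block, or report end of input.
    int_type underflow() override {
        while (gptr() == egptr()) {
            if (next_ == count_) return traits_type::eof();
            const TextBlock& b = blocks_[next_++];
            // setg() takes non-const pointers; this class never writes
            // through them.
            char* p = const_cast<char*>(b.data);
            setg(p, p, p + b.size);
        }
        return traits_type::to_int_type(*gptr());
    }

private:
    const TextBlock* blocks_;
    std::size_t count_;
    std::size_t next_ = 0;
};

// Usage: ordinary stream extraction on top of the block list.
// dest is written only on success.
inline bool parse_double_from_blocks(const TextBlock* blocks,
                                     std::size_t count, double& dest) {
    BlockStreambuf buf(blocks, count);
    std::istream in(&buf);
    double value = 0.0;
    if (!(in >> value)) return false;
    dest = value;
    return true;
}
```

Putback across a block boundary is not supported here; the default pbackfail() simply fails, which is fine for plain operator>> on arithmetic types.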

Anyway, I would think that discontiguous storage would be a disqualification
to using whatever convenient interface from the present proposal.

.

Author: Bengt Gustafsson <bengt.gustafsson@beamways.com>
Date: Wed, 12 Mar 2014 06:51:14 -0700 (PDT)

I think David is on the right track here. The problem is basically that
streambuf is such an obscure class (as is all of iostream, really).

The gist of it, however, is that you have an object controlling a buffer with
one virtual method to refill the buffer. This enables the number parsing
code to be placed in a cpp file, and only templated on the character type.

In the general case there could be a problem with excessive putback (for
complex parsing). I don't know of a scheme that can handle any length of
putback with a "forward iterator" style input without allocating memory
beyond the normal parsing buffer.

For simple number parsing a 1 character putback is enough, so it is not a
problem here. As long as you know the maximum amount of putback and it is less
than the buffer size, it can be handled by refilling the buffer earlier. It is
important, though, that these issues are taken care of in the buffer class, so
that subclasses overriding the reload function don't have to bother.
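
As a sketch of what I mean (all names invented, fixed to char for brevity where the real thing would be templated on the character type): the base class owns the buffer and guarantees a bounded putback by keeping the last kPutback consumed characters at the front on every refill, so a subclass only has to supply refill().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

class CharSource {
public:
    static constexpr int kEnd = -1;
    virtual ~CharSource() = default;

    // Returns the next character, or kEnd at end of input.
    int get() {
        if (pos_ == end_ && !fill()) return kEnd;
        return static_cast<unsigned char>(buf_[pos_++]);
    }
    // Un-consumes the most recently consumed character. May be called up
    // to kPutback times in a row, also across a refill.
    void unget() {
        assert(pos_ > 0);
        --pos_;
    }

protected:
    // The only virtual: copy up to cap characters to dst, return the
    // number copied, 0 at end of input.
    virtual std::size_t refill(char* dst, std::size_t cap) = 0;

private:
    static constexpr std::size_t kPutback = 2;

    bool fill() {
        // Keep the tail of the old buffer so that unget() still works.
        const std::size_t keep = end_ < kPutback ? end_ : kPutback;
        std::memmove(buf_, buf_ + end_ - keep, keep);
        const std::size_t n = refill(buf_ + keep, sizeof buf_ - keep);
        pos_ = keep;
        end_ = keep + n;
        return n != 0;
    }

    char buf_[64] = {};
    std::size_t pos_ = 0;
    std::size_t end_ = 0;
};

// Example subclass reading from a list of non-owned storage blocks.
class BlockSource : public CharSource {
public:
    struct Block { const char* data; std::size_t size; };
    BlockSource(const Block* blocks, std::size_t count)
        : blocks_(blocks), count_(count) {}

protected:
    std::size_t refill(char* dst, std::size_t cap) override {
        while (index_ < count_ && offset_ == blocks_[index_].size) {
            ++index_;
            offset_ = 0;
        }
        if (index_ == count_) return 0;
        const Block& b = blocks_[index_];
        const std::size_t n = std::min(cap, b.size - offset_);
        std::memcpy(dst, b.data + offset_, n);
        offset_ += n;
        return n;
    }

private:
    const Block* blocks_;
    std::size_t count_;
    std::size_t index_ = 0;
    std::size_t offset_ = 0;
};

// Non-template parser that could live in a .cpp file. Overflow is not
// checked in this sketch. dest is written only if a digit was read.
inline bool parse_unsigned(CharSource& src, unsigned long& dest) {
    unsigned long value = 0;
    bool any = false;
    for (;;) {
        const int c = src.get();
        if (c == CharSource::kEnd) break;
        if (c < '0' || c > '9') { src.unget(); break; }  // 1 char putback
        value = value * 10 + static_cast<unsigned long>(c - '0');
        any = true;
    }
    if (any) dest = value;
    return any;
}
```

Unlike a streambuf pointing directly at the storage, this variant copies each block into the parsing buffer, which is the price for hiding the putback handling from the subclass.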

On Wednesday, 12 February 2014 at 05:28:19 UTC+1, David Krauss wrote:
>
>
> On Feb 12, 2014, at 11:40 AM, Thiago Macieira <thi...@macieira.org> wrote:
>
> > I was talking about parsing a non-contiguous block of data and depending
> > on the iterator. Parsing character by character (whichever character type)
> > is the problem here.
>
>
> I've not been following the discussion but just saw the short message
> making a false assertion about codecvt.
>
> Now that it's been mentioned, codecvt isn't such a bad parsing model, when
> the input doesn't fit in memory and you want to make multiple calls.
> However that's not the situation for numeric conversion. I think what Miro
> intended to say is std::num_get, which is stateless.
>
> Looking at num_get now, it's templated over a generic input iterator type
> so you could define a discontiguous range that way. But, that only gets you
> the C locale. You cannot go from a std::locale object to a num_get compatible
> template or arbitrary specialization.
>
> Alternately, you could define your own streambuf class, like
> std::istringstream but non-owning and discontiguous. The overhead would
> be one virtual call per storage block, perhaps minus one. (If the
> statically accessible pointers in the streambuf cover the text to be
> parsed, no virtual call should be needed except the one handling locale
> indirection, which you can avoid by statically calling a locale object if
> you know which one you want.) This really seems like the way to go for a
> "rope" class.
>
> Anyway, I would think that discontiguous storage would be a
> disqualification to using whatever convenient interface from the present
> proposal.
>
>

.