Topic: Parsing Numbers
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Mon, 18 May 2015 11:34:07 -0700 (PDT)
Let's get the party started.
What have we got?
We've got functions like strtol and stoi which take a const char* or
std::string and return a number.
long strtol(const char*, char **str_end, int base);
int stoi(const std::string&, std::size_t* pos = 0, int base = 10);
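For comparison, here is a minimal sketch of what complete error checking looks like with the two existing functions. The wrapper names parse_long_strtol and parse_int_stoi are hypothetical, and rejecting trailing characters is an assumption made for the example, not something either function does on its own.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical wrapper: true only if the whole C string is a valid long.
// Note that strtol itself still skips leading whitespace.
bool parse_long_strtol(const char* s, long& out) {
    errno = 0;
    char* end = nullptr;
    const long v = std::strtol(s, &end, 10);
    if (end == s) return false;         // no characters were converted
    if (errno == ERANGE) return false;  // value does not fit in a long
    if (*end != '\0') return false;     // trailing characters after the number
    out = v;
    return true;
}

// Hypothetical wrapper: the same check with stoi, which reports errors by throwing.
bool parse_int_stoi(const std::string& s, int& out) {
    try {
        std::size_t pos = 0;
        const int v = std::stoi(s, &pos, 10);
        if (pos != s.size()) return false;  // trailing characters after the number
        out = v;
        return true;
    } catch (const std::invalid_argument&) {
        return false;  // no conversion could be performed
    } catch (const std::out_of_range&) {
        return false;  // value does not fit in an int
    }
}
```

Both wrappers need a null-terminated buffer or a std::string, and both accept leading whitespace, which are the points raised below.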
What do we want?
Input should not be required to be null terminated, so string_view seems
like a suitable input type.
Error detection should be simpler, but not everyone is a fan of exceptions.
And IMO skipping spaces should not be part of the parse function.
There's also the question of what to do when not the entire input can be
parsed. Return an error or not.
So, what about this one?
optional<T> parse(string_view, std::size_t* pos = 0, int base = 10);
An alternative could be:
error_code parse(T&, string_view, std::size_t* pos = 0, int base = 10);
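To make the first signature concrete, here is a rough sketch for integer types only. Everything beyond the signature is an assumption: std::optional and std::string_view stand in for the unqualified names above, digit_value is a hypothetical helper, only a leading '-' is accepted as a sign, *pos receives the index one past the last character consumed, and a null pos simply ignores trailing characters.

```cpp
#include <cstddef>
#include <limits>
#include <optional>
#include <string_view>
#include <type_traits>

// Hypothetical helper: digit value of c, or -1 if c is not an ASCII digit or letter.
inline int digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'z') return c - 'a' + 10;
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
    return -1;
}

template <class T>
std::optional<T> parse(std::string_view s, std::size_t* pos = nullptr, int base = 10) {
    static_assert(std::is_integral_v<T>, "this sketch only handles integer types");
    using U = std::make_unsigned_t<T>;
    if (base < 2 || base > 36) return std::nullopt;

    std::size_t i = 0;
    bool negative = false;
    if constexpr (std::is_signed_v<T>) {
        if (!s.empty() && s[0] == '-') { negative = true; i = 1; }
    }
    const std::size_t first_digit = i;

    // Largest magnitude that still fits: max for positive values, max + 1 for negative ones.
    const U limit = static_cast<U>(static_cast<U>(std::numeric_limits<T>::max()) + (negative ? 1 : 0));
    U acc = 0;
    for (; i < s.size(); ++i) {
        const int d = digit_value(s[i]);
        if (d < 0 || d >= base) break;  // first character that is not a digit in this base
        if (acc > (limit - static_cast<U>(d)) / static_cast<U>(base)) return std::nullopt;  // overflow
        acc = static_cast<U>(acc * static_cast<U>(base) + static_cast<U>(d));
    }
    if (i == first_digit) return std::nullopt;  // no digits; this also rejects leading spaces
    if (pos) *pos = i;                          // one past the last character consumed
    if (negative && acc != 0) return static_cast<T>(-static_cast<T>(acc - 1) - 1);
    return static_cast<T>(acc);
}
```

A caller would write something like parse<int>(text, &pos) and test the returned optional; the type has to be spelled out because nothing in the arguments names it.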
http://en.cppreference.com/w/cpp/string/byte/strtol
http://en.cppreference.com/w/cpp/string/byte/atoi
http://en.cppreference.com/w/cpp/string/basic_string/stol
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.
.
Author: "Vicente J. Botet Escriba" <vicente.botet@wanadoo.fr>
Date: Mon, 18 May 2015 21:59:37 +0200
On 18/05/15 20:34, Olaf van der Spek wrote:
> Let's get the party started.
>
> What have we got?
>
> We've got functions like strtol and stoi which take a const char* or
> std::string and return a number.
>
> long strtol(const char*, char **str_end, int base);
> int stoi(const std::string&, std::size_t* pos = 0, int base = 10);
>
> What do we want?
>
> Input should not be required to be null terminated, so string_view
> seems like a suitable input type.
> Error detection should be simpler, but not everyone is a fan of
> exceptions.
>
> And IMO skipping spaces should not be part of the parse function.
> There's also the question of what to do when not the entire input can
> be parsed. Return an error or not.
>
>
> So, what about this one?
>
> optional<T> parse(string_view, std::size_t* pos = 0, int base = 10);
Or
expected<T,error_code> parse(string_view, std::size_t* pos = 0, int base = 10);
>
> An alternative could be:
>
> error_code parse(T&, string_view, std::size_t* pos = 0, int base = 10);
>
>
No please.
Vicente
.
Author: "'Jeffrey Yasskin' via ISO C++ Standard - Future Proposals" <std-proposals@isocpp.org>
Date: Mon, 18 May 2015 13:11:17 -0700
Woot!
On Mon, May 18, 2015 at 11:34 AM, Olaf van der Spek
<olafvdspek@gmail.com> wrote:
> Let's get the party started.
>
> What have we got?
>
> We've got functions like strtol and stoi which take a const char* or
> std::string and return a number.
>
> long strtol(const char*, char **str_end, int base);
> int stoi(const std::string&, std::size_t* pos = 0, int base = 10);
>
> What do we want?
>
> Input should not be required to be null terminated, so string_view seems
> like a suitable input type.
> Error detection should be simpler, but not everyone is a fan of exceptions.
Also, errors in parsing aren't generally exceptional.
> And IMO skipping spaces should not be part of the parse function.
+1. We should have a way to skip a string_view past spaces, but I
agree that normal parsing should probably insist that the number
appear at the start of the string.
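A possible shape for such a helper, as a sketch only: the name skip_spaces and the set of characters treated as spaces (the ones isspace accepts in the "C" locale) are assumptions, not something the thread settles.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: returns s with any leading whitespace removed.
inline std::string_view skip_spaces(std::string_view s) {
    const std::size_t first = s.find_first_not_of(" \t\n\v\f\r");
    // npos means the view is empty or all whitespace, so drop everything.
    s.remove_prefix(first == std::string_view::npos ? s.size() : first);
    return s;
}
```

A caller who wants the old strtol behaviour would then write parse(skip_spaces(input), ...), while everyone else pays nothing for it.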
> There's also the question of what to do when not the entire input can be
> parsed. Return an error or not.
I believe "not", so that these functions can be used in parsing larger formats.
> So, what about this one?
>
> optional<T> parse(string_view, std::size_t* pos = 0, int base = 10);
>
> An alternative could be:
>
> error_code parse(T&, string_view, std::size_t* pos = 0, int base = 10);
I assume *pos gets the last position that was part of the parsed number?
In the paper that proposes this, it'd be good to see examples of
parsing code using each of the possible interfaces. That'll help
produce a more informed decision than just looking at the interfaces
abstractly.
Jeffrey
.
Author: "'Jeffrey Yasskin' via ISO C++ Standard - Future Proposals" <std-proposals@isocpp.org>
Date: Mon, 18 May 2015 13:12:00 -0700
On Mon, May 18, 2015 at 1:11 PM, Jeffrey Yasskin <jyasskin@google.com> wrote:
> Woot!
>
> On Mon, May 18, 2015 at 11:34 AM, Olaf van der Spek
> <olafvdspek@gmail.com> wrote:
>> Let's get the party started.
>>
>> What have we got?
>>
>> We've got functions like strtol and stoi which take a const char* or
>> std::string and return a number.
>>
>> long strtol(const char*, char **str_end, int base);
>> int stoi(const std::string&, std::size_t* pos = 0, int base = 10);
>>
>> What do we want?
>>
>> Input should not be required to be null terminated, so string_view seems
>> like a suitable input type.
>> Error detection should be simpler, but not everyone is a fan of exceptions.
>
> Also, errors in parsing aren't generally exceptional.
>
>> And IMO skipping spaces should not be part of the parse function.
>
> +1. We should have a way to skip a string_view past spaces, but I
> agree that normal parsing should probably insist that the number
> appear at the start of the string.
>
>> There's also the question of what to do when not the entire input can be
>> parsed. Return an error or not.
>
> I believe "not", so that these functions can be used in parsing larger formats.
>
>> So, what about this one?
>>
>> optional<T> parse(string_view, std::size_t* pos = 0, int base = 10);
>>
>> An alternative could be:
>>
>> error_code parse(T&, string_view, std::size_t* pos = 0, int base = 10);
>
> I assume *pos gets the last position that was part of the parsed number?
Er, the position one after that.
> In the paper that proposes this, it'd be good to see examples of
> parsing code using each of the possible interfaces. That'll help
> produce a more informed decision than just looking at the interfaces
> abstractly.
>
> Jeffrey
.
Author: Jens Maurer <Jens.Maurer@gmx.net>
Date: Mon, 18 May 2015 23:08:08 +0200
On 05/18/2015 08:34 PM, Olaf van der Spek wrote:
> So, what about this one?
>
> optional<T> parse(string_view, std::size_t* pos = 0, int base = 10);
>
> An alternative could be:
>
> error_code parse(T&, string_view, std::size_t* pos = 0, int base = 10);
I would appreciate if there would be a compile-time choice for base 2,
base 8, base 10, base 16, not (only) a facility with a runtime parameter.
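One way to read this request, sketched under several assumptions: the base becomes a template parameter, the function name parse_base is hypothetical, only unsigned is handled, and the whole input must be a number. Whether performance is the motivation is not stated; a constant base may simply let the compiler turn the multiplication into a shift for bases 2, 8 and 16.

```cpp
#include <limits>
#include <optional>
#include <string_view>

// Hypothetical sketch: the base is chosen at compile time.
template <unsigned Base>
std::optional<unsigned> parse_base(std::string_view s) {
    static_assert(Base == 2 || Base == 8 || Base == 10 || Base == 16,
                  "this sketch supports bases 2, 8, 10 and 16 only");
    if (s.empty()) return std::nullopt;
    unsigned acc = 0;
    for (const char c : s) {
        unsigned d = 0;
        if (c >= '0' && c <= '9') d = static_cast<unsigned>(c - '0');
        else if (c >= 'a' && c <= 'f') d = static_cast<unsigned>(c - 'a') + 10;
        else if (c >= 'A' && c <= 'F') d = static_cast<unsigned>(c - 'A') + 10;
        else return std::nullopt;           // not a digit in any supported base
        if (d >= Base) return std::nullopt;  // not a digit in this base
        if (acc > (std::numeric_limits<unsigned>::max() - d) / Base) return std::nullopt;  // overflow
        acc = acc * Base + d;  // Base is a compile-time constant here
    }
    return acc;
}
```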
Jens
.
Author: Magnus Fromreide <magfr@lysator.liu.se>
Date: Mon, 18 May 2015 23:16:35 +0200
On Mon, May 18, 2015 at 11:34:07AM -0700, Olaf van der Spek wrote:
> Let's get the party started.
>
> What have we got?
>
> We've got functions like strtol and stoi which take a const char* or
> std::string and return a number.
>
> long strtol(const char*, char **str_end, int base);
> int stoi(const std::string&, std::size_t* pos = 0, int base = 10);
>
> What do we want?
>
> Input should not be required to be null terminated, so string_view seems
> like a suitable input type.
I think iterators/ranges are a better input type. Why should we require that
the input is consecutive?
> Error detection should be simpler, but not everyone is a fan of exceptions.
>
> And IMO skipping spaces should not be part of the parse function.
Here I agree fully - skipping spaces only prevents the use of the parse
function in contexts where space skipping shouldn't happen.
> There's also the question of what to do when not the entire input can be
> parsed. Return an error or not.
Return the result and the iterator that refers to the input position after
the number?
>
> So, what about this one?
>
> optional<T> parse(string_view, std::size_t* pos = 0, int base = 10);
>
> An alternative could be:
>
> error_code parse(T&, string_view, std::size_t* pos = 0, int base = 10);
Here I agree with Vicente, expected is a good return type.
expected<pair<T, Iterator>, error_code>
parse(Iterator, Iterator, int base = 10);
And yes, the content of the expected shouldn't be a pair but rather a more
descriptive type.
/MF
>
> http://en.cppreference.com/w/cpp/string/byte/strtol
> http://en.cppreference.com/w/cpp/string/byte/atoi
> http://en.cppreference.com/w/cpp/string/basic_string/stol
>
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Mon, 18 May 2015 14:19:19 -0700 (PDT)
On Monday, May 18, 2015 at 2:34:07 PM UTC-4, Olaf van der Spek wrote:
>
> Let's get the party started.
>
> What have we got?
>
> We've got functions like strtol and stoi which take a const char* or
> std::string and return a number.
>
> long strtol(const char*, char **str_end, int base);
> int stoi(const std::string&, std::size_t* pos = 0, int base = 10);
>
> What do we want?
>
> Input should not be required to be null terminated, so string_view seems
> like a suitable input type.
> Error detection should be simpler, but not everyone is a fan of
> exceptions.
>
Exceptions must be optional. For high availability systems that should not
crash one often is forced to disable exceptions.
>
> And IMO skipping spaces should not be part of the parse function.
>
I agree with this very strongly. Let users write wrappers if they want
whitespace handling. By default, not processing whitespace is more
efficient. It's also easier and more natural to add parsing rules than
to awkwardly back out of the defaults.
> There's also the question of what to do when not the entire input can be
> parsed. Return an error or not.
>
For maximum efficiency the base implementation must allow a non-empty tail
string and also return it. If the tail string is handled by an out
parameter, we could have 2 overloads:
modern_return_thing<T> parse(string_view&& tail, string_view s); // Parses a T from s. Sets tail to the unparsed remainder of s.
modern_return_thing<T> parse(string_view s); // Parses a T from s. Error if there are extra characters after the parsed text.
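A minimal sketch of the two overloads, assuming std::optional<unsigned> in place of modern_return_thing<T> and the hypothetical name parse_uint. The tail is taken as an lvalue reference here, which is a deliberate change: as written, string_view&& would not bind to a named variable.

```cpp
#include <cstddef>
#include <limits>
#include <optional>
#include <string_view>

// Base version: parses a leading decimal number and reports the unparsed remainder in tail.
// tail is left untouched when parsing fails.
std::optional<unsigned> parse_uint(std::string_view& tail, std::string_view s) {
    unsigned acc = 0;
    std::size_t i = 0;
    for (; i < s.size() && s[i] >= '0' && s[i] <= '9'; ++i) {
        const unsigned d = static_cast<unsigned>(s[i] - '0');
        if (acc > (std::numeric_limits<unsigned>::max() - d) / 10) return std::nullopt;  // overflow
        acc = acc * 10 + d;
    }
    if (i == 0) return std::nullopt;  // no digits at the start of s
    tail = s.substr(i);
    return acc;
}

// Strict version built on the base one: any leftover character is an error.
std::optional<unsigned> parse_uint(std::string_view s) {
    std::string_view tail;
    const auto value = parse_uint(tail, s);
    if (!value || !tail.empty()) return std::nullopt;
    return value;
}
```

Since s is passed by value, calling parse_uint(in, in) to advance a view in place works in this sketch.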
We may also want to think generically. Can clients easily implement
efficient std::parse<T> routines for their own user defined types?
.
Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Mon, 18 May 2015 17:40:13 -0400
On 2015-05-18 14:34, Olaf van der Spek wrote:
> Let's get the party started.
That ship has sailed long ago :-). Please make sure you are up to speed
on the previous discussion on this topic.
> So, what about this one?
>
> optional<T> parse(string_view, std::size_t* pos = 0, int base = 10);
>
> An alternative could be:
>
> error_code parse(T&, string_view, std::size_t* pos = 0, int base = 10);
Neither. As Vicente pointed out, use std::expected. The original
discussion on this topic was the source of std::expected in the first
place; it would be rather disingenuous to not use it.
(Also... please spell invalid pointers as "nullptr" :-).)
> There's also the question of what to do when not the entire input can be
> parsed. Return an error or not.
I'll lean toward "not", but the user needs this information, either to
add that assertion themselves, or because after a number is consumed
they need to know where to resume parsing. So at minimum we need to know
how much was parsed.
You didn't specify, but I am guessing that maybe that's what you meant
by 'pos'? I would strongly consider however returning instead a new
string_view with the text that was not consumed. (If needed/useful, the
size_t* flavor can be a convenience overload, or the other way around.
It should be possible to implement one in terms of the other with
minimal overhead.)
Actually, I agree with the other Matthew Fioravante's suggestion of
mutating the input string_view / iterators. (Maybe we should just
support this like 'parse(in, &in)' and making sure that is efficient.)
--
Matthew
.
Author: Nicol Bolas <jmckesson@gmail.com>
Date: Mon, 18 May 2015 14:57:03 -0700 (PDT)
On Monday, May 18, 2015 at 5:40:24 PM UTC-4, Matthew Woehlke wrote:
>
> On 2015-05-18 14:34, Olaf van der Spek wrote:
> > Let's get the party started.
>
> That ship has sailed long ago :-). Please make sure you are up to speed
> on the previous discussion on this topic.
>
> > So, what about this one?
> >
> > optional<T> parse(string_view, std::size_t* pos = 0, int base = 10);
> >
> > An alternative could be:
> >
> > error_code parse(T&, string_view, std::size_t* pos = 0, int base = 10);
>
> Neither. As Vicente pointed out, use std::expected. The original
> discussion on this topic was the source of std::expected in the first
> place; it would be rather disingenuous to not use it.
>
> (Also... please spell invalid pointers as "nullptr" :-).)
>
Before making this proposal dependent on `expected`, we should find out
where the committee is with that. I don't much care for `expected`, but if
it will get the error codes people off our API backs, I'll take it.
I just want to see the "sto*" functions get string_view ASAP. We shouldn't
wait on `expected` just to accomplish that.
.
Author: "'Jeffrey Yasskin' via ISO C++ Standard - Future Proposals" <std-proposals@isocpp.org>
Date: Mon, 18 May 2015 15:09:17 -0700
On Mon, May 18, 2015 at 2:40 PM, Matthew Woehlke
<mw_triad@users.sourceforge.net> wrote:
> Actually, I agree with the other Matthew Fioravante's suggestion of
> mutating the input string_view / iterators. (Maybe we should just
> support this like 'parse(in, &in)' and making sure that is efficient.)
This one gathered an objection at
https://groups.google.com/a/isocpp.org/d/msg/std-proposals/Hs1s2329FCo/dl9N2GnXfxQJ,
that the potential aliasing between the const char* and the
string_view itself can cause problems. I suspect any problems can be
fixed with a careful implementation, but I'm not certain, so it'd be a
good thing for the proposal to show tests of.
Jeffrey
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Tue, 19 May 2015 00:11:44 +0200
2015-05-19 0:09 GMT+02:00 'Jeffrey Yasskin' via ISO C++ Standard -
Future Proposals <std-proposals@isocpp.org>:
> On Mon, May 18, 2015 at 2:40 PM, Matthew Woehlke
> <mw_triad@users.sourceforge.net> wrote:
>> Actually, I agree with the other Matthew Fioravante's suggestion of
>> mutating the input string_view / iterators. (Maybe we should just
>> support this like 'parse(in, &in)' and making sure that is efficient.)
>
> This one gathered an objection at
> https://groups.google.com/a/isocpp.org/d/msg/std-proposals/Hs1s2329FCo/dl9N2GnXfxQJ,
> that the potential aliasing between the const char* and the
> string_view itself can cause problems. I suspect any problems can be
> fixed with a careful implementation, but I'm not certain, so it'd be a
> good thing for the proposal to show tests of.
I don't get that one, isn't aliasing only an issue in the presence of writes?
--
Olaf
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Tue, 19 May 2015 00:16:26 +0200
2015-05-18 22:11 GMT+02:00 'Jeffrey Yasskin' via ISO C++ Standard -
Future Proposals <std-proposals@isocpp.org>:
>> There's also the question of what to do when not the entire input can be
>> parsed. Return an error or not.
>
> I believe "not", so that these functions can be used in parsing larger formats.
Right, perhaps both should be supported: require the entire input to be
parsed if !pos.
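Sketched for unsigned decimal only, with the hypothetical name parse_uint and std::optional as the return type, that rule could look like this:

```cpp
#include <cstddef>
#include <limits>
#include <optional>
#include <string_view>

// With pos: parse a leading number and report where parsing stopped.
// Without pos: the entire input must be a number.
std::optional<unsigned> parse_uint(std::string_view s, std::size_t* pos = nullptr) {
    unsigned acc = 0;
    std::size_t i = 0;
    for (; i < s.size() && s[i] >= '0' && s[i] <= '9'; ++i) {
        const unsigned d = static_cast<unsigned>(s[i] - '0');
        if (acc > (std::numeric_limits<unsigned>::max() - d) / 10) return std::nullopt;  // overflow
        acc = acc * 10 + d;
    }
    if (i == 0) return std::nullopt;  // no digits at the start of s
    if (pos) {
        *pos = i;                     // one past the last digit consumed
    } else if (i != s.size()) {
        return std::nullopt;          // trailing characters and nowhere to report them
    }
    return acc;
}
```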
>> So, what about this one?
>>
>> optional<T> parse(string_view, std::size_t* pos = 0, int base = 10);
>>
>> An alternative could be:
>>
>> error_code parse(T&, string_view, std::size_t* pos = 0, int base = 10);
>
> I assume *pos gets the last position that was part of the parsed number?
Yes, I copied it from http://en.cppreference.com/w/cpp/string/basic_string/stol
> In the paper that proposes this, it'd be good to see examples of
> parsing code using each of the possible interfaces. That'll help
> produce a more informed decision than just looking at the interfaces
> abstractly.
The disadvantage of not having a variant with the number as an out
parameter is the requirement of specifying the type.
I've got a lot of cases where the number is stored into a pre-existing variable.
Then again, perhaps this is just a basic building block on which other
variants can be built.
--
Olaf
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Tue, 19 May 2015 00:17:08 +0200
2015-05-18 23:08 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
> On 05/18/2015 08:34 PM, Olaf van der Spek wrote:
>> So, what about this one?
>>
>> optional<T> parse(string_view, std::size_t* pos = 0, int base = 10);
>>
>> An alternative could be:
>>
>> error_code parse(T&, string_view, std::size_t* pos = 0, int base = 10);
>
> I would appreciate if there would be a compile-time choice for base 2,
> base 8, base 10, base 16, not (only) a facility with a runtime parameter.
Why? Performance?
Shouldn't the compiler take care of that?
--
Olaf
.
Author: "'Jeffrey Yasskin' via ISO C++ Standard - Future Proposals" <std-proposals@isocpp.org>
Date: Mon, 18 May 2015 15:28:48 -0700
Raw View
On Mon, May 18, 2015 at 3:11 PM, Olaf van der Spek <olafvdspek@gmail.com> wrote:
> 2015-05-19 0:09 GMT+02:00 'Jeffrey Yasskin' via ISO C++ Standard -
> Future Proposals <std-proposals@isocpp.org>:
>> On Mon, May 18, 2015 at 2:40 PM, Matthew Woehlke
>> <mw_triad@users.sourceforge.net> wrote:
>>> Actually, I agree with the other Matthew Fioravante's suggestion of
>>> mutating the input string_view / iterators. (Maybe we should just
>>> support this like 'parse(in, &in)' and making sure that is efficient.)
>>
>> This one gathered an objection at
>> https://groups.google.com/a/isocpp.org/d/msg/std-proposals/Hs1s2329FCo/dl9N2GnXfxQJ,
>> that the potential aliasing between the const char* and the
>> string_view itself can cause problems. I suspect any problems can be
>> fixed with a careful implementation, but I'm not certain, so it'd be a
>> good thing for the proposal to show tests of.
>
> I don't get that one, isn't aliasing only an issue in the presence of writes?
I could imagine an implementation like:
double parse(string_view& s) {
  for (; !s.empty(); s.remove_prefix(1)) {
    whatever(s.front());
  }
  // ... return the parsed value
}
in which the compiler has to assume the s.front() is modified by the
s.remove_prefix(1) call. On the other hand, it's easy enough for the
implementation to act more like:
double parse(string_view& s) {
  auto b = s.begin(), e = s.end();
  for (; b != e; ++b) {
    whatever(*b);
  }
  s = string_view(b, e);
  // ... return the parsed value
}
which does seem to avoid any aliasing concerns. That's why it'd be
good for the proposal to include some tests.
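One possible shape for such a test, as a sketch only: the function name consume_int is hypothetical and the digit parsing is delegated to std::from_chars (C++17) rather than a hand-written loop. It works on local pointers and writes to the caller's view exactly once, as in the second variant above.

```cpp
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical helper: parses a decimal int at the front of s and, on
// success, removes the consumed characters from s.
inline bool consume_int(std::string_view& s, int& out) {
    // Copy the bounds into locals so the parsing loop never reads through s.
    const char* const first = s.data();
    const char* const last = first + s.size();
    int value = 0;
    const auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{}) return false;  // s and out are left unchanged
    out = value;
    // Single write to the caller's view, after all reads are done.
    s.remove_prefix(static_cast<std::size_t>(ptr - first));
    return true;
}
```

Parsing "123,456" this way yields 123 and leaves ",456" in the view, so the caller can skip the separator and continue.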
On Mon, May 18, 2015 at 3:16 PM, Olaf van der Spek <olafvdspek@gmail.com> wrote:
> 2015-05-18 22:11 GMT+02:00 'Jeffrey Yasskin' via ISO C++ Standard -
>> In the paper that proposes this, it'd be good to see examples of
>> parsing code using each of the possible interfaces. That'll help
>> produce a more informed decision than just looking at the interfaces
>> abstractly.
>
> The disadvantage of not having a variant with the number as an out
> parameter is the requirement of specifying the type.
> I've got a lot of cases where the number is stored into a pre-existing variable.
> Then again, perhaps this is just a basic building block on which other
> variants can be built.
Sure, and for primitive types it doesn't really matter, but for
symmetry when people write their own parsing functions, it'd be nice
to let folks parse non-default-constructible types. Code examples will
help make either case.
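As an illustration of the non-default-constructible case (a sketch with made-up names, Port and parse_port, using std::from_chars from C++17): an out-parameter interface would require the caller to construct a Port before parsing, whereas the returning form does not.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical user type with no default constructor.
class Port {
public:
    explicit Port(std::uint16_t n) : number_(n) {}
    std::uint16_t number() const { return number_; }
private:
    std::uint16_t number_;
};

// Returning std::optional<Port> works even though Port cannot be
// default-constructed; `bool parse(Port&, string_view)` would force the
// caller to invent a dummy Port first.
inline std::optional<Port> parse_port(std::string_view s) {
    std::uint16_t n = 0;
    const char* const last = s.data() + s.size();
    const auto [ptr, ec] = std::from_chars(s.data(), last, n);
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    return Port{n};
}
```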
On Mon, May 18, 2015 at 3:17 PM, Olaf van der Spek <olafvdspek@gmail.com> wrote:
> 2015-05-18 23:08 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>> I would appreciate if there would be a compile-time choice for base 2,
>> base 8, base 10, base 16, not (only) a facility with a runtime parameter.
>
> Why? Performance?
> Shouldn't the compiler take care of that?
If the implementer writes special cases for a few bases, and arranges
their function boundaries suitably, the compiler can inline just the
top level in order to delete the inactive options. This is something
to test though.
Jeffrey
.
Author: "'Jeffrey Yasskin' via ISO C++ Standard - Future Proposals" <std-proposals@isocpp.org>
Date: Mon, 18 May 2015 15:30:16 -0700
Raw View
On Mon, May 18, 2015 at 2:57 PM, Nicol Bolas <jmckesson@gmail.com> wrote:
> I just want to see the "sto*" functions get string_view ASAP. We shouldn't
> wait on `expected` just to accomplish that.
That seems like a straightforward paper to get through the committee.
If you write it, I'll try to prevent the group from creeping its
scope.
Jeffrey
.
Author: "Vicente J. Botet Escriba" <vicente.botet@wanadoo.fr>
Date: Tue, 19 May 2015 01:55:49 +0200
Raw View
On 18/05/15 23:57, Nicol Bolas wrote:
>
>
> On Monday, May 18, 2015 at 5:40:24 PM UTC-4, Matthew Woehlke wrote:
>
> On 2015-05-18 14:34, Olaf van der Spek wrote:
> > Let's get the party started.
>
> That ship has sailed long ago :-). Please make sure you are up to speed
> on the previous discussion on this topic.
>
> > So, what about this one?
> >
> > optional<T> parse(string_view, std::size_t* pos = 0, int base = 10);
> >
> > An alternative could be:
> >
> > error_code parse(T&, string_view, std::size_t* pos = 0, int base = 10);
>
> Neither. As Vicente pointed out, use std::expected. The original
> discussion on this topic was the source of std::expected in the first
> place; it would be rather disingenuous to not use it.
>
> (Also... please spell invalid pointers as "nullptr" :-).)
>
>
> Before making this proposal dependent on `expected`, we should find
> out where the committee is with that. I don't much care for
> `expected`, but if it will get the error codes people off our API
> backs, I'll take it.
>
It was designed just for that :) but it seems this is not enough. Having
two (or more) alternative ways to report errors is not something
the standard should support, as it would be confusing for users. In
other words, it is what some call "not the C++ way".
> I just want to see the "sto*" functions get string_view ASAP. We
> shouldn't wait on `expected` just to accomplish that.
>
You are right, expected should not block any proposal. expected is
not in the standard, so we cannot use it.
We must continue making and adopting new non-controversial proposals with a
well-defined scope, using the available abstractions. Maybe the OP could
instead use the FileSystem TS approach, as today we don't have anything
else.
Again, my apologies for going outside the scope of the OP's proposal.
Vicente
.
Author: Nicol Bolas <jmckesson@gmail.com>
Date: Mon, 18 May 2015 18:01:29 -0700 (PDT)
Raw View
On Monday, May 18, 2015 at 6:30:38 PM UTC-4, Jeffrey Yasskin wrote:
>
> On Mon, May 18, 2015 at 2:57 PM, Nicol Bolas <jmck...@gmail.com> wrote:
> > I just want to see the "sto*" functions get string_view ASAP. We shouldn't
> > wait on `expected` just to accomplish that.
>
> That seems like a straightforward paper to get through the committee.
> If you write it, I'll try to prevent the group from creeping its
> scope.
>
Actually, that paper already exists (N4015
<http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2014/n4015.pdf>), with
a revision (N4109
<http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2014/n4109.pdf>).
According to Vicente Escriba, who both wrote those proposals and replied
here, scope creep has already started. Also, N4109 was published a while ago
(almost a year now), with no follow-up paper from any discussions based on
it.
I think at this point, we should focus on getting the core feature: parsing
strings via string_view. And though many malign the FileSystem TS solution,
it *is* prior art on dealing with error codes in standard library C++.
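For readers unfamiliar with that convention, here is a rough sketch of how it might look for number parsing. The function name parse_int is hypothetical and the implementation uses std::from_chars (C++17). The pattern borrowed from the FileSystem TS is the pair of overloads: one takes a std::error_code& and never throws, the other throws std::system_error.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>

// Non-throwing overload (hypothetical name): reports failure through ec.
inline int parse_int(std::string_view s, std::error_code& ec) noexcept {
    int value = 0;
    const char* const last = s.data() + s.size();
    const auto [ptr, err] = std::from_chars(s.data(), last, value);
    if (err != std::errc{}) {
        ec = std::make_error_code(err);  // invalid_argument or result_out_of_range
        return 0;
    }
    if (ptr != last) {
        // Trailing characters: treated as an error in this sketch.
        ec = std::make_error_code(std::errc::invalid_argument);
        return 0;
    }
    ec.clear();
    return value;
}

// Throwing overload: same behavior, but failures become exceptions.
inline int parse_int(std::string_view s) {
    std::error_code ec;
    const int value = parse_int(s, ec);
    if (ec) throw std::system_error(ec, "parse_int");
    return value;
}
```

Callers who dislike exceptions use the two-argument form; everyone else gets the shorter call.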
.
Author: "'Jeffrey Yasskin' via ISO C++ Standard - Future Proposals" <std-proposals@isocpp.org>
Date: Mon, 18 May 2015 18:17:07 -0700
Raw View
On Mon, May 18, 2015 at 6:01 PM, Nicol Bolas <jmckesson@gmail.com> wrote:
> On Monday, May 18, 2015 at 6:30:38 PM UTC-4, Jeffrey Yasskin wrote:
>>
>> On Mon, May 18, 2015 at 2:57 PM, Nicol Bolas <jmck...@gmail.com> wrote:
>> > I just want to see the "sto*" functions get string_view ASAP. We
>> > shouldn't
>> > wait on `expected` just to accomplish that.
>>
>> That seems like a straightforward paper to get through the committee.
>> If you write it, I'll try to prevent the group from creeping its
>> scope.
>
>
> Actually, that paper already exists (N4015), with a revision (N4109).
> According to Vicente Escriba, who both wrote those proposals and replied
> here, scope creep has already started. Also, N4109 was published a while ago
> (almost a year now), with no follow-up paper from any discussions based on
> it.
>
> I think at this point, we should focus on getting the core feature: parsing
> strings via string_view. And though many malign the FileSystem TS solution,
> it is prior art on dealing with error codes in standard library C++.
You said you want the "sto*" functions to get string_view. You should
write the paper proposing that the "sto*" functions get string_view.
You should not write the expected<> paper.
.
Author: Thiago Macieira <thiago@macieira.org>
Date: Mon, 18 May 2015 23:49:28 -0700
Raw View
On Monday 18 May 2015 23:16:35 Magnus Fromreide wrote:
> > long strtol(const char*, char **str_end, int base);
> > int stoi(const std::string&, std::size_t* pos = 0, int base = 10);
> >
> > What do we want?
> >
> > Input should not be required to be null terminated, so string_view seems
> > like a suitable input type.
>
> I think iterators/ranges are a better input type. Why should we require that
> the input is consecutive?
Because at least one instantiation of those functions is not inline and will
require contiguous memory. Also note the requirement for char, not char16_t,
char32_t, MyChar, etc. I'd even say that wchar_t need not be included.
All the unsigned instantiations can be implemented inline by calling an out-
of-line uintmax_t instantiation and the same goes for the signed versions and
intmax_t. Or, depending on the behaviour, all instantiations call a backend
that takes (u)intmax_t min and max of the type in question.
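A sketch of that last arrangement for the unsigned, base-10 case. The names are hypothetical; in a real library the backend would be compiled out of line in a source file, and it is marked inline here only so the example fits in one translation unit.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <string_view>
#include <type_traits>

namespace detail {
// Single backend shared by all unsigned types. It works in uintmax_t and is
// told the largest value the destination type can hold.
inline bool parse_unsigned_backend(std::string_view s, std::uintmax_t max,
                                   std::uintmax_t& out) {
    if (s.empty()) return false;
    std::uintmax_t value = 0;
    for (const char c : s) {
        if (c < '0' || c > '9') return false;
        const std::uintmax_t digit = static_cast<std::uintmax_t>(c - '0');
        // Reject if value * 10 + digit would exceed max (checked without overflow).
        if (digit > max || value > (max - digit) / 10) return false;
        value = value * 10 + digit;
    }
    out = value;
    return true;
}
}  // namespace detail

// Thin inline front end: one instantiation per type, almost no code each.
template <class T>
std::optional<T> parse_unsigned(std::string_view s) {
    static_assert(std::is_unsigned_v<T>, "parse_unsigned requires an unsigned type");
    std::uintmax_t wide = 0;
    if (!detail::parse_unsigned_backend(s, std::numeric_limits<T>::max(), wide))
        return std::nullopt;
    return static_cast<T>(wide);  // safe: the backend enforced wide <= max
}
```

The signed versions would follow the same pattern with intmax_t and both a minimum and a maximum bound.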
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358
.
Author: Jens Maurer <Jens.Maurer@gmx.net>
Date: Tue, 19 May 2015 08:53:35 +0200
Raw View
On 05/19/2015 12:17 AM, Olaf van der Spek wrote:
> 2015-05-18 23:08 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>> On 05/18/2015 08:34 PM, Olaf van der Spek wrote:
>>> So, what about this one?
>>>
>>> optional<T> parse(string_view, std::size_t* pos = 0, int base = 10);
>>>
>>> An alternative could be:
>>>
>>> error_code parse(T&, string_view, std::size_t* pos = 0, int base = 10);
>>
>> I would appreciate if there would be a compile-time choice for base 2,
>> base 8, base 10, base 16, not (only) a facility with a runtime parameter.
>
> Why? Performance?
Yes. In most use cases, you know at compile-time which base you
expect. I believe an appropriate interface should allow me to
convey that compile-time information to the callee.
Even if you don't know at compile-time, it's likely that you're
making a decision before the call (e.g. when parsing C-style
number prefixes such as 0x, 0b, 0). There is no need for the
callee to decide again.
> Shouldn't the compiler take care of that?
These parser functions are not templates, and at least for
the base-10 case, the required code is large, so I anticipate
these functions to be out-of-line.
http://www.cesura17.net/~will/Professional/Research/Papers/howtoread.pdf
From a modularity standpoint, the compile-time-base functions
can be invoked from the generic one with no unanticipated overhead,
but not vice versa.
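A sketch of that layering, with hypothetical names and deliberately simplified behavior: overflow checking is omitted, only bases 2, 8, 10 and 16 are dispatched, and the letter handling assumes an ASCII-like character set. The base is a template parameter on the primitive, and the runtime-base function is a thin dispatcher on top of it.

```cpp
#include <optional>
#include <string_view>

// Compile-time-base primitive (hypothetical): Base is known to the compiler.
template <int Base>
std::optional<unsigned> parse_digits(std::string_view s) {
    static_assert(Base >= 2 && Base <= 36, "unsupported base");
    if (s.empty()) return std::nullopt;
    unsigned value = 0;
    for (const char c : s) {
        int digit = -1;
        if (c >= '0' && c <= '9') digit = c - '0';
        else if (c >= 'a' && c <= 'z') digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'Z') digit = c - 'A' + 10;
        if (digit < 0 || digit >= Base) return std::nullopt;
        // Overflow checking is omitted to keep the sketch short.
        value = value * Base + static_cast<unsigned>(digit);
    }
    return value;
}

// Generic runtime-base entry point: decides once, then calls the primitive.
inline std::optional<unsigned> parse_digits(std::string_view s, int base) {
    switch (base) {
        case 2:  return parse_digits<2>(s);
        case 8:  return parse_digits<8>(s);
        case 10: return parse_digits<10>(s);
        case 16: return parse_digits<16>(s);
        default: return std::nullopt;  // other bases are not handled in this sketch
    }
}
```

A caller that has already inspected a 0x prefix can call parse_digits<16> directly and skip the second decision.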
Jens
.
Author: Thiago Macieira <thiago@macieira.org>
Date: Mon, 18 May 2015 23:55:36 -0700
Raw View
On Monday 18 May 2015 14:19:19 Matthew Fioravante wrote:
> We may also want to think generically. Can clients easily implement
> efficient std::parse<T> routines for their own user defined types?
I don't think so.
Integer parsing may be acceptable, but it's borderline already -- the FreeBSD
implementations of strtoull and strtoll are around 60 lines of C code, the
glibc implementation is 300 lines and supports parsing of locale-specified
number grouping. But floating point parsing can't reasonably be done inline.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Tue, 19 May 2015 12:34:22 +0200
Raw View
2015-05-19 8:53 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>> Why? Performance?
>
> Yes. In most use cases, you know at compile-time which base you
> expect. I believe an appropriate interface should allow me to
> convey that compile-time information to the callee.
IMO the optimizer should take care of that.
> Even if you don't know at compile-time, it's likely that you're
> making a decision before the call (e.g. when parsing C-style
> number prefixes such as 0x, 0b, 0). There is no need for the
> callee to decide again.
Parsing the prefix is currently part of the number parser.
>> Shouldn't the compiler take care of that?
>
> These parser functions are not templates, and at least for
> the base-10 case, the required code is large, so I anticipate
> these functions to be out-of-line.
If the base is known at compile-time it shouldn't matter (to a good
compiler) whether it's a template or not.
> http://www.cesura17.net/~will/Professional/Research/Papers/howtoread.pdf
ETIMEOUT
> From a modularity standpoint, the compile-time-base functions
> can be invoked from the generic one with no unanticipated overhead,
> but not vice versa.
True
--
Olaf
.
Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Tue, 19 May 2015 10:42:23 -0400
Raw View
On 2015-05-18 18:09, 'Jeffrey Yasskin' via ISO C++ Standard - Future
Proposals wrote:
> On Mon, May 18, 2015 at 2:40 PM, Matthew Woehlke wrote:
>> Actually, I agree with the other Matthew Fioravante's suggestion of
>> mutating the input string_view / iterators. (Maybe we should just
>> support this like 'parse(in, &in)' and making sure that is efficient.)
>
> This one gathered an objection at
> https://groups.google.com/a/isocpp.org/d/msg/std-proposals/Hs1s2329FCo/dl9N2GnXfxQJ,
> that the potential aliasing between the const char* and the
> string_view itself can cause problems.
That sounds like a QOI issue. It also sounds like something that vendors
should be able to fix / work around.
> I suspect any problems can be fixed with a careful implementation,
Yes. Don't modern compilers have a way to tell the compiler to assume
that two entities do not alias?
--
Matthew
.
Author: Thiago Macieira <thiago@macieira.org>
Date: Tue, 19 May 2015 08:29:15 -0700
Raw View
On Tuesday 19 May 2015 10:42:23 Matthew Woehlke wrote:
> > I suspect any problems can be fixed with a careful implementation,
>
> Yes. Don't modern compilers have a way to tell the compiler to assume
> that two entities do not alias?
C99 restricted pointers may help here.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358
.
Author: "'Jeffrey Yasskin' via ISO C++ Standard - Future Proposals" <std-proposals@isocpp.org>
Date: Tue, 19 May 2015 08:37:36 -0700
Raw View
On Tue, May 19, 2015 at 3:34 AM, Olaf van der Spek <olafvdspek@gmail.com> wrote:
> 2015-05-19 8:53 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>>> Why? Performance?
>>
>> Yes. In most use cases, you know at compile-time which base you
>> expect. I believe an appropriate interface should allow me to
>> convey that compile-time information to the callee.
>
> IMO the optimizer should take care of that.
Your opinion (and Jens's opinion) don't actually matter here. Whoever
writes the paper should investigate what optimizers actually do and
describe it in the paper.
Jeffrey
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Tue, 19 May 2015 09:33:32 -0700 (PDT)
Raw View
On Tuesday, May 19, 2015 at 11:37:58 AM UTC-4, Jeffrey Yasskin wrote:
>
> On Tue, May 19, 2015 at 3:34 AM, Olaf van der Spek <olafv...@gmail.com> wrote:
> > 2015-05-19 8:53 GMT+02:00 Jens Maurer <Jens....@gmx.net>:
> >>> Why? Performance?
> >>
> >> Yes. In most use cases, you know at compile-time which base you
> >> expect. I believe an appropriate interface should allow me to
> >> convey that compile-time information to the callee.
>
If the base were a template parameter, we would also need to provide a
second version that accepts the base as a runtime parameter for those use
cases where we don't know the base at compile time. Having one version of parse
with template parameters and another with ordinary runtime parameters seems
like it would be bloated and confusing.
> >
> > IMO the optimizer should take care of that.
>
> Your opinion (and Jens's opinion) don't actually matter here. Whoever
> writes the paper should investigate what optimizers actually do and
> describe it in the paper.
>
>
Performance of these routines is paramount. I've had big data processing
applications that ended up spending a large portion of their runtime inside
strtod().
Almost every time, we know the base at compile time, so the parse routine
can just be an inline wrapper which calls the specific optimized versions
if they exist.
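namespace detail {
ret<int> optimized_parse_int_base10(string_view s); // defined out of line
ret<int> optimized_parse_int_base16(string_view s); // defined out of line
ret<int> generic_parse_int(string_view s, int base); // defined out of line
} // namespace detail
// The default argument goes on the primary template; an explicit
// specialization may not declare one.
template <class T> ret<T> parse(string_view s, int base = 10);
template <>
inline ret<int> parse<int>(string_view s, int base) {
  if (base == 10) {
    return detail::optimized_parse_int_base10(s);
  }
  if (base == 0x10) {
    return detail::optimized_parse_int_base16(s);
  }
  // The generic routine needs the base to know how to interpret digits.
  return detail::generic_parse_int(s, base);
}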
Removing the check for base==N and calling the correct underlying function
directly is very easy for any modern optimizer. My guess is that 99% of
use cases will specify the base as a compile-time constant, so this
optimization is probably a good bet for performance in a generic interface.
Cases where this approach *might* actually be a performance loser:
- If the client is parsing numbers of many different bases, it might be
faster to just always use the generic routine even if it does a bit more
computation than the optimized methods. The reason being that the one
generic routine will stay hot in the Icache while many different optimized
routines can occupy different cache lines, all of which will be less hot
and therefore may be swapped back out to main memory more often. I've
worked with several projects where code size differences like this have a
measurable impact on performance.
- If the client is passing a base whose value is truly determined at
runtime (the compiler can't prove anything about it). In this scenario, it's
possible that the overhead of checking for the optimized values of base
could be more expensive than the gains from the optimized routines
themselves. Even if it still makes sense to check for and call optimized
routines, it may be better to do the dispatch out of line to avoid code
bloat at all of the call sites (maybe the inliner can figure this out?).
It's easy enough to measure and figure out these performance concerns for a
single project, but I'm not sure how you'd do it for a generic interface
intended to be used by the whole world.
If the overhead of dispatching turns out to be a real concern, then two
functions can be introduced.
template <class T> ret<T> parse_generic(string_view s, int base); // parses a T from s
// Same as parse_generic(), but may do inline dispatch to optimized
// routines for specific values of base.
template <class T> ret<T> parse(string_view s, int base);
But of course now we have a rather expensive feature creep for an arguably
dubious performance concern. If we had constexpr overloading or some method
of constexpr parameter detection, the implementation could choose to do the
inline dispatch only when base is constexpr (if that makes sense to do).
Aren't all of these points QOI issues anyway?
------=_Part_399_1079225848.1432053212279
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
<div dir=3D"ltr"><br><br>On Tuesday, May 19, 2015 at 11:37:58 AM UTC-4, Jef=
frey Yasskin wrote:<blockquote class=3D"gmail_quote" style=3D"margin: 0;mar=
gin-left: 0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">On Tue, May=
19, 2015 at 3:34 AM, Olaf van der Spek <<a href=3D"javascript:" target=
=3D"_blank" gdf-obfuscated-mailto=3D"eegFivTRYpYJ" rel=3D"nofollow" onmouse=
down=3D"this.href=3D'javascript:';return true;" onclick=3D"this.href=3D'jav=
ascript:';return true;">olafv...@gmail.com</a>> wrote:
<br>> 2015-05-19 8:53 GMT+02:00 Jens Maurer <<a href=3D"javascript:" =
target=3D"_blank" gdf-obfuscated-mailto=3D"eegFivTRYpYJ" rel=3D"nofollow" o=
nmousedown=3D"this.href=3D'javascript:';return true;" onclick=3D"this.href=
=3D'javascript:';return true;">Jens....@gmx.net</a>>:
<br>>>> Why? Performance?
<br>>>
<br>>> Yes. In most use cases, you know at compile-time which b=
ase you
<br>>> expect. I believe an appropriate interface should allow =
me to
<br>>> convey that compile-time information to the callee.
<br></blockquote><div><br>If the base was a template parameter, then we wou=
ld need to provide a second version which accepts base as a runtime paramet=
er for those use cases when we don't know base at compile time. Having one =
version of parse with template parameters and another with normal runtimes =
parameters seems like it would be bloated and confusing.<br> </div><bl=
ockquote class=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8ex;border=
-left: 1px #ccc solid;padding-left: 1ex;">>
<br>> IMO the optimizer should take care of that.
<br>
<br>Your opinion (and Jens's opinion) don't actually matter here. Whoever
<br>writes the paper should investigate what optimizers actually do and
<br>describe it in the paper.
<br>
<br></blockquote><div><br>Performance of these routines is paramount. I've =
had big data processing applications who ended up spending a large portion =
of their runtimes inisde of strtod().<br><br>Almost every time, we know the=
base at compile time so the parse routine can just be an inline wrapper wh=
ich calls the specific optimized versions if they exist. <br><br><div class=
=3D"prettyprint" style=3D"background-color: rgb(250, 250, 250); border-colo=
r: rgb(187, 187, 187); border-style: solid; border-width: 1px; word-wrap: b=
reak-word;"><code class=3D"prettyprint"><div class=3D"subprettyprint"><span=
style=3D"color: #008;" class=3D"styled-by-prettify">namespace</span><span =
style=3D"color: #000;" class=3D"styled-by-prettify"> detail </span><span st=
yle=3D"color: #660;" class=3D"styled-by-prettify">{</span><span style=3D"co=
lor: #000;" class=3D"styled-by-prettify"><br> ret</span><span style=3D=
"color: #080;" class=3D"styled-by-prettify"><int></span><span style=
=3D"color: #000;" class=3D"styled-by-prettify"> optimized_parse_int_base10<=
/span><span style=3D"color: #660;" class=3D"styled-by-prettify">(</span><sp=
an style=3D"color: #000;" class=3D"styled-by-prettify">string_view s</span>=
<span style=3D"color: #660;" class=3D"styled-by-prettify">);</span><span st=
yle=3D"color: #000;" class=3D"styled-by-prettify"> </span><span style=3D"co=
lor: #800;" class=3D"styled-by-prettify">//defined out of line</span><span =
style=3D"color: #000;" class=3D"styled-by-prettify"><br> ret</span><sp=
an style=3D"color: #080;" class=3D"styled-by-prettify"><int></span><s=
pan style=3D"color: #000;" class=3D"styled-by-prettify"> optimized_parse_in=
t_base16</span><span style=3D"color: #660;" class=3D"styled-by-prettify">(<=
/span><span style=3D"color: #000;" class=3D"styled-by-prettify">string_view=
s</span><span style=3D"color: #660;" class=3D"styled-by-prettify">);</span=
><span style=3D"color: #000;" class=3D"styled-by-prettify"> </span><span st=
yle=3D"color: #800;" class=3D"styled-by-prettify">//defined out of line</sp=
an><span style=3D"color: #000;" class=3D"styled-by-prettify"><br> ret<=
/span><span style=3D"color: #080;" class=3D"styled-by-prettify"><int>=
</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> generic_p=
arse_int</span><span style=3D"color: #660;" class=3D"styled-by-prettify">(<=
/span><span style=3D"color: #000;" class=3D"styled-by-prettify">string_view=
s</span><span style=3D"color: #660;" class=3D"styled-by-prettify">);</span=
><span style=3D"color: #000;" class=3D"styled-by-prettify"> </span><span st=
yle=3D"color: #800;" class=3D"styled-by-prettify">//defined out of line</sp=
an><span style=3D"color: #000;" class=3D"styled-by-prettify"><br></span><sp=
an style=3D"color: #660;" class=3D"styled-by-prettify">};</span><span style=
=3D"color: #000;" class=3D"styled-by-prettify"><br><br></span><span style=
=3D"color: #008;" class=3D"styled-by-prettify">template</span><span style=
=3D"color: #000;" class=3D"styled-by-prettify"> </span><span style=3D"color=
: #660;" class=3D"styled-by-prettify"><></span><span style=3D"color: =
#000;" class=3D"styled-by-prettify"><br></span><span style=3D"color: #008;"=
class=3D"styled-by-prettify">inline</span><span style=3D"color: #000;" cla=
ss=3D"styled-by-prettify"> ret</span><span style=3D"color: #080;" class=3D"=
styled-by-prettify"><int></span><span style=3D"color: #000;" class=3D=
"styled-by-prettify"> parse</span><span style=3D"color: #080;" class=3D"sty=
led-by-prettify"><int></span><span style=3D"color: #660;" class=3D"st=
yled-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-p=
rettify">string_view s</span><span style=3D"color: #660;" class=3D"styled-b=
y-prettify">,</span><span style=3D"color: #000;" class=3D"styled-by-prettif=
y"> </span><span style=3D"color: #008;" class=3D"styled-by-prettify">int</s=
pan><span style=3D"color: #000;" class=3D"styled-by-prettify"> </span><span=
style=3D"color: #008;" class=3D"styled-by-prettify">base</span><span style=
=3D"color: #660;" class=3D"styled-by-prettify">=3D</span><span style=3D"col=
or: #066;" class=3D"styled-by-prettify">10</span><span style=3D"color: #660=
;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000;" class=
=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=3D"style=
d-by-prettify">{</span><span style=3D"color: #000;" class=3D"styled-by-pret=
tify"><br> </span><span style=3D"color: #008;" class=3D"styled-by-pre=
ttify">if</span><span style=3D"color: #660;" class=3D"styled-by-prettify">(=
</span><span style=3D"color: #008;" class=3D"styled-by-prettify">base</span=
><span style=3D"color: #000;" class=3D"styled-by-prettify"> </span><span st=
yle=3D"color: #660;" class=3D"styled-by-prettify">=3D=3D</span><span style=
=3D"color: #000;" class=3D"styled-by-prettify"> </span><span style=3D"color=
: #066;" class=3D"styled-by-prettify">10</span><span style=3D"color: #660;"=
class=3D"styled-by-prettify">)</span><span style=3D"color: #000;" class=3D=
"styled-by-prettify"> </span><span style=3D"color: #660;" class=3D"styled-b=
y-prettify">{</span><span style=3D"color: #000;" class=3D"styled-by-prettif=
y"><br> </span><span style=3D"color: #008;" class=3D"styled-by=
-prettify">return</span><span style=3D"color: #000;" class=3D"styled-by-pre=
ttify"> detail</span><span style=3D"color: #660;" class=3D"styled-by-pretti=
fy">::</span><span style=3D"color: #000;" class=3D"styled-by-prettify">opti=
mized_parse_int_base10</span><span style=3D"color: #660;" class=3D"styled-b=
y-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prettif=
y">s</span><span style=3D"color: #660;" class=3D"styled-by-prettify">);</sp=
an><span style=3D"color: #000;" class=3D"styled-by-prettify"><br> </s=
pan><span style=3D"color: #660;" class=3D"styled-by-prettify">}</span><span=
style=3D"color: #000;" class=3D"styled-by-prettify"><br> </span><spa=
n style=3D"color: #008;" class=3D"styled-by-prettify">if</span><span style=
=3D"color: #660;" class=3D"styled-by-prettify">(</span><span style=3D"color=
: #008;" class=3D"styled-by-prettify">base</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> </span><span style=3D"color: #066;" class=3D"styled-by=
-prettify">0x10</span><span style=3D"color: #660;" class=3D"styled-by-prett=
ify">)</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> </s=
pan><span style=3D"color: #660;" class=3D"styled-by-prettify">{</span><span=
style=3D"color: #000;" class=3D"styled-by-prettify"><br> </sp=
an><span style=3D"color: #008;" class=3D"styled-by-prettify">return</span><=
span style=3D"color: #000;" class=3D"styled-by-prettify"> detail</span><spa=
n style=3D"color: #660;" class=3D"styled-by-prettify">::</span><span style=
=3D"color: #000;" class=3D"styled-by-prettify">optimized_parse_int_base16</=
span><span style=3D"color: #660;" class=3D"styled-by-prettify">(</span><spa=
n style=3D"color: #000;" class=3D"styled-by-prettify">s</span><span style=
=3D"color: #660;" class=3D"styled-by-prettify">);</span><span style=3D"colo=
r: #000;" class=3D"styled-by-prettify"><br> </span><span style=3D"col=
or: #660;" class=3D"styled-by-prettify">}</span><span style=3D"color: #000;=
" class=3D"styled-by-prettify"><br> </span><span style=3D"color: #008=
;" class=3D"styled-by-prettify">return</span><span style=3D"color: #000;" c=
lass=3D"styled-by-prettify"> detail</span><span style=3D"color: #660;" clas=
s=3D"styled-by-prettify">::</span><span style=3D"color: #000;" class=3D"sty=
led-by-prettify">generic_parse_int</span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">(</span><span style=3D"color: #000;" class=3D"style=
d-by-prettify">s</span><span style=3D"color: #660;" class=3D"styled-by-pret=
tify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify"><b=
r></span><span style=3D"color: #660;" class=3D"styled-by-prettify">}</span>=
<span style=3D"color: #000;" class=3D"styled-by-prettify"><br></span></div>=
</code></div><br><br>Removing the check for base=3D=3DN and calling the cor=
rect underlying function directly is very easy for any modern optimizer. My=
guess is that 99% of uses cases will specify the base as a compile time co=
nstant so this optimization is probably a good bet for performance in a gen=
eric interface.<br><br>Cases where this approach *might* actually be a perf=
ormance loser:<br>- If the client is parsing numbers of many different base=
s, it might be faster to just always use the generic routine even if it doe=
s a bit more computation than the optimized methods. The reason being that =
the one generic routine will stay hot in the Icache while many different op=
timized routines can occupy different cache lines, all of which will be les=
s hot and therefore may be swapped back out to main memory more often. I've=
worked with several projects where code size differences like this have a =
measurable impact on performance.<br>- If the client is passing a base whos=
e value is truly determined at runtime (the compiler can't prove anything a=
bout it). In this scenario, its possible that the overhead of checking for =
the optimized values of base could be more expensive than the gains from th=
e optimized routines themselves. Even if it still makes sense to check for =
and call optimized routines, it may be better to do the dispatch out of lin=
e to avoid code bloat at all of the call sites (maybe the inliner can figur=
e this out?).<br><br>Its easy enough to measure and figure out these perfor=
mance concerns for a single project but I'm not sure how you'd do it for a =
generic interface intended to be used by the whole world.<br><br>If the ove=
rhead of dispatching turns out to be a real concern, then 2 functions can b=
e introduced.<br><div class=3D"prettyprint" style=3D"background-color: rgb(=
250, 250, 250); border-color: rgb(187, 187, 187); border-style: solid; bord=
er-width: 1px; word-wrap: break-word;"><code class=3D"prettyprint"><div cla=
ss=3D"subprettyprint"><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">ret</span><span style=3D"color: #660;" class=3D"styled-by-prettify">&l=
t;</span><span style=3D"color: #000;" class=3D"styled-by-prettify">T</span>=
<span style=3D"color: #660;" class=3D"styled-by-prettify">></span><span =
style=3D"color: #000;" class=3D"styled-by-prettify"> parse_generic</span><s=
pan style=3D"color: #660;" class=3D"styled-by-prettify"><</span><span st=
yle=3D"color: #000;" class=3D"styled-by-prettify">T</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">>(</span><span style=3D"color: =
#000;" class=3D"styled-by-prettify">string_view s</span><span style=3D"colo=
r: #660;" class=3D"styled-by-prettify">,</span><span style=3D"color: #000;"=
class=3D"styled-by-prettify"> </span><span style=3D"color: #008;" class=3D=
"styled-by-prettify">int</span><span style=3D"color: #000;" class=3D"styled=
-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by-prett=
ify">base</span><span style=3D"color: #660;" class=3D"styled-by-prettify">)=
;</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> </span><=
span style=3D"color: #800;" class=3D"styled-by-prettify">//parses a T from =
s</span><span style=3D"color: #000;" class=3D"styled-by-prettify"><br>ret</=
span><span style=3D"color: #660;" class=3D"styled-by-prettify"><</span><=
span style=3D"color: #000;" class=3D"styled-by-prettify">T</span><span styl=
e=3D"color: #660;" class=3D"styled-by-prettify">></span><span style=3D"c=
olor: #000;" class=3D"styled-by-prettify"> parse</span><span style=3D"color=
: #660;" class=3D"styled-by-prettify"><</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify">T</span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">>(</span><span style=3D"color: #000;" class=3D"s=
tyled-by-prettify">string_view</span><span style=3D"color: #660;" class=3D"=
styled-by-prettify">,</span><span style=3D"color: #000;" class=3D"styled-by=
-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by-prettify=
">int</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> </sp=
an><span style=3D"color: #008;" class=3D"styled-by-prettify">base</span><sp=
an style=3D"color: #660;" class=3D"styled-by-prettify">);</span><span style=
=3D"color: #000;" class=3D"styled-by-prettify"> </span><span style=3D"color=
: #800;" class=3D"styled-by-prettify">//same as parse_generic(), but may do=
inline dispatch to optimized routines for specific values of base</span><s=
pan style=3D"color: #000;" class=3D"styled-by-prettify"><br></span></div></=
code></div><br>But of course now we have a rather expensive feature creep f=
or an arguably dubious performance concern. If we had constexpr overloading=
or some method of constexpr parameter detection, the implementation could =
choose to do the inline dispatch only when base is constexpr (if that makes=
sense to do).<br><br>Aren't all of these points QOI issues anyway? <br></d=
iv></div>
<p></p>
-- <br />
<br />
--- <br />
You received this message because you are subscribed to the Google Groups &=
quot;ISO C++ Standard - Future Proposals" group.<br />
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to <a href=3D"mailto:std-proposals+unsubscribe@isocpp.org">std-proposa=
ls+unsubscribe@isocpp.org</a>.<br />
To post to this group, send email to <a href=3D"mailto:std-proposals@isocpp=
..org">std-proposals@isocpp.org</a>.<br />
Visit this group at <a href=3D"http://groups.google.com/a/isocpp.org/group/=
std-proposals/">http://groups.google.com/a/isocpp.org/group/std-proposals/<=
/a>.<br />
------=_Part_399_1079225848.1432053212279--
------=_Part_398_1744040193.1432053212279--
.
Author: Thiago Macieira <thiago@macieira.org>
Date: Tue, 19 May 2015 09:37:19 -0700
Raw View
On Tuesday 19 May 2015 09:33:32 Matthew Fioravante wrote:
> > >> Yes. In most use cases, you know at compile-time which base you
> > >> expect. I believe an appropriate interface should allow me to
> > >> convey that compile-time information to the callee.
>
> If the base was a template parameter, then we would need to provide a
> second version which accepts base as a runtime parameter for those use
> cases when we don't know base at compile time. Having one version of parse
> with template parameters and another with normal runtimes parameters seems
> like it would be bloated and confusing.
Agreed. Jens's use-case seems to be easily solved with std::bind,
std::function and/or lambdas.
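A minimal sketch of the lambda route, assuming C++17: a single parse function takes the base at run time, and a call site that knows the base up front wraps it once. The names parse_int and parse_hex and the sketch namespace are hypothetical, std::optional<int> stands in for whatever return type is chosen, and std::from_chars is used only to keep the example short.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

namespace sketch {  // hypothetical namespace, not proposed API

// Parses the whole of s as an int in the given base; nullopt on any error.
inline std::optional<int> parse_int(std::string_view s, int base = 10) {
    int value = 0;
    const char* first = s.data();
    const char* last = first + s.size();
    const auto [ptr, ec] = std::from_chars(first, last, value, base);
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    return value;
}

// The base is fixed where the lambda is written, so an optimizer that
// inlines the call sees a constant base.
const auto parse_hex = [](std::string_view s) { return parse_int(s, 16); };

inline void example_usage() {
    assert(parse_hex("ff") == 255);
    assert(parse_int("42") == 42);
}

}  // namespace sketch
```

Whether the constant actually propagates into the parsing loop depends on the optimizer, which is the part that would need measuring.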
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Tue, 19 May 2015 19:41:05 -0700 (PDT)
Raw View
------=_Part_3282_1654848011.1432089665426
Content-Type: multipart/alternative;
boundary="----=_Part_3283_969808580.1432089665426"
------=_Part_3283_969808580.1432089665426
Content-Type: text/plain; charset=UTF-8
If out parameters are to be used and their presence (or lack thereof)
changes behavior, then we should pass them by rvalue-reference.
See example:
// Parses a T from s. Sets tail to the remainder of s after the parsed value.
ret<T> parse(string_view&& tail, string_view s);
// Parses a T from s. Error if there are extra characters after the parsed
// value.
ret<T> parse(string_view s);
// Parse an int from str, storing the tail of the string in tail.
auto a = parse<int>(tail, str);
// Parse an int from str; it is an error if str has trailing characters
// after the value.
auto b = parse<int>(str);
// Parse an int from str and ignore any characters after the value.
auto c = parse<int>(string_view{}, str);
The last example would not be possible if tail was passed by lvalue
reference.
However, using an rvalue reference lets this kind of mistake get past the
compiler:
string_view readStr();
// Oops: the arguments are swapped, yet this compiles.
auto a = parse<int>(readStr(), tail);
If the API worked this way:
// Parses a T from s. Sets tail to the remainder of s after the parsed value.
ret<T> parse(string_view& tail, string_view s);
// Parses a T from s and ignores the remaining characters.
ret<T> parse(string_view s);
Then there is no reason to ever pass an rvalue tail, so an lvalue reference
is probably more appropriate, as it makes the above bug a compiler error.
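To make the second variant concrete, here is a minimal sketch of the lvalue-reference API, assuming C++17. std::optional<int> stands in for ret<T>, the function is called parse_int instead of being a parse<T> template, and std::from_chars does the digit work; those choices and the sketch namespace are assumptions for illustration only. On failure tail is left untouched, which is also just a choice made here. Note that a plain string_view&& parameter binds only to rvalues, so the named-tail call in the first variant would need a forwarding reference or an extra overload.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>

namespace sketch {  // hypothetical namespace, not proposed API

// Parses a leading int from s and stores the unparsed remainder in tail.
// tail is a non-const lvalue reference, so a temporary such as readStr()
// cannot be passed in its place: the swapped-argument bug fails to compile.
inline std::optional<int> parse_int(std::string_view& tail, std::string_view s) {
    int value = 0;
    const char* first = s.data();
    const char* last = first + s.size();
    const auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{}) return std::nullopt;  // tail is left unchanged
    tail = s.substr(static_cast<std::size_t>(ptr - first));
    return value;
}

// Parses a leading int from s and ignores the remaining characters.
inline std::optional<int> parse_int(std::string_view s) {
    std::string_view ignored;
    return parse_int(ignored, s);
}

inline void example_usage() {
    std::string_view tail;
    assert(parse_int(tail, "123abc") == 123);
    assert(tail == "abc");
    assert(parse_int("77 rest") == 77);
}

}  // namespace sketch
```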
------=_Part_3283_969808580.1432089665426--
------=_Part_3282_1654848011.1432089665426--
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Wed, 20 May 2015 14:07:41 +0200
Raw View
2015-05-18 23:16 GMT+02:00 Magnus Fromreide <magfr@lysator.liu.se>:
> I think iterators/ranges are a better input type. Why should we require that
> the input is consecutive?
Simplicity?
I don't know, I presume iostream internals support parsing from
iterators so it might be good to expose that.
On the other hand the input is often contiguous and existing functions
mostly require contiguous input.
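For comparison, a minimal sketch of what an iterator-based interface could look like, assuming C++17. The name parse_unsigned, the sketch namespace and the choice to advance first in place are all hypothetical; only unsigned base-10 input is handled, to keep it short.

```cpp
#include <cassert>
#include <limits>
#include <list>
#include <optional>

namespace sketch {  // hypothetical namespace, not proposed API

// Parses leading decimal digits from [first, last). On return, first points
// at the first character that was not consumed. Works with any input
// iterator over char, so the input need not be contiguous.
template <class InputIt>
std::optional<unsigned> parse_unsigned(InputIt& first, InputIt last) {
    constexpr unsigned limit = std::numeric_limits<unsigned>::max();
    unsigned result = 0;
    bool any_digit = false;
    for (; first != last; ++first) {
        const char c = *first;
        if (c < '0' || c > '9') break;  // stop at the first non-digit
        const unsigned d = static_cast<unsigned>(c - '0');
        if (result > (limit - d) / 10u) return std::nullopt;  // would overflow
        result = result * 10u + d;
        any_digit = true;
    }
    if (!any_digit) return std::nullopt;
    return result;
}

inline void example_usage() {
    std::list<char> input{'4', '2', 'x'};  // deliberately non-contiguous
    auto it = input.begin();
    assert(parse_unsigned(it, input.end()) == 42u);
    assert(it != input.end() && *it == 'x');
}

}  // namespace sketch
```

A string_view overload could forward to this with s.begin() and s.end(), at the price of a template in the interface.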
--
Olaf
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Wed, 20 May 2015 14:08:24 +0200
Raw View
2015-05-20 4:41 GMT+02:00 Matthew Fioravante <fmatthew5876@gmail.com>:
> If out parameters are to be used and their presence (or lack thereof)
> changes behavior, then we should pass them by rvalue-reference.
>
> See example:
>
> ret<T> parse(string_view&& tail,string_view s); //Parses a T from s.Sets
> tail to the end of the string
> ret<T> parse(string_view s); //Parses a T from s. Error if there are extra
> characters after the parsed string.
>
> auto a = parse<int>(tail, str); //Parse an int from str, storing the tail of
> the string in tail
> auto b = parse<int>(str); //Parse an int from str,is an error if str has
> trailing characters after the value
> auto c = parse<int>(string_view{},str); //Parse an int from str and ignore
> any characters after the value
>
> The last example would not be possible if tail was passed by lvalue
> reference.
>
> Using an rvalue reference allows this kind of mistake to pass the compiler
> however.
>
> string_view readStr();
> //Oops
> auto a = parse<int>(readStr(),tail);
>
>
> If the API worked this way:
> ret<T> parse(string_view& tail,string_view s); //Parses a T from s.Sets tail
> to the end of the string
> ret<T> parse(string_view s); //Parses a T from s and ignores the remaining
> characters.
>
> Then there is no reason to ever pass an rvalue tail so an lvalue reference
> is probably more appropriate as it makes the above bug a compiler error.
The proposed function is like:
parse(string_view in, size_t* pos = nullptr, int base = 10);
Nothing is passed by lvalue (or rvalue) reference.
Using string_view* tail might make sense in certain use cases but not
in others.
Let's have a look at some real-world use cases.
m_downloaded = to_int(value);
m_uploaded = to_int(value);
auto u = find_user(to_int(req_["u"]));
int y0 = to_int(req_["y0"]);
In most cases the entire input has to be a valid number, so pos / tail
isn't needed.
In most cases the number is used to initialize a new variable, so an
out parameter wouldn't work for the output number.
0 is used to signal a failed conversion.
I think my use cases are best served by an atoi-like convenience
function, so the actual interface of the real parse function wouldn't
matter.
Perhaps atoi/strtol-like convenience functions should be proposed too.
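A minimal sketch of such an atoi-like convenience function follows. It assumes C++17's std::from_chars and std::string_view, both of which postdate this thread, and it guesses at the behaviour of to_int (the whole input must be a base-10 integer, 0 on failure). It is an illustration, not the actual to_int behind the use cases above.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>

// Hypothetical atoi-like helper: the entire input must be a valid base-10
// integer; anything else (empty input, trailing characters, leading spaces,
// overflow) yields 0.
inline int to_int(std::string_view in) {
    int value = 0;
    const char* first = in.data();
    const char* last = first + in.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last) {
        return 0;  // failed or partial conversion
    }
    return value;
}
```

Because std::from_chars neither skips whitespace nor needs a null terminator, this helper inherits both properties without extra code.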
--
Olaf
.
Author: "Vicente J. Botet Escriba" <vicente.botet@wanadoo.fr>
Date: Wed, 20 May 2015 16:35:38 +0200
Raw View
On 18/05/15 20:34, Olaf van der Spek wrote:
> Let's get the party started.
>
> What have we got?
>
> We've got functions like strtol and stoi which take a const char* or
> std::string and return a number.
>
> long strtol(const char*, char **str_end, int base);
> int stoi(const std::string&, std::size_t* pos = 0, int base = 10);
>
> What do we want?
>
> Input should not be required to be null terminated, so string_view
> seems like a suitable input type.
Maybe instead of using string_view the function should work on any model
of a given ParserState concept. What operations does a parser need
from this ParserState?
> Error detection should be simpler, but not everyone is a fan of
> exceptions.
We can ask ourselves which interface we would have if exceptions were
acceptable. Leaving the base aside, should we have
template <class T, class ParserState>
T parse(ParserState& state);
or
template <class T, class ParserState>
pair<T, ParserState> parse(ParserState state);
I would prefer the second, as it composes better, but I'm biased towards the
functional approach.
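The second signature can be sketched as follows. This is only an illustration under stated assumptions: ParserState is fixed to std::string_view, the conversion is delegated to C++17's std::from_chars (which postdates this thread), and the choice of exception types is arbitrary.

```cpp
#include <charconv>
#include <cstddef>
#include <stdexcept>
#include <string_view>
#include <system_error>
#include <utility>

// Assumption: the parser state is simply the unconsumed input.
// Returns the parsed value together with the new state; throws on failure.
template <class T>
std::pair<T, std::string_view> parse(std::string_view state) {
    T value{};
    const char* first = state.data();
    const char* last = first + state.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec == std::errc::result_out_of_range) {
        throw std::out_of_range("parse: value out of range");
    }
    if (ec != std::errc{}) {
        throw std::invalid_argument("parse: no number found");
    }
    // Drop the consumed characters; what remains is the new state.
    state.remove_prefix(static_cast<std::size_t>(ptr - first));
    return {value, state};
}
```

A caller threads the state through successive calls, for example `auto [a, rest] = parse<int>("12,34");` followed by skipping the separator and calling `parse<int>(rest)` again.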
An alternative is to define a parser object and then call an extract member on it:
parser p(...);
p.extract<int>();
Would a uniform call syntax allow the following?
extract<int>(p);
>
> And IMO skipping spaces should not be part of the parse function.
> There's also the question of what to do when not the entire input can
> be parsed. Return an error or not.
Parsing is not the same as matching. When you parse, you want to parse
several things in sequence, so you need the new ParserState, to which the
parse function can be applied again. When you match, the whole input
must be consumed.
I suggest using a different function for that use case.
>
>
> So, what about this one?
>
> optional<T> parse(string_view, std::size_t* pos = 0, int base = 10);
>
> An alternative could be:
>
> error_code parse(T&, string_view, std::size_t* pos = 0, int base = 10);
>
>
>
Sorry, but what is the pos parameter used for?
When exceptions cannot be used we need to add an output for an error code.
The Filesystem TS adds it as an output parameter:
template <class T, class ParserState>
pair<T, ParserState> parse(error_code&, ParserState state, int base = 10);
But why add error_code as an out parameter? Is it because we are
used to it (C style)? Is it because it is more efficient? What is wrong with
template <class T, class ParserState>
tuple<T, error_code, ParserState> parse(ParserState state, int base = 10);
?
If we end up adopting variant, would the following be a better interface?
template <class T, class ParserState>
pair<variant<T, error_code>, ParserState> parse(ParserState state, int base = 10);
Would a more specific type be preferable?
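For concreteness, the variant-based signature could look roughly like this. It is a sketch under assumptions: ParserState is std::string_view, std::variant and std::from_chars are the C++17 facilities (neither existed in the standard when this was written), and returning the state unchanged on failure is a choice made here, not something specified above.

```cpp
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>
#include <utility>
#include <variant>

// Either a value or an error code, plus the remaining input.
template <class T>
std::pair<std::variant<T, std::error_code>, std::string_view>
parse(std::string_view state, int base = 10) {
    using result = std::variant<T, std::error_code>;
    T value{};
    const char* first = state.data();
    const char* last = first + state.size();
    auto [ptr, ec] = std::from_chars(first, last, value, base);
    if (ec != std::errc{}) {
        // Assumption: on failure the state is handed back untouched.
        return {result{std::in_place_index<1>, std::make_error_code(ec)}, state};
    }
    state.remove_prefix(static_cast<std::size_t>(ptr - first));
    return {result{std::in_place_index<0>, value}, state};
}
```

No out parameter is needed: the caller inspects the variant with std::holds_alternative or std::get_if and continues with the returned state.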
Vicente
.
Author: "Vicente J. Botet Escriba" <vicente.botet@wanadoo.fr>
Date: Wed, 20 May 2015 16:44:40 +0200
Raw View
On 20/05/15 14:08, Olaf van der Spek wrote:
> 2015-05-20 4:41 GMT+02:00 Matthew Fioravante <fmatthew5876@gmail.com>:
>> If out parameters are to be used and their presence (or lack thereof)
>> changes behavior, then we should pass them by rvalue-reference.
>>
>> See example:
>>
>> ret<T> parse(string_view&& tail,string_view s); //Parses a T from s.Sets
>> tail to the end of the string
>> ret<T> parse(string_view s); //Parses a T from s. Error if there are extra
>> characters after the parsed string.
>>
>> auto a = parse<int>(tail, str); //Parse an int from str, storing the tail of
>> the string in tail
>> auto b = parse<int>(str); //Parse an int from str,is an error if str has
>> trailing characters after the value
>> auto c = parse<int>(string_view{},str); //Parse an int from str and ignore
>> any characters after the value
>>
>> The last example would not be possible if tail was passed by lvalue
>> reference.
>>
>> Using an rvalue reference allows this kind of mistake to pass the compiler
>> however.
>>
>> string_view readStr();
>> //Oops
>> auto a = parse<int>(readStr(),tail);
>>
>>
>> If the API worked this way:
>> ret<T> parse(string_view& tail,string_view s); //Parses a T from s.Sets tail
>> to the end of the string
>> ret<T> parse(string_view s); //Parses a T from s and ignores the remaining
>> characters.
>>
>> Then there is no reason to ever pass an rvalue tail so an lvalue reference
>> is probably more appropriate as it makes the above bug a compiler error.
> The proposed function is like: parse(string_view in, size_t* pos =
> nullptr, int base = 10);
> Nothing is passed by lvalue (or rvalue).
How would you report errors?
>
> Using string_view* tail might make sense in certain use cases but not
> in others..
>
> Let's have a look at some real-world use cases.
>
> m_downloaded = to_int(value);
> m_uploaded = to_int(value);
> auto u = find_user(to_int(req_["u"]));
> int y0 = to_int(req_["y0"]);
>
> In most cases the entire input has to be a valid number, so pos / tail
> isn't needed.
> In most cases the number is used to initialize a new variable, so an
> out parameter wouldn't work for the output number.
Agreed.
> 0 is used to signal a failed conversion.
Ugh, now I see how you report errors. I'm almost sure this wouldn't
be accepted by the C++ standard committee.
>
> I think my use cases are best served by an atoi-like convenience
> function, so the actual interface of the real parse function wouldn't
> matter.
> Perhaps atoi/strtol-like convenience functions should be proposed too.
>
>
You lost me here.
Vicente
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Wed, 20 May 2015 08:27:54 -0700 (PDT)
Raw View
On Wednesday, May 20, 2015 at 8:07:45 AM UTC-4, Olaf van der Spek wrote:
>
> 2015-05-18 23:16 GMT+02:00 Magnus Fromreide <ma...@lysator.liu.se>:
> > I think iterators/ranges are a better input type. Why should we require
> > that the input is consecutive?
>
> Simplicity?
>
> I don't know, I presume iostream internals support parsing from
> iterators so it might be good to expose that.
>
> On the other hand the input is often contiguous and existing functions
> mostly require contiguous input.
>
Even if the base implementation uses iterators, string_view (or a
string_view compatible) overloads should be provided.
Because of that, we could just do a string_view proposal for now and if
later someone wants to propose an iterator version the string_view
functions could be just reimplemented using the iterator version. It would
be really great if this library could make it into the standard at the same
time string_view does. Not having string_view compatible number parsing is
a major hole in the string_view API.
On Wednesday, May 20, 2015 at 8:08:25 AM UTC-4, Olaf van der Spek wrote:
>
> The proposed function is like: parse(string_view in, size_t* pos =
> nullptr, int base = 10);
> Nothing is passed by lvalue (or rvalue).
>
> Using string_view* tail might make sense in certain use cases but not
> in others..
>
Not sure I follow you here.
Do you have any specific use case where a size_t* (or char*) is more
appropriate? A string_view object is a better tail string representation
because it is a range with invariants built in from the start.
Consider the following example, which would be a lot more clumsy if you
used a size_t* instead of a string_view*.
string_view s = "1 2 3";
auto a = parse<int>(s, &s);
s.pop_front();
auto b = parse<int>(s, &s);
s.pop_front();
auto c = parse<int>(s, &s);
Using a size_t* just means I'm probably going to be constructing a
string_view on the next line and now I have to awkwardly deal with stuff
like { in.begin() + pos, in.end() } without making mistakes.
The tail out param could also be passed as a string_view*, with nullptr
signifying "allow tail strings but throw them away".
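A sketch of the string_view* variant is shown below. The assumptions are: std::optional stands in for the still-undecided return type, the conversion uses C++17's std::from_chars, and the tail is left untouched on failure. Note that std::string_view as eventually standardized spells pop_front() as remove_prefix(1).

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>

// tail == nullptr means "allow trailing characters but throw them away".
template <class T>
std::optional<T> parse(std::string_view s, std::string_view* tail = nullptr) {
    T value{};
    const char* first = s.data();
    const char* last = first + s.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{}) return std::nullopt;
    // s is a copy, so tail may safely point at the caller's original view.
    if (tail) *tail = s.substr(static_cast<std::size_t>(ptr - first));
    return value;
}

// The "1 2 3" example from above, summed; returns -1 if any parse fails.
inline int sum_three(std::string_view s) {
    auto a = parse<int>(s, &s);
    if (!a || s.empty()) return -1;
    s.remove_prefix(1);  // skip the separator
    auto b = parse<int>(s, &s);
    if (!b || s.empty()) return -1;
    s.remove_prefix(1);
    auto c = parse<int>(s, &s);
    if (!c) return -1;
    return *a + *b + *c;
}
```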
> Let's have a look at some real-world use cases.
>
> m_downloaded = to_int(value);
> m_uploaded = to_int(value);
> auto u = find_user(to_int(req_["u"]));
> int y0 = to_int(req_["y0"]);
>
> In most cases the entire input has to be a valid number, so pos / tail
> isn't needed.
>
I agree that retrieving the tail should be optional and not get in the way
if you don't need it. This means that if tail is an out parameter we should
provide an overload without it. If tail is part of the returned object, it
should be easy to extract the value and error_code and ignore the tail.
> In most cases the number is used to initialize a new variable,
This is a major problem with expected, optional, etc... It's really nice to
just be able to say auto x = parse<int>(s); and have x be an int.
> so an
> out parameter wouldn't work for the output number.
>
> 0 is used to signal a failed conversion.
>
Generally, using 0 or any other perfectly valid value to signal failure is
a really bad idea. It only makes sense when your use case is "Parse the
value or give me some default if it fails". In that case, 0 may not be the
default value you want, so it's better to be able to actually specify it.
This use case is why I invented parse_or() for my libraries. Regardless of how
the base version is implemented, this wrapper can still be added for
atoi()-like convenience.
template <typename T>
T parse_or(string_view s, T val_if_error);
auto x = parse_or(s, 1.0);
//decltype(x) == double
Notice also that, because of the default, you don't even need to specify the
type T. It can be deduced from the constant used to initialize val_if_error.
Using such a function emphasizes the value you are returning and that you
don't care about errors.
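One possible implementation of parse_or is sketched here. It assumes C++17's std::from_chars, is limited to integer types (floating-point std::from_chars support varies between standard libraries), and treats trailing characters as an error. None of this is confirmed as the behaviour of the original library.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>

// Returns the parsed value, or val_if_error if s is not entirely a valid
// base-10 integer that fits in T. T is deduced from the default.
template <typename T>
T parse_or(std::string_view s, T val_if_error) {
    T value{};
    const char* first = s.data();
    const char* last = first + s.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    return (ec == std::errc{} && ptr == last) ? value : val_if_error;
}
```

For instance, `parse_or(port_text, 80)` yields an int and `parse_or(size_text, 0L)` yields a long, with the fallback visible at the call site (port_text and size_text being hypothetical inputs).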
>
> I think my use cases are best served by an atoi-like convenience
> function, so the actual interface of the real parse function wouldn't
> matter.
> Perhaps atoi/strtol-like convenience functions should be proposed too.
>
I have this use case often as well, but many other times I'm more focused
on correctness and want to report errors to users if they incorrectly
specify a number. Both idioms should be easily supported. My parse_or() (or
something similar) should serve the atoi() use case with minimal
complexity. Do you see any situation where it would not?
I think what is needed is for the paper to survey several possible
interfaces and show examples of all of the common use cases with each one,
carefully pointing out the pros and cons. Only when we have all of the data
in front of us, with their trade-offs of safety, convenience, and
readability, can we choose one.
If you're planning to actually write this paper and want to collaborate,
I'd be happy to help with this.
What are all of the use cases for an API like this? Here is what I can
think of:
- Parse a number and throw an exception if error occurs
- Parse a number and let me handle errors without exceptions
- Parse a number and give me a default value if an error occurs
- Parse a number and give me the tail string so I can continue parsing the
next object.
- Initialize a variable using parse(), preferably using auto / type
deduction.
- Assign to a pre-existing variable using parse(), preferably with type
deduction.
.
Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Wed, 20 May 2015 11:43:00 -0400
Raw View
On 2015-05-20 11:27, Matthew Fioravante wrote:
> Generally, using 0 or any other perfectly valid value to signal failure is
> a really bad idea. It only makes sense when your use case is "Parse the
> value or give me some default if it fails". In that case, 0 may not be the
> default value you want so its better to be able to actually specify it.
Doesn't expected already handle this?
> This is a major problem with expected, optional, etc... Its really nice to
> just be able to say auto x = parse<int>(s); and have x be an int.
auto x = parse<int>(s).value_or(0);
Slightly more typing, but less API complexity. And you could trivially
write an inline parse_or that wraps this.
> What are all of the use cases for an API like this? Here is what I can
> think of:
> - Parse a number and throw an exception if error occurs
> - Parse a number and let me handle errors without exceptions
> - Parse a number and give me a default value if an error occurs
Returning an expected covers all of these. If you want the exception,
just blindly take the value of the expected; it will throw if there is
no value. If you want to check for errors without exceptions, check if
the expected contains a value. If you don't care about errors, use
expected::value_or (or whatever it's called).
> - Parse a number and give me the tail string so I can continue parsing the
> next object.
This is only possible with a default value? With a default value, an
inline wrapper can trivially provide it.
> - Initialize a variable using parse(), preferably using auto / type
> deduction.
Also possible with a trivial inline wrapper. (I don't think we need to
worry about having parse() write to the existing variable directly;
we're talking about numeric types; assigning the return value to an
existing variable is not expensive.)
> - Assign to a pre-existing variable using parse(), preferably with type
> deduction.
It seems to me that 'expected<T> parse(string_view in, size_t* pos, int
base)', or similar with some tweaking of how we communicate what was
consumed, covers all of the above use cases (with the addition of some
trivial inline convenience wrappers built on top of it).
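To make the three error-handling styles concrete, here is a sketch in which std::optional (C++17) stands in for expected, so no error detail is carried. The conversion uses std::from_chars, and ignoring trailing characters when pos is null mirrors stoi; all of these are assumptions rather than part of the proposal.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>

// Stand-in for 'expected<T> parse(string_view in, size_t* pos, int base)'.
template <class T>
std::optional<T> parse(std::string_view in, std::size_t* pos = nullptr, int base = 10) {
    T value{};
    const char* first = in.data();
    const char* last = first + in.size();
    auto [ptr, ec] = std::from_chars(first, last, value, base);
    if (ec != std::errc{}) return std::nullopt;
    if (pos) *pos = static_cast<std::size_t>(ptr - first);  // characters consumed
    return value;
}

// Trivial inline convenience wrapper built on top of parse().
template <class T>
T parse_or(std::string_view in, T fallback) {
    return parse<T>(in).value_or(fallback);
}
```

With this shape, `parse<int>(s).value()` throws std::bad_optional_access on failure, `if (auto r = parse<int>(s))` checks without exceptions, and `parse<int>(s).value_or(0)` or `parse_or(s, 0)` supplies a default.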
--
Matthew
.
Author: "Vicente J. Botet Escriba" <vicente.botet@wanadoo.fr>
Date: Wed, 20 May 2015 18:29:48 +0200
Raw View
On 20/05/15 04:41, Matthew Fioravante wrote:
> If out parameters are to be used and their presence (or lack thereof)
> changes behavior, then we should pass them by rvalue-reference.
>
> See example:
>
> ret<T> parse(string_view&& tail,string_view s); //Parses a T from s.Sets
> tail to the end of the string
> ret<T> parse(string_view s); //Parses a T from s. Error if there are
> extra characters after the parsed string.
>
What about returning the tail as well? Why pass it by rvalue or lvalue reference?
// Parses a T from s and also returns the tail.
pair<ret<T>, string_view> parse(string_view s);
// Parses a T from s. Error if there are extra characters after the parsed string.
ret<T> parse_exact(string_view s);
>     // Parse an int from str, storing the tail of the string in tail.
>     auto a = parse<int>(tail, str);
>     // Parse an int from str; it is an error if str has trailing characters after the value.
>     auto b = parse<int>(str);
>     // Parse an int from str and ignore any characters after the value.
>     auto c = parse<int>(string_view{}, str);
>
    tie(a, str) = parse<int>(str);
    // Parse an int from str; it is an error if str has trailing characters after the value.
    auto b = parse_exact<int>(str);
    // Parse an int from str and ignore the characters after the value.
    tie(c, ignore) = parse<int>(str);
We are not there yet, but I would find it quite readable to assign multiple values
and even declare them in place as well:
    {auto a, str} = parse<int>(str);
    {auto c, ignore} = parse<int>(str);
The advantage of a functional interface (no out/in-out parameters)
    parse<T> :: S -> (T, S)
is that this function can be composed using fold-like functions that
consume the parser state and fold, e.g., over a list.
The tail out parameter makes the signature too specific
    parse<T> :: S&, S -> T
and it composes less easily.
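As a rough illustration of the S -> (T, S) shape, the sketch below threads the remaining input through a loop that sums a comma-separated list. It assumes C++17 (std::from_chars, std::optional, structured bindings); the pair-returning parse and the sum_csv helper are hypothetical names chosen for this example, and std::optional stands in for ret<T>.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>
#include <utility>

// Hypothetical S -> (T, S) parser: returns the parsed value (if any) and the
// unconsumed tail. On failure the tail is the unchanged input.
template <typename T>
std::pair<std::optional<T>, std::string_view> parse(std::string_view s) {
    T value{};
    const char* first = s.data();
    const char* last = first + s.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{}) {
        return {std::nullopt, s};
    }
    s.remove_prefix(static_cast<std::size_t>(ptr - first));
    return {value, s};
}

// A fold over a comma-separated list: the only state threaded through the
// loop is the remaining input, so no out parameter is needed.
inline std::optional<long> sum_csv(std::string_view s) {
    long total = 0;
    while (true) {
        auto [value, tail] = parse<long>(s);
        if (!value) return std::nullopt;               // malformed element
        total += *value;
        if (tail.empty()) return total;                // consumed everything
        if (tail.front() != ',') return std::nullopt;  // unexpected character
        tail.remove_prefix(1);                         // skip the separator
        s = tail;
    }
}
```

Because each step only maps one input view to a value and a new view, the same parse can be reused unchanged for other separators or element types.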
> The last example would not be possible if tail was passed by lvalue
> reference.
>
> Using an rvalue reference allows this kind of mistake to pass the
> compiler however.
>
>     string_view readStr();
>     // Oops
>     auto a = parse<int>(readStr(), tail);
>
Having only input parameters makes this error impossible, as tie
expects lvalue references:
    string_view readStr();
    // Oops
    tie(a, readStr()) = parse<int>(tail); // compile error
>
> If the API worked this way:
>
>     // Parses a T from s. Sets tail to the end of the string.
>     ret<T> parse(string_view& tail, string_view s);
>     // Parses a T from s and ignores the remaining characters.
>     ret<T> parse(string_view s);
>
> Then there is no reason to ever pass an rvalue tail so an lvalue
> reference is probably more appropriate as it makes the above bug a
> compiler error.
>
Yes, the rvalue tail parameter doesn't seem to be a good idea.
Vicente
.
Author: "Vicente J. Botet Escriba" <vicente.botet@wanadoo.fr>
Date: Wed, 20 May 2015 18:30:10 +0200
On 20/05/15 17:27, Matthew Fioravante wrote:
>
>
> If you're planning to actually write this paper and want to
> collaborate, I'd be happy to help with this.
>
> What are all of the use cases for an API like this? Here is what I can
> think of:
> - Parse a number and throw an exception if error occurs
Do we really want this?
> - Parse a number and let me handle errors without exceptions
Agreed.
> - Parse a number and give me a default value if an error occurs
Agreed with the use case, but I'm not sure we need something specific.
> - Parse a number and give me the tail string so I can continue parsing
> the next object.
This seems to me the normal case.
> - Initialize a variable using parse(), preferably using auto / type
> deduction.
Do you mean a variable of type int, for example?
    auto i = parse<int>(p); // decltype(i) is int
> - Assign to a pre-existing variable using parse(), preferably with
> type deduction.
Are you requesting that the following should be valid?
    i = parse<int>(p);
I don't think I understand what you want here.
Vicente
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Wed, 20 May 2015 20:16:12 +0200
2015-05-20 16:35 GMT+02:00 Vicente J. Botet Escriba <vicente.botet@wanadoo.fr>:
> Maybe instead of using string_view the function should work on any model of
> a given ParserState concept. What operations does a parser need from this
> ParserState?
Maybe. What's a ParserState?
>>
>> Error detection should be simpler, but not everyone is a fan of
>> exceptions.
>
> We can ask ourselves which interface we would have if exceptions were
> acceptable.
We could, but what's the point?
> Parsing is not the same as matching. When you parse, you want to parse
> several things, so you need a new ParserState to which you can apply
> the parser function again. When you match, the whole input must be
> consumed.
> I suggest using a different function for this use case.
What should such a function be named?
> Sorry, but what is the pos parameter used for?
http://en.cppreference.com/w/cpp/string/basic_string/stol
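For readers who don't want to follow the link: in std::stol, pos receives the number of characters that were processed, which is how a caller finds the unparsed tail. A small sketch (the helper name consumed_by_stol is made up for this example):

```cpp
#include <cstddef>
#include <string>

// Returns how many characters std::stol consumed from `text` and stores the
// parsed number in `value`. Note that std::stol skips leading whitespace
// (which counts towards pos) and throws std::invalid_argument or
// std::out_of_range on failure.
inline std::size_t consumed_by_stol(const std::string& text, long& value) {
    std::size_t pos = 0;
    value = std::stol(text, &pos);  // base defaults to 10
    return pos;
}
```

For the input "42abc" this yields the value 42 and pos 2, so text.substr(pos) is the remaining "abc".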
--
Olaf
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Wed, 20 May 2015 11:28:14 -0700 (PDT)
On Wednesday, May 20, 2015 at 12:29:51 PM UTC-4, Vicente J. Botet Escriba
wrote:
> We are not there yet, but I would find it quite readable to assign multiple values
> and even declare them in place as well:
>
>     {auto a, str} = parse<int>(str);
>
>     {auto c, ignore} = parse<int>(str);
>
It's quite a different proposal, but I think a syntax like this is badly
needed. We have tie(), which is only 3 characters, but it's actually 8
characters because in pretty much all of my code I'd be saying std::tie().
Also, the tuple/tie solution is not that well known as a good way to return
multiple values. This kind of thing is good enough to be a core language
feature IMO.
Maybe not braces, since the current use of braces always introduces a scope and
that auto a lives in the outer scope. Maybe [] can be reused?
    [ auto a, str ] = parse<int>(str);
> is that this function can be composed using fold-like functions that
> consume the parser state and fold, e.g., over a list.
It would probably be good to have a specific example or two in the paper
showing the strengths of the functional programming approach.
> Yes, the rvalue tail parameter doesn't seem to be a good idea.
The tail can also be included in the returned object, and maybe that is the
superior approach. If for whatever reason an out parameter is deemed the best
solution, I tried to outline some concerns as to whether it should be
passed by lvalue or rvalue reference.
On Wednesday, May 20, 2015 at 12:30:12 PM UTC-4, Vicente J. Botet Escriba
wrote:
>
> On 20/05/15 17:27, Matthew Fioravante wrote:
> >
> >
> > If you're planning to actually write this paper and want to
> > collaborate, I'd be happy to help with this.
> >
> > What are all of the use cases for an API like this? Here is what I can
> > think of:
> > - Parse a number and throw an exception if error occurs
> Do we really want this?
>
I think so, at least as a wrapper if nothing else. The use case of "I'm
just going to try to parse a bunch of stuff and bail out if any one fails"
is not uncommon.
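A rough sketch of what "as a wrapper" could look like, assuming C++17 and a non-throwing core built on std::from_chars. The names try_parse and parse_or_throw are placeholders, std::optional stands in for whatever result type is eventually chosen, and only integer types are covered.

```cpp
#include <charconv>
#include <optional>
#include <stdexcept>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical non-throwing core: the whole input must be a valid integer.
template <typename T>
std::optional<T> try_parse(std::string_view s) {
    T value{};
    const char* last = s.data() + s.size();
    auto [ptr, ec] = std::from_chars(s.data(), last, value);
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    return value;
}

// Throwing wrapper layered on top of the core; it adds no parsing logic.
template <typename T>
T parse_or_throw(std::string_view s) {
    if (auto v = try_parse<T>(s)) return *v;
    throw std::invalid_argument("cannot parse: " + std::string(s));
}
```

With this, the "parse a bunch of stuff and bail out" case is a single try block around several parse_or_throw calls, while callers who avoid exceptions use try_parse directly.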
> > - Parse a number and let me handle errors without exceptions
> Agreed.
> > - Parse a number and give me a default value if an error occurs
> Agreed with the use case, but I'm not sure we need something specific.
> > - Parse a number and give me the tail string so I can continue parsing
> > the next object.
> This seems to me the normal case.
> > - Initialize a variable using parse(), preferably using auto / type
> > deduction.
> Do you mean a variable of type int, for example?
>
>     auto i = parse<int>(p); // decltype(i) is int
>
Yes. The idea is that if we specify the type int in the parse() invocation,
then we should be able to take advantage of type deduction and skip it in
the declaration of the result variable. Using your multiple return values
can achieve this goal as well.
> > - Assign to a pre-existing variable using parse(), preferably with
> > type deduction.
> Are you requesting that the following should be valid?
>
>     i = parse<int>(p);
>
> I don't think I understand what you want here.
>
This one is even more of a nice-to-have. I mean like this:
    template <typename T>
    error_code parse(T& val, string_view s);
    double x = 1.0;
    if (parse(x, s)) { // a set error_code converts to true
        //handle error
    }
Here we did not need to specify the type we want to parse because it's
deduced from the out parameter.
If I later change x to a float or int or whatever, the parsing code will
update itself to be correct, vs. this scenario:
    //double x; //old code
    int x; //new code
    x = parse<double>(s); //oops!
    x = parse<decltype(x)>(s); //Workaround, but ugly
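A minimal sketch of the out-parameter form, assuming C++17 and std::from_chars. It returns std::errc rather than error_code to keep the example short, and it only covers integer types, since floating-point std::from_chars support varies between standard library implementations.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>

// Hypothetical out-parameter form: T is deduced from `val`, so the call site
// never names the type. Returns std::errc{} on success; `val` is left
// untouched on failure. The whole input must be consumed.
template <typename T>
std::errc parse(T& val, std::string_view s) {
    T tmp{};
    const char* last = s.data() + s.size();
    auto [ptr, ec] = std::from_chars(s.data(), last, tmp);
    if (ec != std::errc{}) return ec;
    if (ptr != last) return std::errc::invalid_argument;  // trailing characters
    val = tmp;
    return std::errc{};
}
```

Changing the declaration of the variable from int to long (or any other integer type) changes which conversion runs without touching the parse call.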
------=_Part_4567_28729371.1432146494270--
------=_Part_4566_747858066.1432146494270--
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Wed, 20 May 2015 20:28:25 +0200
Raw View
> On Wednesday, May 20, 2015 at 8:08:25 AM UTC-4, Olaf van der Spek wrote:
> Not sure I follow you here.
> Do you have any specific use case where a size_t* (or char*) is more
> appropriate? A string_view object is a better tail string representation
> because it is a range with invariants built in from the start.
>
> Consider the following example, which would be a lot more clumsy if you used
> a size_t* instead of a string_view*.
>
> string_view s = "1 2 3";
> auto a = parse<int>(s, &s);
> s.pop_front();
> auto b = parse<int>(s, &s);
> s.pop_front();
> auto c = parse<int>(s, &s);
>
> Using a size_t* just means I'm probably going to be constructing a
> string_view on the next line and now I have to awkwardly deal with stuff
> like { in.begin() + pos, in.end() } without making mistakes.
s.drop_front(pos); ? Or was drop/pop_front(N) dropped from string_view?
If one wants the input view to be updated it makes sense to have a
function that takes it by reference though.
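For reference, std::string_view as later standardized in C++17 has remove_prefix(n); it has no pop_front() or drop_front(). A rough sketch of the string_view* tail style on top of that, where parse<T>() and sum3() are hypothetical names and std::from_chars (C++17, integers only here) is assumed as the conversion routine:

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical parse<T>(): returns the value and writes the unparsed rest to *tail.
template <typename T>
std::optional<T> parse(std::string_view in, std::string_view* tail)
{
    assert(tail != nullptr);
    T value{};
    const char* first = in.data();
    auto res = std::from_chars(first, first + in.size(), value);
    if (res.ec != std::errc{})
        return std::nullopt;                  // *tail is left untouched on failure
    in.remove_prefix(static_cast<std::size_t>(res.ptr - first));
    *tail = in;
    return value;
}

// Sums three ints separated by single spaces, e.g. "1 2 3".
inline std::optional<int> sum3(std::string_view s)
{
    int total = 0;
    for (int i = 0; i < 3; ++i) {
        if (i > 0) {
            if (s.empty() || s.front() != ' ')
                return std::nullopt;          // missing separator
            s.remove_prefix(1);               // skip the separator
        }
        auto v = parse<int>(s, &s);           // input and tail may be the same view
        if (!v)
            return std::nullopt;
        total += *v;
    }
    return total;
}
```

With a size_t position instead, each step would need its own s.remove_prefix(pos) (or a substr) before the next call.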
I'm just wondering whether a string_view* tail makes sense if the
input isn't a string_view.
> The tail out param could also be passed as a string_view*, with nullptr
> signifying "allow tail strings but throw them away".
I'd rather not allow tails in that case.
>> 0 is used to signal a failed conversion.
>
>
> Generally, using 0 or any other perfectly valid value to signal failure is a
> really bad idea. It only makes sense when your use case is "Parse the value
Of course. The default could be passed in via an optional parameter.
These functions only make sense if there's such a default.
> I have this use case often as well, but many other times I'm more focused on
> correctness and want to report errors to users if they incorrectly specify a
> number. Both idioms should be easily supported. My parse_or() (or something
> similar) should serve the atoi() use case with minimal complexity. Do you
> see any situation where it would not?
parse_or() seems good.
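To make the atoi()-style use case concrete, here is one possible shape for parse_or(). The signature is a guess at what was meant, not something agreed on in this thread; it assumes integers only, C++17's std::from_chars, and that trailing characters count as failure.

```cpp
#include <cassert>
#include <charconv>
#include <string_view>
#include <system_error>

// Hypothetical parse_or(): returns the parsed value, or `fallback` if the
// input is not a complete, in-range number in the given base.
template <typename T>
T parse_or(std::string_view s, T fallback, int base = 10)
{
    assert(base >= 2 && base <= 36);          // precondition of std::from_chars
    T value{};
    const char* first = s.data();
    const char* last = first + s.size();
    auto res = std::from_chars(first, last, value, base);
    return (res.ec == std::errc{} && res.ptr == last) ? value : fallback;
}
```

Unlike atoi(), the caller picks the default, so 0 stays distinguishable from failure whenever that matters.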
> I think what is needed is for the paper to survey several possible
> interfaces and show examples of all of the common use cases with each one,
> carefully pointing out the pros and cons. Only when we have all of the data
> in front of us with their trade offs of safety, convenience, and readability
> can we choose one.
>
> If you're planning to actually write this paper and want to collaborate, I'd
> be happy to help with this.
Sounds like a plan.
--
Olaf
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Wed, 20 May 2015 20:49:48 +0200
Raw View
2015-05-20 17:43 GMT+02:00 Matthew Woehlke <mw_triad@users.sourceforge.net>:
> On 2015-05-20 11:27, Matthew Fioravante wrote:
>> Generally, using 0 or any other perfectly valid value to signal failure is
>> a really bad idea. It only makes sense when your use case is "Parse the
>> value or give me some default if it fails". In that case, 0 may not be the
> default value you want so it's better to be able to actually specify it.
>
> Doesn't expected already handle this?
Does expected exist?
And does it have something like test_and_set()?
What would a parse date (Y-M-D) function look like?
// parse() returning expected or optional
optional<Date> parse_date1(string_view is)
{
  int year;
  int month;
  int day;
  return (true
    && test_and_set(year, parse<decltype(year)>(is, &is))
    && parse_separator(is, &is)
    && test_and_set(month, parse<decltype(month)>(is, &is))
    && parse_separator(is, &is)
    && test_and_set(day, parse<decltype(day)>(is, &is)))
    ? optional<Date>(Date{ year, month, day })
    : optional<Date>();
}
// parse() returning error_code (false means success)
optional<Date> parse_date2(string_view is)
{
  int year;
  int month;
  int day;
  return (true
    && !parse(year, is, &is)
    && !parse_separator(is, &is)
    && !parse(month, is, &is)
    && !parse_separator(is, &is)
    && !parse(day, is, &is))
    ? optional<Date>(Date{ year, month, day })
    : optional<Date>();
}
One of these functions looks cleaner to me. Did I do something wrong?
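For comparison, a compilable sketch of the error-code style. Everything here is an assumption made for illustration: Date is a plain struct of unsigned fields, parse() and parse_separator() take the view by reference and advance it on success, '-' is the only separator, std::errc stands in for error_code, and C++17's std::from_chars does the conversion. No range checks on month or day are attempted.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>

struct Date { unsigned year; unsigned month; unsigned day; };  // hypothetical

// Hypothetical parse(): reads an unsigned number from the front of `is`
// and, on success, advances `is` past it.
inline std::errc parse(unsigned& out, std::string_view& is)
{
    const char* first = is.data();
    auto res = std::from_chars(first, first + is.size(), out);
    if (res.ec == std::errc{})
        is.remove_prefix(static_cast<std::size_t>(res.ptr - first));
    return res.ec;
}

// Hypothetical parse_separator(): consumes exactly one '-'.
inline std::errc parse_separator(std::string_view& is)
{
    if (is.empty() || is.front() != '-')
        return std::errc::invalid_argument;
    is.remove_prefix(1);
    return std::errc{};
}

// Parses Y-M-D; the whole input must be consumed.
inline std::optional<Date> parse_date(std::string_view is)
{
    const std::errc ok{};
    Date d{};
    if (parse(d.year, is) == ok
        && parse_separator(is) == ok
        && parse(d.month, is) == ok
        && parse_separator(is) == ok
        && parse(d.day, is) == ok
        && is.empty())
        return d;
    return std::nullopt;
}
```

Taking the view by reference removes the repeated `is, &is` pairs, at the cost of hiding at the call site that the input is modified.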
--
Olaf
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Wed, 20 May 2015 12:02:30 -0700 (PDT)
Raw View
------=_Part_1252_1580559798.1432148550933
Content-Type: multipart/alternative;
boundary="----=_Part_1253_95811596.1432148550933"
------=_Part_1253_95811596.1432148550933
Content-Type: text/plain; charset=UTF-8
On Wednesday, May 20, 2015 at 2:28:27 PM UTC-4, Olaf van der Spek wrote:
>
> > On Wednesday, May 20, 2015 at 8:08:25 AM UTC-4, Olaf van der Spek wrote:
> > Not sure I follow you here.
> > Do you have any specific use case where a size_t* (or char*) is more
> > appropriate? A string_view object is a better tail string representation
> > because it is a range with invariants built in from the start.
> >
> > Consider the following example, which would be a lot more clumsy if you
> > used a size_t* instead of a string_view*.
> >
> > string_view s = "1 2 3";
> > auto a = parse<int>(s, &s);
> > s.pop_front();
> > auto b = parse<int>(s, &s);
> > s.pop_front();
> > auto c = parse<int>(s, &s);
> >
> > Using a size_t* just means I'm probably going to be constructing a
> > string_view on the next line and now I have to awkwardly deal with stuff
> > like { in.begin() + pos, in.end() } without making mistakes.
>
> s.drop_front(pos); ? Or was drop/pop_front(N) dropped from string_view?
>
> If one wants the input view to be updated it makes sense to have a
> function that takes it by reference though.
>
I think passing s as the tail param is cleaner than using pop_front(n).
If the tail is going to be returned then definitely a string_view is better
than just a size_t. We can pass that string_view along directly and it will
compose better.
>
> I'm just wondering whether a string_view* tail makes sense if the
> input isn't a string_view.
>
Sure it makes sense. Just like if you want to efficiently take a substring
of any string type such as std::string you get a string_view. That's all the
tail is, a substring of the original.
Also the input is converted to a string_view, so the interface only knows
the input string is string_view compatible and that's the only guarantee it
can provide on the way back out.
>
> > The tail out param could also be passed as a string_view*, with nullptr
> > signifying "allow tail strings but throw them away".
>
> I'd go for don't allow tails in that case.
>
Not sure that's a good approach. We completely change the behavior of the
function if the tail string pointer happens to be null.
I imagine such an interface could result in very surprising bugs.
I think a better way to require an exact parse is with an overload which
doesn't accept a tail. Now the rules are described by the interface instead
of the value of the parameters passed to it.
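Roughly, the overload pair could look like the following. This is only a sketch under assumptions: int only, std::errc standing in for error_code, the tail taken by reference rather than by pointer, and C++17's std::from_chars doing the conversion.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Tail overload: parses a prefix and reports the unparsed rest in `tail`.
inline std::errc parse(int& out, std::string_view in, std::string_view& tail)
{
    int tmp = 0;
    const char* first = in.data();
    auto res = std::from_chars(first, first + in.size(), tmp);
    if (res.ec != std::errc{})
        return res.ec;
    out = tmp;
    tail = in.substr(static_cast<std::size_t>(res.ptr - first));
    return std::errc{};
}

// Exact overload: there is no tail parameter, so leftover input is an error.
inline std::errc parse(int& out, std::string_view in)
{
    int tmp = 0;
    std::string_view rest;
    std::errc ec = parse(tmp, in, rest);
    if (ec != std::errc{})
        return ec;
    if (!rest.empty())
        return std::errc::invalid_argument;
    out = tmp;                                // only written on an exact parse
    return std::errc{};
}
```

Which behavior you get is decided by overload resolution at compile time, not by whether a pointer happens to be null at run time.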
------=_Part_1253_95811596.1432148550933--
------=_Part_1252_1580559798.1432148550933--
.
Author: Jens Maurer <Jens.Maurer@gmx.net>
Date: Wed, 20 May 2015 21:18:07 +0200
Raw View
On 05/18/2015 08:34 PM, Olaf van der Spek wrote:
> Let's get the party started.
>
> What have we got?
>
> We've got functions like strtol and stoi which take a const char* or std::string and return a number.
>
> long strtol(const char*, char **str_end, int base);
> int stoi(const std::string&, std::size_t* pos = 0, int base = 10);
>
> What do we want?
>
> Input should not be required to be null terminated, so string_view seems like a suitable input type.
> Error detection should be simpler, but not everyone is a fan of exceptions.
>
> And IMO skipping spaces should not be part of the parse function.
> There's also the question of what to do when not the entire input can be parsed. Return an error or not.
>
>
> So, what about this one?
>
> optional<T> parse(string_view, std::size_t* pos = 0, int base = 10);
>
> An alternative could be:
>
> error_code parse(T&, string_view, std::size_t* pos = 0, int base = 10);
My suggestion:
const char * ret = parse(T& v, const char * first, const char * last, int base, error_code&);
[ret, last[ is the unparsed part of the string. If ret == first, v is not
overwritten and we have an error, otherwise v contains the parsed value.
(Feel free to switch the base vs. error_code& parameters to allow a defaulted base.)
This allows some symmetry with similar output operations:
char * p = output(char * first, char * last, T v);
which outputs "v" into the space provided by [first, last[,
returning the remaining space as [p, last[.
(This doesn't work with string_view, because it's read-only.)
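The std::to_chars and std::from_chars functions that later shipped in C++17's <charconv> have this [first, last) pointer-pair shape, so the symmetry can be tried out with them. The sketch below only uses them to illustrate the idea; it is not the parse()/output() signature suggested above, and round_trip() is a made-up helper name.

```cpp
#include <cassert>
#include <charconv>
#include <system_error>

// Writes v into a local buffer and parses it back from exactly the
// characters that were written.
inline bool round_trip(int v)
{
    char buf[16];  // enough for any 32-bit int in base 10, sign included
    auto out = std::to_chars(buf, buf + sizeof buf, v);
    if (out.ec != std::errc{})
        return false;                         // buffer too small
    // [buf, out.ptr) holds the digits; [out.ptr, buf + 16) is the remaining space.
    int back = 0;
    auto in = std::from_chars(buf, out.ptr, back);
    return in.ec == std::errc{} && in.ptr == out.ptr && back == v;
}
```

Both directions return a pointer into the caller's range, so the "remaining space" and the "unparsed part" are expressed the same way.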
This does not use arbitrary iterators: It doesn't make a lot of sense
to have "char" buffers that are not (at least) partially contiguous.
std::list<char>? No, thanks.
This does not use or consider locales: That's for iostreams to deal with.
Here's some (totally untested) code that shows a composition: Parse
a comma-separated list of "int"s into a std::vector<int>. It seems
having some space-skipping function and a "parse this expected
sequence of chars" function might be helpful.
#include <vector>
struct error_code;
const char * parse(int&, const char * first, const char * last, int base, error_code& ec);
const char * parse(std::vector<int>& res, const char * first, const char * last, int base, error_code& ec)
{
  res.clear();
  while (first != last) {
    int v;
    const char * p = parse(v, first, last, base, ec);
    if (p == first)   // error; ec is already set
      return p;
    res.push_back(v);
    if (p == last)
      break;
    if (*p != ',') {  // error
      ec = /* whatever */;
      return p;
    }
    first = p + 1;
  }
  return last;  // all input consumed
}
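A version of the same composition that actually compiles, with several assumptions swapped in: std::from_chars (C++17) plays the role of the int parser, std::errc replaces the undefined error_code, and the function is named parse_list() to keep it apart from the code above. Errors are reported through ec rather than through the returned pointer alone, because a failure after the first element no longer returns the original first.

```cpp
#include <cassert>
#include <charconv>
#include <system_error>
#include <vector>

// Parses a comma-separated list of ints from [first, last) into res.
// Returns the position where parsing stopped; ec is std::errc{} on success.
inline const char* parse_list(std::vector<int>& res, const char* first,
                              const char* last, int base, std::errc& ec)
{
    assert(base >= 2 && base <= 36);          // precondition of std::from_chars
    res.clear();
    ec = std::errc{};
    while (first != last) {
        int v = 0;
        auto r = std::from_chars(first, last, v, base);
        if (r.ec != std::errc{}) {            // not a number, or out of range
            ec = r.ec;
            return first;
        }
        res.push_back(v);
        if (r.ptr == last)
            break;
        if (*r.ptr != ',') {                  // unexpected separator
            ec = std::errc::invalid_argument;
            return r.ptr;
        }
        first = r.ptr + 1;
    }
    return last;
}
```

In this sketch an empty input yields an empty vector and a trailing comma is accepted; whether either should be an error is a policy choice left open here.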
Jens
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Wed, 20 May 2015 21:38:29 +0200
Raw View
2015-05-20 21:18 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
> On 05/18/2015 08:34 PM, Olaf van der Spek wrote:
> My suggestion:
>
> const char * ret = parse(T& v, const char * first, const char * last, int base, error_code&);
>
> If ret == first, v is not
> overwritten and we have an error, otherwise v contains the parsed value.
Your function fails these requirements..
--
Olaf
.
Author: Miro Knejp <miro.knejp@gmail.com>
Date: Wed, 20 May 2015 22:20:08 +0200
Raw View
This is a multi-part message in MIME format.
--------------000103050701050100060000
Content-Type: text/plain; charset=UTF-8; format=flowed
The Functional Template Library ( https://github.com/beark/ftl ) has an
example of monadic parser generators inspired by Haskell.
https://github.com/beark/ftl/blob/master/docs/Parsec-I.md
Now that library also has overloads of operator>>= and others, which are
not part of the "turn a string into an int" problem, but with all parsing
functions returning parser monads the composition is much easier to do.
It starts off by introducing the monad itself:
template<typename T>
using parser = ftl::eitherT<error,ftl::function<T(std::istream&)>>;
and a function to execute the actual parser
template<typename T>
ftl::either<error,T> run(parser<T> p, std::istream& is);
Each parsing function then returns a parser object.
parser<int> parseNatural();
This obviously serves more than a simple "turn a string into an int" but
is a prime example of composability that really shines with the
combining operators like >> or << etc. It makes things like this easy:
parser<std::vector<int>> parseLispList() {
    using namespace ftl;
    return parseChar('(')
        >> parseList()
        << parseChar(')');
}
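To show the idea without pulling in FTL, here is a hand-rolled sketch of the same composition. None of this is FTL's API: Parser, parse_char, parse_natural, parse_list, keep_right and keep_left are made-up names, the parser works on std::string_view instead of std::istream, failure is a plain std::nullopt with no error value, and named functions stand in for the >> and << operators.

```cpp
#include <cassert>
#include <charconv>
#include <functional>
#include <optional>
#include <string_view>
#include <system_error>
#include <utility>
#include <vector>

// A parser maps the remaining input to (value, new remainder), or fails.
template <typename T>
using Result = std::optional<std::pair<T, std::string_view>>;
template <typename T>
using Parser = std::function<Result<T>(std::string_view)>;

// Matches one specific character.
inline Parser<char> parse_char(char c)
{
    return [c](std::string_view in) -> Result<char> {
        if (in.empty() || in.front() != c) return std::nullopt;
        return std::make_pair(c, in.substr(1));
    };
}

// Matches a non-negative decimal integer.
inline Parser<int> parse_natural()
{
    return [](std::string_view in) -> Result<int> {
        if (in.empty() || in.front() == '-') return std::nullopt;
        int value = 0;
        auto res = std::from_chars(in.data(), in.data() + in.size(), value);
        if (res.ec != std::errc{}) return std::nullopt;
        return std::make_pair(value, in.substr(res.ptr - in.data()));
    };
}

// Zero or more naturals separated by single spaces.
inline Parser<std::vector<int>> parse_list()
{
    return [](std::string_view in) -> Result<std::vector<int>> {
        std::vector<int> out;
        auto item = parse_natural();
        while (auto r = item(in)) {
            out.push_back(r->first);
            in = r->second;
            if (in.empty() || in.front() != ' ') break;
            in.remove_prefix(1);
        }
        return std::make_pair(std::move(out), in);
    };
}

// Runs a, then b, and keeps b's value (the role >> plays in the example above).
template <typename A, typename B>
Parser<B> keep_right(Parser<A> a, Parser<B> b)
{
    return [a, b](std::string_view in) -> Result<B> {
        auto ra = a(in);
        if (!ra) return std::nullopt;
        return b(ra->second);
    };
}

// Runs a, then b, and keeps a's value (the role << plays in the example above).
template <typename A, typename B>
Parser<A> keep_left(Parser<A> a, Parser<B> b)
{
    return [a, b](std::string_view in) -> Result<A> {
        auto ra = a(in);
        if (!ra) return std::nullopt;
        auto rb = b(ra->second);
        if (!rb) return std::nullopt;
        return std::make_pair(std::move(ra->first), rb->second);
    };
}

// '(' list ')', grouped the same way as (a >> b) << c.
inline Parser<std::vector<int>> parse_lisp_list()
{
    return keep_left(keep_right(parse_char('('), parse_list()), parse_char(')'));
}
```

The std::function wrappers cost an indirect call per step, which is part of the performance question raised below.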
I thought I'd throw this in just as an example. We had these discussions
earlier without any consensus and the only new viewpoint brought in this
time is the functional approach mentioned by Vicente J. Whether this is
something the standard library can or should follow, whether it provides the
performance people need, and whether it can solve all the use cases people
come up with and make everyone happy, I don't know.
--------------000103050701050100060000
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
<html>
<head>
<meta content=3D"text/html; charset=3Dutf-8" http-equiv=3D"Content-Type=
">
</head>
<body bgcolor=3D"#FFFFFF" text=3D"#000000">
The Functional Template Library ( <a class=3D"moz-txt-link-freetext" hr=
ef=3D"https://github.com/beark/ftl">https://github.com/beark/ftl</a> ) has
an example of monadic parser generators inspired by Haskell.<br>
<a class=3D"moz-txt-link-freetext" href=3D"https://github.com/beark/ftl=
/blob/master/docs/Parsec-I.md">https://github.com/beark/ftl/blob/master/doc=
s/Parsec-I.md</a><br>
<br>
Now that library also has overloads of operator>>=3D and others,
which is not part of "turn a string into an int" problem, but with
all parsing functions returning parser monads the composition is
much easier to do.<br>
<br>
It starts of by introducing the monad itself:<br>
<pre>template<typename T>
using parser = ftl::eitherT<error,ftl::function<T(std::istream&)>>;</pre>
<br>
and a function to execute the actual parser<br>
<pre>template<typename T>
ftl::either<error,T> run(parser<T> p, std::istream& is);</pre>
<br>
Each parsing function then returns a parser object.<br>
<pre>parser<int> parseNatural();</pre>
<br>
This obviously serves more than a simple "turn a string into an int",
but it is a prime example of composability, which really shines with
combining operators such as >> and <<. It makes things
like this easy:<br>
<pre>parser<std::vector<int>> parseLispList() {
    using namespace ftl;
    return parseChar('(')
        >> parseList()
        << parseChar(')');
}</pre>
<br>
I thought I'd throw this in just as an example. We had these
discussions earlier without reaching any consensus, and the only new
viewpoint brought in this time is the functional approach mentioned by
Vicente J. Whether this is something the standard library can or should
follow, whether it provides the performance people need, or whether it can
solve all the use cases people come up with and make everyone happy, I
don't know.<br>
<br>
</body>
</html>
--------------000103050701050100060000--
.
Author: "Vicente J. Botet Escriba" <vicente.botet@wanadoo.fr>
Date: Thu, 21 May 2015 00:00:18 +0200
Raw View
This is a multi-part message in MIME format.
--------------080507040607030000090303
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: quoted-printable
On 20/05/15 20:49, Olaf van der Spek wrote:
> 2015-05-20 17:43 GMT+02:00 Matthew Woehlke <mw_triad@users.sourceforge.net>:
>> On 2015-05-20 11:27, Matthew Fioravante wrote:
>>> Generally, using 0 or any other perfectly valid value to signal failure is
>>> a really bad idea. It only makes sense when your use case is "Parse the
>>> value or give me some default if it fails". In that case, 0 may not be the
>>> default value you want, so it's better to be able to actually specify it.
>> Doesn't expected already handle this?
> Does expected exist?
Do you mean an implementation to play with? Yes
(https://github.com/ptal/expected)
> And does it have something like test_and_set()?
No. What parameters would test_and_set take, and what would its effect be?
> What would a parse date (Y-M-D) function look like?
>
> // returning expected or optional
> optional<Date> parse_date1(string_view is)
> {
>   int year;
>   int month;
>   int day;
>   return (true
>     && test_and_set(year, parse<decltype(year)>(is, &is))
>     && parse_separator(is, &is)
>     && test_and_set(month, parse<decltype(month)>(is, &is))
>     && parse_separator(is, &is)
>     && test_and_set(day, parse<decltype(day)>(is, &is)))
>     ? optional<Date>{ Date{ year, month, day } }
>     : optional<Date>{};
> }
>
> // returning error_code
> optional<Date> parse_date2(string_view is)
> {
>   int year;
>   int month;
>   int day;
>   return (true
>     && !parse(year, is, &is)
>     && !parse_separator(is, &is)
>     && !parse(month, is, &is)
>     && !parse_separator(is, &is)
>     && !parse(day, is, &is))
>     ? optional<Date>{ Date{ year, month, day } }
>     : optional<Date>{};
> }
>
> One of these functions looks cleaner to me. Did I do something wrong?
>
Using the tail as an out parameter, it would be something like
expected<Date, error_code> parse_date(string_view is, string_view& out)
{
return make_date % // fmap
( parse<int>(is, &is) // year
>> parse_separator(is, &is) ) *
( parse<int>(is, &is) // month
>> parse_separator(is, &is) ) *
parse<int>(is, &out) // day
;
}
or, using the functional form with an in-out parameter, it could be
expected<Date> parse_date(string_view& is)
{
return fmap(make_date,
mdo( parse<int>(is), // year
parse_separator(is) ),
mdo( parse<int>(is), // month
parse_separator(is) ),
parse<int>(is) // day
);
}
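
For comparison, here is a minimal sketch of the in-out form written with plain early returns instead of fmap/mdo. It assumes C++17 (std::optional, std::string_view, std::from_chars), none of which existed when this thread was written. Date, parse_int and parse_separator are hypothetical names invented for the illustration, and '-' is assumed as the separator.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>

struct Date { int year, month, day; };  // hypothetical value type

// Hypothetical helper: parse a base-10 int from the front of `in`.
// On success the parsed characters are removed from `in`.
std::optional<int> parse_int(std::string_view& in) {
    if (in.empty()) return std::nullopt;
    int value = 0;
    const char* first = in.data();
    const char* last = first + in.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{}) return std::nullopt;
    in.remove_prefix(static_cast<std::size_t>(ptr - first));
    return value;
}

// Hypothetical helper: consume exactly one separator character.
bool parse_separator(std::string_view& in, char sep = '-') {
    if (in.empty() || in.front() != sep) return false;
    in.remove_prefix(1);
    return true;
}

// Parses Y-M-D. `in` is advanced only if the whole date was parsed.
std::optional<Date> parse_date(std::string_view& in) {
    std::string_view rest = in;  // work on a copy so failure leaves `in` alone
    auto year = parse_int(rest);
    if (!year || !parse_separator(rest)) return std::nullopt;
    auto month = parse_int(rest);
    if (!month || !parse_separator(rest)) return std::nullopt;
    auto day = parse_int(rest);
    if (!day) return std::nullopt;
    in = rest;
    return Date{*year, *month, *day};
}
```

Working on a copy of the view is one possible design choice: a failed parse then has no visible effect on the caller's state, which is what makes retrying with a different parser cheap.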
There are other possibilities (e.g. using Parser), as Miro pointed out.
Vicente
--------------080507040607030000090303--
.
Author: "Vicente J. Botet Escriba" <vicente.botet@wanadoo.fr>
Date: Thu, 21 May 2015 00:07:30 +0200
Raw View
On 20/05/15 20:16, Olaf van der Spek wrote:
> 2015-05-20 16:35 GMT+02:00 Vicente J. Botet Escriba <vicente.botet@wanadoo.fr>:
>> Maybe instead of using string_view the function should work on any model of
>> a given ParserState Concept. What operations does a parser need from this
>> ParserState?
> Maybe, what's a ParserState?
As said just above, it is the concept of a parser state. string_view could
be a model of this concept.
>
>>> Error detection should be simpler, but not everyone is a fan of
>>> exceptions.
>> We can ask ourselves which interface we would have if exceptions were
>> acceptable.
> We could but what's the point?
Interfaces that use exceptions compose quite well. The interface that
doesn't use exceptions should be based on it. We just need to decide how to
report the error_code: as the result, as an out parameter, or via TLS.
>
>> Parsing is not the same as matching. When you parse, you want to parse
>> several things, so you need a new state of the ParserState on which to
>> apply the parser function again. When you want to match, the whole input
>> must be consumed.
>> I suggest using a different function for this use case.
> What should such a function be named?
I would not be against parse_exact.
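
To make the parse/match distinction concrete, here is a minimal sketch assuming C++17 and std::from_chars. The name parse_prefix and the pair-based return type are assumptions made for the illustration. parse_exact is the name suggested above, but its signature here is only a guess.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>
#include <utility>

// "Parse": read an int from the front and hand back the unparsed tail,
// so that another parser can continue from there.
std::optional<std::pair<int, std::string_view>> parse_prefix(std::string_view in) {
    if (in.empty()) return std::nullopt;
    int value = 0;
    const char* first = in.data();
    const char* last = first + in.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{}) return std::nullopt;
    in.remove_prefix(static_cast<std::size_t>(ptr - first));
    return std::make_pair(value, in);
}

// "Match": succeed only if the whole input was consumed.
std::optional<int> parse_exact(std::string_view in) {
    auto r = parse_prefix(in);
    if (!r || !r->second.empty()) return std::nullopt;
    return r->first;
}
```

With this split, parse_prefix("42abc") yields 42 with the tail "abc", while parse_exact("42abc") reports failure.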
>
>> Sorry, but what is the pos parameter used for?
> http://en.cppreference.com/w/cpp/string/basic_string/stol
>
>
Oh, I missed this link. Like Matthew, I don't see the point of using it if
a ParserState or a string_view is returned as output.
Vicente
.
Author: "Vicente J. Botet Escriba" <vicente.botet@wanadoo.fr>
Date: Thu, 21 May 2015 00:28:05 +0200
Raw View
This is a multi-part message in MIME format.
--------------050200000302080301080001
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: quoted-printable
On 20/05/15 20:28, Matthew Fioravante wrote:
>
>
> On Wednesday, May 20, 2015 at 12:29:51 PM UTC-4, Vicente J. Botet
> Escriba wrote:
>
> Not yet there, but I would find it quite readable to assign multiple
> values and even declare them in situ as well
>
> {auto a, str} = parse<int>(str);
> {auto c, ignore} = parse<int>(str);
>
> It's quite a different proposal, but I think a syntax like this is
> badly needed. We have tie(), which is only 3 characters, but it's
> actually 8 characters because in pretty much all of my code I'd be
> saying std::tie(). Also, the tuple/tie solution is not as well known as
> a good way to return multiple values. This kind of thing is good
> enough to be a core language feature IMO.
>
> Maybe not braces, since current uses of braces always introduce a scope
> and that auto a lives in the outer scope. Maybe [] can be reused?
>
> [auto a, str] = parse<int>(str);
>
>
Forget it, it is out of scope.
>
> > is that this function can be composed using fold-like functions
> that consume from the parser state and fold, e.g., over a list.
>
> It would probably be good to have a specific example or two in the paper
> showing the strengths of the functional programming approach.
The FTL library is a good example of what can be done. I have a lot to
learn from it.
>
> > Yes, the rvalue tail parameter doesn't seem a good idea.
>
> The tail can also be included in the returned object, and maybe that is
> the superior approach. If for whatever reason an out parameter is deemed
> the best solution, I tried to outline some concerns as to whether it
> should be passed by lvalue or rvalue.
Understood.
>
> On Wednesday, May 20, 2015 at 12:30:12 PM UTC-4, Vicente J. Botet
> Escriba wrote:
>
> On 20/05/15 17:27, Matthew Fioravante wrote:
> >
> >
> > If you're planning to actually write this paper and want to
> > collaborate, I'd be happy to help with this.
> >
> > What are all of the use cases for an API like this? Here is what I can
> > think of:
> > - Parse a number and throw an exception if an error occurs
> Do we really want this?
>
>
> I think so, at least as a wrapper if nothing else. The use case of
> "I'm just going to try to parse a bunch of stuff and bail out if any
> one fails" is not uncommon.
The idiom in this case would be to get the value
auto r = parse(...).value(); // throws if there is no value :)
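
A minimal sketch of that idiom, with C++17 std::optional standing in for the proposed expected. parse_int and parse_int_or_throw are hypothetical names, and requiring the whole input to be consumed is a choice made for this sketch. std::optional::value() throws std::bad_optional_access when empty, so the throwing wrapper costs one line:

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Non-throwing basic interface: an empty optional signals failure.
std::optional<int> parse_int(std::string_view in) {
    if (in.empty()) return std::nullopt;
    int value = 0;
    const char* first = in.data();
    const char* last = first + in.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    return value;
}

// Throwing wrapper built on top of it.
int parse_int_or_throw(std::string_view in) {
    return parse_int(in).value();  // throws std::bad_optional_access if empty
}
```

Callers that want "bail out if anything fails" use the wrapper, and everyone else tests the optional.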
>
> > - Assign to a pre-existing variable using parse(), preferably with
> > type deduction.
> You are requesting that the following should be valid
>
> i = parse<int>(p);
>
> ?
> I don't think I understand what you want here.
>
>
> Even more nice to have. I mean like this:
>
> template<typename T>
> error_code parse(T& val, string_view s);
>
> double x = 1.0;
> if (parse(x, s)) { // a set error_code converts to true
>   // handle error
> }
>
> Here we did not need to specify the type we want to parse because it's
> deduced from the out parameter.
>
> If I later change x to a float or int or whatever, the parsing code
> will update itself to be correct, vs this scenario:
>
> // double x; // old code
> int x; // new code
> x = parse<double>(s); // oops!
>
> x = parse<decltype(x)>(s); // Workaround, but ugly
>
I won't buy it, but I have no major problem with it, as this can be built
on top of the basic interface. It is just a different syntax.
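
As a rough illustration of "built on top of the basic interface", here is a sketch assuming C++17. parse<T> returning std::optional<T> plays the role of the basic interface, and parse_into is a hypothetical name for the deducing out-parameter form. Only integer types are covered, because floating-point std::from_chars support varies between standard libraries.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>
#include <type_traits>

// Basic interface: the caller names the type explicitly.
template <typename T>
std::optional<T> parse(std::string_view s) {
    static_assert(std::is_integral_v<T>, "sketch covers integer types only");
    if (s.empty()) return std::nullopt;
    T value{};
    const char* first = s.data();
    const char* last = first + s.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    return value;
}

// Out-parameter form: T is deduced from `val`, which is left untouched
// on failure. A default-constructed error_code means success.
template <typename T>
std::error_code parse_into(T& val, std::string_view s) {
    if (auto r = parse<T>(s)) {
        val = *r;
        return {};
    }
    return std::make_error_code(std::errc::invalid_argument);
}
```

Changing the declared type of the variable then changes which parse<T> gets instantiated, with no edit at the call site.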
Vicente
--------------050200000302080301080001--
.
Author: "Vicente J. Botet Escriba" <vicente.botet@wanadoo.fr>
Date: Thu, 21 May 2015 00:40:59 +0200
Raw View
This is a multi-part message in MIME format.
--------------040904070109040701080307
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: quoted-printable
On 20/05/15 21:18, Jens Maurer wrote:
> On 05/18/2015 08:34 PM, Olaf van der Spek wrote:
>>
>> optional<T> parse(string_view, std::size_t* pos = 0, int base = 10);
>>
>> An alternative could be:
>>
>> error_code parse(T&, string_view, std::size_t* pos = 0, int base = 10);
>
> My suggestion:
>
> const char * ret = parse(T& v, const char * first, const char * last, int base, error_code&);
>
> [ret, last[ is the unparsed part of the string. If ret == first, v is not
> overwritten and we have an error, otherwise v contains the parsed value.
> (Feel free to switch the base vs. error_code& parameters to allow a defaulted base.)
>
>
> This allows some symmetry with similar output operations:
>
> char * p =3D output(char * first, char * last, T v);
>
> which outputs "v" into the space provided by [first, last[,
> returning the remaining space as [p, last[.
>
> (This doesn't work with string_view, because it's read-only.)
>
>
> This does not use arbitrary iterators: It doesn't make a lot of sense
> to have "char" buffers that are not (at least) partially contiguous.
> std::list<char>? No, thanks.
>
> This does not use or consider locales: That's for iostreams to deal with.
>
>
> Here's some (totally untested) code that shows a composition: Parse
> a comma-separated list of "int"s into a std::vector<int>. It seems
> having some space-skipping function and a "parse this expected
> sequence of chars" function might be helpful.
>
>
> #include <vector>
>
> struct error_code;
>
> const char * parse(int&, const char * first, const char * last, int base, error_code& ec);
>
> const char * parse(std::vector<int>& res, const char * first, const char * last, int base, error_code& ec)
> {
>   res.clear();
>   while (first != last) {
>     int v;
>     const char * p = parse(v, first, last, base, ec);
>     if (p == first)     // error; ec is already set
>       return p;
>     res.push_back(v);
>     if (p == last)
>       break;
>     if (*p != ',') {    // error
>       ec = /* whatever */;
>       return p;
>     }
>     first = p + 1;
>   }
>   return last;          // whole input consumed
> }
>
>
>
An alternative solution that avoids raw loops, using FTL [1,2], could be
something like

    parser<std::vector<int>> parseVectorInt() {
        return curry(cons)
            % parse<int>()
            * option(whitespace() >> lazy(parseList), std::vector<int>());
    }

where

    parser<std::string> whitespace() {
        return many1(oneOf(" \t\r\n"));
    }

or using functional notation

    parser<std::vector<int>> parseVectorInt() {
        return fmap(curry(cons),
                    parse<int>(),
                    option(mdo(whitespace(), lazy(parseList)), std::vector<int>()));
    }

Vicente

[1] https://github.com/beark/ftl/blob/master/docs/Parsec-I.md
[2] https://github.com/beark/ftl/blob/master/docs/Parsec-II.md
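
For comparison, the pointer-pair interface quoted above can be made runnable with little effort. The following is only a sketch under assumptions: it uses std::error_code for the forward-declared error_code, picks std::errc::invalid_argument for the "whatever" error on a missing comma, and implements the single-value overload on top of C++17's std::from_chars, which none of the messages in this thread prescribe.

```cpp
#include <charconv>
#include <system_error>
#include <vector>

// Single value. Returns the first unparsed character, or `first` on error.
// Assumption: std::from_chars does the actual digit conversion.
const char* parse(int& v, const char* first, const char* last, int base,
                  std::error_code& ec)
{
    int tmp = 0;
    const auto r = std::from_chars(first, last, tmp, base);
    if (r.ec != std::errc()) {
        ec = std::make_error_code(r.ec);
        return first;              // v is not overwritten
    }
    ec.clear();
    v = tmp;
    return r.ptr;
}

// Comma-separated list of ints, composed from the single-value overload.
// Returns `last` on success, or the position where parsing stopped.
const char* parse(std::vector<int>& res, const char* first, const char* last,
                  int base, std::error_code& ec)
{
    res.clear();
    while (first != last) {
        int v = 0;
        const char* p = parse(v, first, last, base, ec);
        if (p == first)            // error; ec is already set
            return p;
        res.push_back(v);
        if (p == last)
            return p;
        if (*p != ',') {           // assumed error code for a bad separator
            ec = std::make_error_code(std::errc::invalid_argument);
            return p;
        }
        first = p + 1;
    }
    return first;
}
```

Parsing "10,20,30" this way yields three elements and returns the end of the input; parsing "1;2" stops at the ';' with ec set. Like the quoted code, the sketch accepts a trailing comma and does not skip spaces.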
--------------040904070109040701080307--
.
Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Wed, 20 May 2015 18:54:57 -0400
Raw View
On 2015-05-20 14:28, Matthew Fioravante wrote:
> On Wednesday, May 20, 2015 at 12:29:51 PM UTC-4, Vicente J. Botet Escriba
> wrote:
>
>> Not yet there, but I would find it quite readable to assign multiple values
>> and even declare them in situ as well
>>
>> {auto a, str} = parse<int>(str);
>>
>> {auto c, ignore} = parse<int>(str);
>
> It's quite a different proposal, but I think a syntax like this is badly
> needed.

Agreed. Unfortunately it came up before and IIRC did not get very far.

Nit: I think it would be good to allow a declaration 'void' (that is, no
name) to ignore a value. Example:

  [auto result, void] = parse<int>(str);

So, each item in the [] list is either an assignable declaration,
assignable expression, or 'void'. (I could continue, but it's off topic
for this thread.)

> Maybe not braces since current use of braces always introduces a scope and
> that auto a lives in the outer scope. Maybe [] can be reused?
>
> [ auto a, str ] = parse<int>(str);

Works for me, FWIW.
--
Matthew
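
For reference, C++17 structured bindings ended up covering the declare-everything case of the syntax discussed here, though not the mix of a new declaration with an existing variable, nor a 'void' placeholder. A minimal sketch, assuming a hypothetical parse<T> that returns the value together with the unparsed tail (error reporting is deliberately left out, and std::from_chars is an assumed backend):

```cpp
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>
#include <utility>

// Hypothetical helper: parse a leading number and return {value, tail}.
// On failure the value is T{} and the tail is the whole input.
template <typename T>
std::pair<T, std::string_view> parse(std::string_view s)
{
    T value{};
    const auto r = std::from_chars(s.data(), s.data() + s.size(), value);
    if (r.ec != std::errc())
        return {T{}, s};
    s.remove_prefix(static_cast<std::size_t>(r.ptr - s.data()));
    return {value, s};
}
```

With this, `auto [a, rest] = parse<int>(str);` declares both names at once; for the input "12,34" it gives a == 12 and rest == ",34".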
.
Author: "'Jeffrey Yasskin' via ISO C++ Standard - Future Proposals" <std-proposals@isocpp.org>
Date: Wed, 20 May 2015 15:59:39 -0700
Raw View
There's a significant risk here that if the proposal is too
complicated, nothing will get accepted.
Jens' suggestion has an advantage that it's clearly sufficient and in
line with the rest of the library, even if the interface might not be
as convenient as some other options. Even the interface, though, isn't
too bad when you look at the code using it.
On Wed, May 20, 2015 at 1:20 PM, Miro Knejp <miro.knejp@gmail.com> wrote:
> The Functional Template Library ( https://github.com/beark/ftl ) has an
> example of monadic parser generators inspired by Haskell.
> https://github.com/beark/ftl/blob/master/docs/Parsec-I.md
>
> Now that library also has overloads of operator>>= and others, which is not
> part of "turn a string into an int" problem, but with all parsing functions
> returning parser monads the composition is much easier to do.
>
> It starts off by introducing the monad itself:
>
> template<typename T>
> using parser = ftl::eitherT<error,ftl::function<T(std::istream&)>>;
>
>
> and a function to execute the actual parser
>
> template<typename T>
> ftl::either<error,T> run(parser<T> p, std::istream& is);
>
>
> Each parsing function then returns a parser object.
>
> parser<int> parseNatural();
>
>
> This obviously serves more than a simple "turn a string into an int" but is
> a prime example of composability that really shines with the combining
> operators like >> or << etc. It makes things like this easy
>
> parser<std::vector<int>> parseLispList() {
> using namespace ftl;
> return parseChar('(')
> >> parseList()
> << parseChar(')');
> }
>
>
> I thought I'd throw this in just as an example. We had these discussions
> earlier without any consensus and the only new viewpoint brought in this
> time is the functional approach mentioned by Vicente J. Whether this is
> something the standard library can/should follow, or provides the
> performance people need, or it can solve all the use cases people can come
> up with and make everyone happy I don't know.
>
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Wed, 20 May 2015 16:04:00 -0700 (PDT)
Raw View
------=_Part_4341_1967694562.1432163040682
Content-Type: multipart/alternative;
boundary="----=_Part_4342_916581008.1432163040691"
------=_Part_4342_916581008.1432163040691
Content-Type: text/plain; charset=UTF-8
On Wednesday, May 20, 2015 at 6:55:11 PM UTC-4, Matthew Woehlke wrote:
>
> On 2015-05-20 14:28, Matthew Fioravante wrote:
> > On Wednesday, May 20, 2015 at 12:29:51 PM UTC-4, Vicente J. Botet Escriba
> > wrote:
> >
> >> Not yet there, but I would find it quite readable to assign multiple values
> >> and even declare them in situ as well
> >>
> >> {auto a, str} = parse<int>(str);
> >>
> >> {auto c, ignore} = parse<int>(str);
> >
> > It's quite a different proposal, but I think a syntax like this is badly
> > needed.
>
> Agreed. Unfortunately it came up before and IIRC did not get very far.
>
> Nit: I think it would be good to allow a declaration 'void' (that is, no
> name) to ignore a value. Example:
>
>   [auto result, void] = parse<int>(str);
>

+1, much better than std::ignore.

But we are off topic! :)

On Wednesday, May 20, 2015 at 7:00:03 PM UTC-4, Jeffrey Yasskin wrote:
>
> There's a significant risk here that if the proposal is too
> complicated, nothing will get accepted.
>

That's my biggest fear as well, and also the reason why the last 2 threads
about this died with no action. Even if the standard version is pretty
simple, it's very easy to write your favorite flavor of wrapper on top of it.
------=_Part_4342_916581008.1432163040691--
------=_Part_4341_1967694562.1432163040682--
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Wed, 20 May 2015 20:18:42 -0700 (PDT)
Raw View
------=_Part_4836_1788312111.1432178322075
Content-Type: multipart/alternative;
boundary="----=_Part_4837_1268253146.1432178322075"
------=_Part_4837_1268253146.1432178322075
Content-Type: text/plain; charset=UTF-8
On Wednesday, May 20, 2015 at 3:18:24 PM UTC-4, Jens Maurer wrote:
>
> My suggestion:
>
> const char * ret = parse(T& v, const char * first, const char * last, int
> base, error_code&);
>

What is the value added by using char* pairs and a char* return vs.
string_view? Symmetry with iterator algorithms?

  string_view parse(T& v, string_view s, int base, error_code&);

  // If you have char* pointers
  char* cb;
  char* ce;
  error_code ec;
  int value;
  auto tail = parse(value, {cb,ce}, 10, ec);

>
> [ret, last[ is the unparsed part of the string.  If ret == first, v is not
> overwritten and we have an error, otherwise v contains the parsed value.
> (Feel free to switch the base vs. error_code& parameters to allow a
> defaulted base.)
>

You could also add an overload:

  template <typename T>
  inline const char* parse(T& v, const char* b, const char* e, error_code& ec)
  { return parse(v, b, e, 10, ec); }

>
> This allows some symmetry with similar output operations:
>
> char * p = output(char * first, char * last, T v);
> which outputs "v" into the space provided by [first, last[,
> returning the remaining space as [p, last[.
>
> (This doesn't work with string_view, because it's read-only.)
>

This kind of API looks like it could be another use case for mstring_view.
array_view<char> would also work.
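
To make the comparison concrete, here is a rough sketch of both directions: the string_view flavour of parse returning the unparsed tail, and the output counterpart writing into a mutable [first, last) range. The assumptions are mine: std::error_code stands in for error_code, and std::from_chars / std::to_chars (C++17) do the conversions. Neither mstring_view nor array_view is used, since a plain pointer pair is enough for the output side.

```cpp
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Returns the unparsed tail. On error, sets ec and returns s unchanged.
template <typename T>
std::string_view parse(T& v, std::string_view s, int base, std::error_code& ec)
{
    const char* first = s.data();
    T tmp{};
    const auto r = std::from_chars(first, first + s.size(), tmp, base);
    if (r.ec != std::errc()) {
        ec = std::make_error_code(r.ec);
        return s;                  // v is not overwritten
    }
    ec.clear();
    v = tmp;
    return s.substr(static_cast<std::size_t>(r.ptr - first));
}

// Writes v into [first, last) and returns the start of the remaining space.
// Returns `first` if the value does not fit.
template <typename T>
char* output(char* first, char* last, T v)
{
    const auto r = std::to_chars(first, last, v);
    return r.ec == std::errc() ? r.ptr : first;
}
```

Both forms carry the same information: a returned tail equal to the input plays the role of ret == first in the pointer version.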
------=_Part_4837_1268253146.1432178322075--
------=_Part_4836_1788312111.1432178322075--
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Thu, 21 May 2015 07:10:05 +0200
Raw View
2015-05-21 0:00 GMT+02:00 Vicente J. Botet Escriba <vicente.botet@wanadoo.fr>:
> On 20/05/15 20:49, Olaf van der Spek wrote:
>
> 2015-05-20 17:43 GMT+02:00 Matthew Woehlke <mw_triad@users.sourceforge.net>:
>
> On 2015-05-20 11:27, Matthew Fioravante wrote:
>
> Generally, using 0 or any other perfectly valid value to signal failure is
> a really bad idea. It only makes sense when your use case is "Parse the
> value or give me some default if it fails". In that case, 0 may not be the
> default value you want so it's better to be able to actually specify it.
>
> Doesn't expected already handle this?
>
> Does expected exist?
>
> Do you mean an implementation to play with? Yes
> (https://github.com/ptal/expected)

No, I mean accepted by the committee and in a TS.

> And does it have something like test_and_set()?
>
> No. What parameters would test_and_set have, and what would its effect be?

bool test_and_set(T& out, optional<T> opt)
{
  if (!opt)
    return false;
  out = *opt;
  return true;
}
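
Spelled out as a compilable sketch, with assumptions labeled: std::optional (C++17) stands in for optional, and parse_int is a hypothetical optional-returning parser built on std::from_chars that only succeeds if the whole input is consumed.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical parser: nullopt unless all of s is a valid decimal int.
std::optional<int> parse_int(std::string_view s)
{
    int v = 0;
    const char* last = s.data() + s.size();
    const auto r = std::from_chars(s.data(), last, v);
    if (r.ec != std::errc() || r.ptr != last)
        return std::nullopt;
    return v;
}

// Overwrites out only when opt holds a value; reports whether it did.
template <typename T>
bool test_and_set(T& out, const std::optional<T>& opt)
{
    if (!opt)
        return false;
    out = *opt;
    return true;
}
```

This gives the "parse the value or keep my default" use case: initialize a variable to the default, call test_and_set(var, parse_int(text)), and the default survives a failed parse.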
> Using a tail as an out parameter, it should be something like
>
> expected<Date, error_code> parse_date(string_view is, string_view& out)
> {
>   return make_date % // fmap
>     ( parse<int>(is, &is) // year
>       >> parse_separator(is, &is) ) *
>     ( parse<int>(is, &is) // month
>       >> parse_separator(is, &is) ) *
>     parse<int>(is, &out) // day
>   ;
> }
>
> or, using the functional form with an in-out parameter, it could be
>
> expected<Date> parse_date(string_view& is)
> {
>   return fmap(make_date,
>     mdo( parse<int>(is), // year
>          parse_separator(is) ),
>     mdo( parse<int>(is), // month
>          parse_separator(is) ),
>     parse<int>(is) // day
>   );
> }
>
> There are other possibilities (e.g. using Parser) as Miro pointed out.

Looks interesting, but IMO that's higher-level than the basic level
we're aiming for here.
This can easily be built on top of the basic level, can't it?

> Interfaces using exceptions compose quite well. The interface that doesn't
> use exceptions should be based on it. We need to see just how to report the
> error_code: as a result, as an out parameter, or as a TLS.

IMO it should be the other way around. It's cleaner to build a
throwing interface on top of a non-throwing interface.

> Oh, I missed this link. Like Matthew, I don't see the point of using it if
> there is an output of a ParserState or a string_view.

Right

-- 
Olaf
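
A sketch of that layering, under assumptions: try_parse and parse_or_throw are hypothetical names, std::error_code carries the error, std::from_chars is the assumed backend, and leftover input is treated as an error (one of the open questions in this thread).

```cpp
#include <charconv>
#include <string_view>
#include <system_error>

// Basic level: never throws. Requires the whole input to be consumed.
template <typename T>
std::error_code try_parse(T& out, std::string_view s, int base = 10)
{
    const char* last = s.data() + s.size();
    T tmp{};
    const auto r = std::from_chars(s.data(), last, tmp, base);
    if (r.ec != std::errc())
        return std::make_error_code(r.ec);
    if (r.ptr != last)             // assumed policy: trailing input is an error
        return std::make_error_code(std::errc::invalid_argument);
    out = tmp;
    return {};
}

// Convenience level: a thin throwing wrapper over the basic level.
template <typename T>
T parse_or_throw(std::string_view s, int base = 10)
{
    T value{};
    if (const std::error_code ec = try_parse(value, s, base))
        throw std::system_error(ec, "parse_or_throw");
    return value;
}
```

The wrapper is a few lines and costs nothing on the success path, whereas building the non-throwing form on top of a throwing one would pay for a throw and catch on every failure.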
.
Author: Magnus Fromreide <magfr@lysator.liu.se>
Date: Thu, 21 May 2015 09:51:17 +0200
Raw View
On Mon, May 18, 2015 at 11:34:07AM -0700, Olaf van der Spek wrote:
> Let's get the party started.
>
> What have we got?
>
> We've got functions like strtol and stoi which take a const char* or
> std::string and return a number.
>
> long strtol(const char*, char **str_end, int base);
> int stoi(const std::string&, std::size_t* pos = 0, int base = 10);
>
> What do we want?
>
> Input should not be required to be null terminated, so string_view seems
> like a suitable input type.
> Error detection should be simpler, but not everyone is a fan of exceptions.
>
> And IMO skipping spaces should not be part of the parse function.
> There's also the question of what to do when not the entire input can be
> parsed. Return an error or not.
I have one more thing to throw into the air.
If the type asked for is unsigned then I think parsing of negative values
should result in a parse error.
"-1" should not be parseable when converting to unsigned T; the fail position
is before the -.
/MF
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Thu, 21 May 2015 20:31:55 +0200
Raw View
Could we have some more real-world use cases please?
.
Author: Magnus Fromreide <magfr@lysator.liu.se>
Date: Thu, 21 May 2015 21:53:46 +0200
Raw View
On Thu, May 21, 2015 at 08:31:55PM +0200, Olaf van der Spek wrote:
> Could we have some more real-world use cases please?
I do not know if this is real-world enough for you?
unsigned int number_of_things = parse<unsigned int>("-1").value();
Why should parse<unsigned int> accept signed values?
What one has to do in C to parse an unsigned value is to use strtol and then
check that the result is greater than or equal to zero. Couldn't C++ do better
here?
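For illustration, a sketch of a parse<T> with the behaviour asked for here. It is not an implementation anyone in the thread posted: it assumes C++17 and leans on std::from_chars, which accepts a leading '-' only for signed types and skips no whitespace. Requiring the whole input to be consumed is a further assumption.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: parse the whole of `s` as a T, or return an
// empty optional.
template <class T>
std::optional<T> parse(std::string_view s, int base = 10)
{
    T value{};
    const char* first = s.data();
    const char* last = s.data() + s.size();
    // For unsigned T, std::from_chars treats "-1" as a non-match and
    // reports invalid_argument with ptr == first.
    auto [ptr, ec] = std::from_chars(first, last, value, base);
    if (ec != std::errc{} || ptr != last)
        return std::nullopt;
    return value;
}
```

With this sketch, parse<unsigned int>("-1") yields an empty optional, so calling .value() on it throws std::bad_optional_access instead of silently producing a huge number.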
/MF
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Thu, 21 May 2015 22:27:45 +0200
Raw View
2015-05-21 21:53 GMT+02:00 Magnus Fromreide <magfr@lysator.liu.se>:
> On Thu, May 21, 2015 at 08:31:55PM +0200, Olaf van der Spek wrote:
>> Could we have some more real-world use cases please?
>
> I do not know if this is real-world enough for you?
>
> unsigned int number_of_things = parse<unsigned int>("-1").value();
>
> Why should parse<unsigned int> accept signed values?
>
> What one have to do in C to parse an unsigned value is to use strtol and then
> check that the result is greater or equal than zero.
Really? http://www.cplusplus.com/reference/cstdlib/strtoul/
> Couldn't C++ do better
> here?
IMO your example should throw an out-of-range exception.
--
Olaf
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Thu, 21 May 2015 22:28:38 +0200
Raw View
2015-05-21 21:53 GMT+02:00 Magnus Fromreide <magfr@lysator.liu.se>:
> On Thu, May 21, 2015 at 08:31:55PM +0200, Olaf van der Spek wrote:
>> Could we have some more real-world use cases please?
>
> I do not know if this is real-world enough for you?
I don't know, is that what you're currently using? If so, could you
share your parse implementation?
--
Olaf
.
Author: Thiago Macieira <thiago@macieira.org>
Date: Thu, 21 May 2015 14:08:19 -0700
Raw View
On Thursday 21 May 2015 22:27:45 Olaf van der Spek wrote:
> > What one have to do in C to parse an unsigned value is to use strtol and
> > then check that the result is greater or equal than zero.
>
> Really? http://www.cplusplus.com/reference/cstdlib/strtoul/
Really. strtoul will parse negative numbers too.
#include <cstdlib>
#include <iostream>
int main()
{
    std::cout << std::strtoul("-1", 0, 0) << std::endl;
}
18446744073709551615
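A sketch of the extra checking a caller needs today to get a strict unsigned parse out of strtoul. The function name parse_ulong_strict is hypothetical, and rejecting leading whitespace and '+' as well as '-' is an assumption about what "strict" should mean.

```cpp
#include <cctype>
#include <cerrno>
#include <cstdlib>

// Hypothetical wrapper: returns true and stores the value only if `s`
// is entirely a non-negative number that fits in unsigned long.
bool parse_ulong_strict(const char* s, unsigned long& out, int base = 10)
{
    // strtoul itself would skip whitespace and accept '+' or '-', so
    // insist that the input starts with a digit or (for bases above 10)
    // a letter.
    if (!std::isalnum(static_cast<unsigned char>(*s)))
        return false;
    errno = 0;
    char* end = nullptr;
    const unsigned long v = std::strtoul(s, &end, base);
    if (end == s || *end != '\0' || errno == ERANGE)
        return false;
    out = v;
    return true;
}
```

The input still has to be null-terminated, which is one of the limitations the original post wanted to remove.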
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358
.
Author: Magnus Fromreide <magfr@lysator.liu.se>
Date: Fri, 22 May 2015 07:25:43 +0200
Raw View
On Thu, May 21, 2015 at 10:27:45PM +0200, Olaf van der Spek wrote:
> 2015-05-21 21:53 GMT+02:00 Magnus Fromreide <magfr@lysator.liu.se>:
> > On Thu, May 21, 2015 at 08:31:55PM +0200, Olaf van der Spek wrote:
> >> Could we have some more real-world use cases please?
> >
> > I do not know if this is real-world enough for you?
> >
> > unsigned int number_of_things = parse<unsigned int>("-1").value();
> >
> > Why should parse<unsigned int> accept signed values?
> >
> > What one have to do in C to parse an unsigned value is to use strtol and then
> > check that the result is greater or equal than zero.
>
> Really? http://www.cplusplus.com/reference/cstdlib/strtoul/
Sadly. http://pubs.opengroup.org/onlinepubs/9699919799/functions/strtoul.html
> > Couldn't C++ do better here?
>
> IMO your example should throw an out-of-range exception.
I kind of agree - I am not sure about the exact exception, one could imagine
invalid-argument like if the input string had been ";", but I do think that
the tail should be "-1".
/MF
.
Author: Jens Maurer <Jens.Maurer@gmx.net>
Date: Fri, 22 May 2015 08:24:14 +0200
Raw View
On 05/20/2015 09:38 PM, Olaf van der Spek wrote:
> 2015-05-20 21:18 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>> On 05/18/2015 08:34 PM, Olaf van der Spek wrote:
>> My suggestion:
>>
>> const char * ret = parse(T& v, const char * first, const char * last, int base, error_code&);
>>
>> If ret == first, v is not
>> overwritten and we have an error, otherwise v contains the parsed value.
>
> Your function fails these requirements..
Right, that specification was intended for the "elementary" parse
operations only. The specification needs to be weaker for the
compound parses. Something like "if ret == first, the value of
"v" is unspecified, otherwise v is the result of the (partial) parse,
with error_code containing the error encountered at the end (if any)."
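To make the "elementary" contract concrete, here is a hand-written sketch for one type. It is only an illustration under stated assumptions: the base parameter from the suggested signature is dropped (decimal only), T is fixed to unsigned int, and the choice of std::errc values is mine, not part of the suggestion.

```cpp
#include <limits>
#include <system_error>

// Returns the position after the last digit consumed. On error, returns
// `first`, sets `ec`, and leaves `v` untouched.
const char* parse(unsigned int& v, const char* first, const char* last,
                  std::error_code& ec)
{
    const unsigned int max = std::numeric_limits<unsigned int>::max();
    unsigned int result = 0;
    const char* p = first;
    for (; p != last && *p >= '0' && *p <= '9'; ++p) {
        const unsigned int digit = static_cast<unsigned int>(*p - '0');
        // result * 10 + digit would exceed max
        if (result > (max - digit) / 10) {
            ec = std::make_error_code(std::errc::result_out_of_range);
            return first;
        }
        result = result * 10 + digit;
    }
    if (p == first) {               // no digits, e.g. "-1" or ";"
        ec = std::make_error_code(std::errc::invalid_argument);
        return first;
    }
    ec.clear();
    v = result;
    return p;
}
```

Because the function takes a pointer pair, the input does not need to be null-terminated, and the returned pointer is the tail from which a caller can continue parsing.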
Note that functions like that are NOT intended for building
large parsers without additional helper layers in between,
although showing some code that performs compounding is
(in my opinion) helpful to determine whether the interface
is somewhat reasonable.
My focus here is to make the low-level functions available to
users. I'm not against providing some high-level functions, too.
(The C++ standard currently does not offer low-level functions;
strtoul is locale-dependent, which slows it down substantially.
See also N4412.)
Jens
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Fri, 22 May 2015 10:25:28 +0200
Raw View
2015-05-22 8:24 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
> My focus here is to make the low-level functions available to
> users. I'm not against providing some high-level functions, too.
> (The C++ standard currently does not offer low-level functions;
> strtoul is locale-dependent, which slows it down substantially.
> See also N4412.)
Right, locales..
Should a locale-aware variant be provided?
A locale-unaware one?
Both?
Should the locale be a parameter?
Some probably want an ASCII-only variant for simplicity and performance.
--
Olaf
.
Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Fri, 22 May 2015 10:05:48 -0400
Raw View
On 2015-05-22 04:25, Olaf van der Spek wrote:
> 2015-05-22 8:24 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>> My focus here is to make the low-level functions available to
>> users. I'm not against providing some high-level functions, too.
>> (The C++ standard currently does not offer low-level functions;
>> strtoul is locale-dependent, which slows it down substantially.
>> See also N4412.)
>
> Right, locales..
> Should a locale-aware variant be provided?
> A locale-unaware one?
> Both?
> Should the locale be a parameter?
IMHO, yes (both). Locale-aware because dealing with locales is hard, and
we should not require the user to do this. ASCII-only because there will
be cases where that is all that is needed (reading machine-written data)
and because the performance is much, much faster.
I would probably go with something like 'parse(...)' (ASCII-only) and
'parse_l(..., std::locale* = nullptr)' which uses the current locale if
none is provided (and possibly falls back on the fast ASCII-only if the
locale is 'C').
--
Matthew
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Fri, 22 May 2015 10:05:41 -0700 (PDT)
Raw View
How exactly do locales come into play for this algorithm? Choosing . or ,
for the decimal point? What else?
Most big data processing for numeric text is going to be in simple ASCII
format, so an optimized routine for that use case would help a lot of
people.
.
Author: Matthew Woehlke <mwoehlke.floss@gmail.com>
Date: Fri, 22 May 2015 13:23:09 -0400
Raw View
On 2015-05-22 13:05, Matthew Fioravante wrote:
> How exactly do locales come into play for this algorithm? Choosing . or ,
> for the decimal point? What else?
Digit grouping separators. Potentially digits themselves (e.g. should
"六万七千七百四十三" be parsed? "תשסד"?).
> Most big data processing for numeric text is going to be in simple ascii
> format, so an optimized routine for that use case would help a lot of
> people.
Yes, and that's why I think an ASCII-only version is crucial. However, I
also think it makes sense to have a version that accepts any reasonable
user input, e.g. "13", "2.569,86", "伍佰肆拾弐" (okay, that one may be a
little dubious as it uses some obsolete forms), etc.
--
Matthew
.
Author: Jens Maurer <Jens.Maurer@gmx.net>
Date: Fri, 22 May 2015 19:25:00 +0200
Raw View
On 05/22/2015 10:25 AM, Olaf van der Spek wrote:
> 2015-05-22 8:24 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>> My focus here is to make the low-level functions available to
>> users. I'm not against providing some high-level functions, too.
>> (The C++ standard currently does not offer low-level functions;
>> strtoul is locale-dependent, which slows it down substantially.
>> See also N4412.)
>
> Right, locales..
> Should a locale-aware variant be provided?
No.
> A locale-unaware one?
Yes, basic execution character set only.
(Remember, we're dealing with parsing elementary items
such as ints and doubles. The latter is remarkably
hard to do correctly. It's not totally out of the
question to convert your locale-obliterated string
into a "plain" ASCII string for parsing at the
application level. Or use C++ locale's num_get<>
facet, if that does what you want.)
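A sketch of the num_get route, for the "2.569,86" style of input mentioned earlier in the thread. To stay independent of which named locales a platform installs, it assumes a hand-written std::numpunct facet; continental_punct and parse_continental are hypothetical names. The stream extraction goes through the std::num_get facet of the imbued locale.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical punctuation facet: '.' groups thousands, ',' is the
// decimal point.
struct continental_punct : std::numpunct<char>
{
    char do_decimal_point() const override { return ','; }
    char do_thousands_sep() const override { return '.'; }
    std::string do_grouping() const override { return "\3"; }
};

// Parse a double using the facet above. Leading whitespace is skipped
// and trailing characters are not checked.
inline bool parse_continental(const std::string& text, double& out)
{
    std::istringstream in(text);
    in.imbue(std::locale(std::locale::classic(), new continental_punct));
    double v = 0;
    if (!(in >> v))
        return false;
    out = v;
    return true;
}
```

The locale takes ownership of the facet, so the `new` here does not leak. This path is much heavier than an ASCII-only routine, which is the performance argument made above.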
> Should the locale be a parameter?
No. (Not for the functions I'm talking about, anyway.)
> Some probably want an ASCII-only variant for simplicity and performance.
Indeed.
Locale support is hard, and what we have in C++ is insufficient.
The proper way forward is probably ICU with a decent C++ "wrapper"
of some sort. Anyway, adding locales to this discussion is sure
to balloon the scope beyond manageability.
Jens
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Fri, 22 May 2015 10:28:55 -0700 (PDT)
Raw View
I agree with Jens here. Adding support for locales and parsing Asian
numbers etc. is way out of scope. strtol() doesn't even do that.
Let's focus on the simple common use case that benefits 99% of users first.
If a fully locale-aware variant is desired, that can be done later as a
follow-up proposal.
.
Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Fri, 22 May 2015 13:46:20 -0400
Raw View
On 2015-05-22 13:28, Matthew Fioravante wrote:
> I agree with Jens here. Adding support for locales and parsing asian
> numbers etc.. is way out of scope. strtol() doesn't even do that.
Previous comments notwithstanding, I'll agree also; having locale
support is good, but not worth derailing getting the version that only
works on ASCII / C-locale support. The latter is much more important
(and more clearly in scope).
--
Matthew
.
Author: "Vicente J. Botet Escriba" <vicente.botet@wanadoo.fr>
Date: Sat, 23 May 2015 00:04:02 +0200
Raw View
On 21/05/15 07:10, Olaf van der Spek wrote:
> 2015-05-21 0:00 GMT+02:00 Vicente J. Botet Escriba <vicente.botet@wanadoo.fr>:
>> On 20/05/15 20:49, Olaf van der Spek wrote:
>>
>> 2015-05-20 17:43 GMT+02:00 Matthew Woehlke <mw_triad@users.sourceforge.net>:
>>
>> On 2015-05-20 11:27, Matthew Fioravante wrote:
>>
>> Generally, using 0 or any other perfectly valid value to signal failure is
>> a really bad idea. It only makes sense when your use case is "Parse the
>> value or give me some default if it fails". In that case, 0 may not be the
>> default value you want so it's better to be able to actually specify it.
>>
>> Doesn't expected already handle this?
>>
>> Does expected exist?
>>
>> Do you mean an implementation to play with? Yes
>> (https://github.com/ptal/expected)
> No, I mean accepted by the committee and in a TS.
No, not yet ;-)
>
>> And does it have something like test_and_set()?
>>
>> No. What test_and_set would have as parameters and what would be the effect?
> bool test_and_set(T& out, optional<T> opt)
> {
>     if (!opt)
>         return false;
>     out = *opt;
>     return true;
> }
>
The proposal doesn't include this function; it can be useful in an
imperative paradigm.
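A small usage sketch of that imperative style, restated for C++17 std::optional (an assumption; the thread predates it). The caller port_or_default and its default of 8080 are hypothetical.

```cpp
#include <optional>

// The helper quoted above, taking the optional by const reference.
template <class T>
bool test_and_set(T& out, const std::optional<T>& opt)
{
    if (!opt)
        return false;
    out = *opt;
    return true;
}

// Hypothetical caller: keep a default unless a value is present.
inline int port_or_default(const std::optional<int>& parsed)
{
    int port = 8080;          // default survives when parsing failed
    test_and_set(port, parsed);
    return port;
}
```

For this particular "value or default" case, std::optional's own value_or member does the same job in one expression; test_and_set is more useful when the caller also wants to branch on the boolean result.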
>> Using a tail as out parameter it should be something like
>>
>> expected<Date, error_code> parse_date(string_view is, string_view& out)
>> {
>> return make_date % // fmap
>> ( parse<int>(is, &is) // year
>> >> parse_separator(is, &is) ) *
>> ( parse<int>(is, &is) // month
>> >> parse_separator(is, &is) ) *
>> parse<int>(is, &out) // day
>> ;
>> }
>>
>> or using the functional form with a parameter in-out it could be
>>
>> expected<Date> parse_date(string_view& is)
>> {
>> return fmap(make_date,
>> mdo( parse<int>(is), // year
>> parse_separator(is) ),
>> mdo( parse<int>(is), // month
>> parse_separator(is) ),
>> parse<int>(is) // day
>> );
>> }
>>
>> There are other possibilities (e.g. using Parser) as Miro pointed out.
> Looks interesting, but IMO that's higher-level than the basic level
> we're aiming for here.
I see.
> This can easily be built on top of the basic level, can't it?
Sure, but what is the reason to have a lower level? Performance?
>
>> Interfaces using exceptions compose quite well. The interface that doesn't
>> use exceptions should be based on it. We need to see just how to report the
>> error_code. As a result, as out parameter or as a TLS.
> IMO it should be the other way around. It's cleaner to build a
> throwing interface on top of a non-throwing interface..
I'm not talking about implementation, but about design of the interface.
Vicente
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Sun, 24 May 2015 12:29:27 +0200
Raw View
2015-05-23 0:04 GMT+02:00 Vicente J. Botet Escriba <vicente.botet@wanadoo.fr>:
>> Looks interesting, but IMO that's higher-level than the basic level
>> we're aiming for here.
>
> I see.
>>
>> This can easily be built on top of the basic level, can't it?
>
> Sure, but what is the reason to have a lower level? Performance?
Yes, simplicity and performance. A much more complex proposal takes
longer to write and much much longer to agree on.
--
Olaf
.
Author: "Vicente J. Botet Escriba" <vicente.botet@wanadoo.fr>
Date: Sun, 24 May 2015 23:24:56 +0200
Raw View
On 24/05/15 12:29, Olaf van der Spek wrote:
> 2015-05-23 0:04 GMT+02:00 Vicente J. Botet Escriba <vicente.botet@wanadoo.fr>:
>>> Looks interesting, but IMO that's higher-level than the basic level
>>> we're aiming for here.
>> I see.
>>> This can easily be built on top of the basic level, can't it?
>> Sure, but what is the reason to have a lower level? Performance?
> Yes, simplicity and performance. A much more complex proposal takes
> longer to write and much much longer to agree on.
>
>
I agree with you on the last point.
Why do you think that the lower-level interface would perform better
than the higher-level one?
Is it the simplicity of the proposal or the simplicity of the user code
that you are referring to?
Vicente
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Mon, 25 May 2015 15:02:15 +0200
Raw View
2015-05-24 23:24 GMT+02:00 Vicente J. Botet Escriba <vicente.botet@wanadoo.fr>:
> Why do you think that the lower level interface would perform better than
> the higher level?
Ideally all abstractions would be free but that's not always the case.
> Is simplicity of the proposal or simplicity of the user code that you are
> referring to?
Both, though IMO performance > user code simplicity > proposal simplicity
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Fri, 29 May 2015 08:39:45 +0200
Raw View
2015-05-22 8:24 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
> On 05/20/2015 09:38 PM, Olaf van der Spek wrote:
>> 2015-05-20 21:18 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>>> On 05/18/2015 08:34 PM, Olaf van der Spek wrote:
>>> My suggestion:
>>>
>>> const char * ret = parse(T& v, const char * first, const char * last, int base, error_code&);
>>>
>>> If ret == first, v is not
>>> overwritten and we have an error, otherwise v contains the parsed value.
>>
>> Your function fails these requirements..
>
> Right, that specification was intended for the "elementary" parse
> operations only. The specification needs to be weaker for the
> compound parses. Something like "if ret == first, the value of
> "v" is unspecified, otherwise v is the result of the (partial) parse,
> with error_code containing the error encountered at the end (if any)."
Consistency is king. So do we check for errors via ret == first or via ec?
Would this one work for you?
error_code parse(T&, string_view&, int base = 10);
> Note that functions like that are NOT intended for building
> large parsers without additional helper layers in between,
Why not let it work without helpers right away?
--
Olaf
.
Author: Jens Maurer <Jens.Maurer@gmx.net>
Date: Fri, 29 May 2015 09:06:08 +0200
Raw View
On 05/29/2015 08:39 AM, Olaf van der Spek wrote:
> 2015-05-22 8:24 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>> On 05/20/2015 09:38 PM, Olaf van der Spek wrote:
>>> 2015-05-20 21:18 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>>>> On 05/18/2015 08:34 PM, Olaf van der Spek wrote:
>>>> My suggestion:
>>>>
>>>> const char * ret = parse(T& v, const char * first, const char * last, int base, error_code&);
>>>>
>>>> If ret == first, v is not
>>>> overwritten and we have an error, otherwise v contains the parsed value.
>>>
>>> Your function fails these requirements..
>>
>> Right, that specification was intended for the "elementary" parse
>> operations only. The specification needs to be weaker for the
>> compound parses. Something like "if ret == first, the value of
>> "v" is unspecified, otherwise v is the result of the (partial) parse,
>> with error_code containing the error encountered at the end (if any)."
>
> Consistency is king. So do we check for errors via ret == first or via ec?
Check for errors using "ec", since ret == first doesn't seem the
natural outcome for partially-successful parses of lists.
That doesn't exclude a more strict specification for certain functions,
such as when parsing a simple "double" or "int".
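A minimal sketch of the distinction drawn here, assuming C++17 and using std::from_chars (standardized after this thread, with the same pointer-pair shape) as the elementary primitive. The names parse_int and parse_int_list and the comma-separated list grammar are hypothetical. The point is only that the elementary parse can promise ret == first on failure, while the list parse returns how far it got and reports the problem through ec.

```cpp
#include <charconv>
#include <system_error>
#include <vector>

// Elementary parse (hypothetical name): on failure returns `first` and
// leaves `v` untouched; on success returns one past the last digit.
inline const char* parse_int(int& v, const char* first, const char* last,
                             std::error_code& ec)
{
    const std::from_chars_result r = std::from_chars(first, last, v);
    ec = std::make_error_code(r.ec);   // std::errc() maps to "no error"
    return ec ? first : r.ptr;
}

// Compound parse (hypothetical): comma-separated ints such as "1,2,3".
// Returns the position just after the last element that parsed.
// A partially successful parse has ret != first AND ec set.
inline const char* parse_int_list(std::vector<int>& out, const char* first,
                                  const char* last, std::error_code& ec)
{
    ec.clear();
    const char* pos = first;   // end of the last good element
    const char* cur = first;   // where the next element should start
    for (;;) {
        int v = 0;
        const char* next = parse_int(v, cur, last, ec);
        if (ec) return pos;                          // ec says why we stopped
        out.push_back(v);
        pos = next;
        if (pos == last || *pos != ',') return pos;  // clean end of list
        cur = pos + 1;                               // skip the separator
    }
}
```

With the input "1,2,x" the list parse stores 1 and 2, returns a pointer to the second comma, and sets ec, so the caller has to look at ec rather than compare the return value with first.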
> Would this one work for you?
> error_code parse(T&, string_view&, int base = 10);
I prefer the iterator-style approach. The style is well-established
in the standard library, can be extended to more general iterators
if someone feels so inclined (not me), has fewer issues with over-
eager aliasing assumptions that prevent compiler optimization,
and can be made to look parallel with output. Output doesn't work
with string_view at all, because its elements are const.
Regarding the aliasing assumptions: It's easy to say "that's the
implementer's job", but I have been unable to coerce gcc into
doing the right thing when examining similar style variations for
output operations, even with gcc's __restrict extension.
Maybe I'm just stupid.
>> Note that functions like that are NOT intended for building
>> large parsers without additional helper layers in between,
>
> Why not let it work without helpers right away?
It does work, but it's certainly more inconvenient to use for a
large-scale parser compared to a parser-builder meta-language
(as shown elsewhere on this thread), and that argument applies
to both iterator-style and string_view variants.
Jens
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Fri, 29 May 2015 09:22:03 +0200
2015-05-29 9:06 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>> Would this one work for you?
>> error_code parse(T&, string_view&, int base = 10);
>
> I prefer the iterator-style approach. The style is well-established
> in the standard library, can be extended to more general iterators
> if someone feels so inclined (not me), has fewer issues with over-
Iterator-style is well-established indeed but range-style is more
convenient IMO.
Can't the string_view one be easily generalized too?
> eager aliasing assumptions that prevent compiler optimization,
I still don't get the aliasing problem. Isn't that only an issue with writes?
Wouldn't doing
auto it = is.begin();
auto last = is.end();
be enough?
> and can be made to look parallel with output.
> Output doesn't work
> with string_view at all, because its elements are const.
True, but IMO we shouldn't punish input for that.
> Regarding the aliasing assumptions: It's easy to say "that's the
> implementer's job", but I have been unable to coerce gcc into
> doing the right thing when examining similar style variations for
> output operations, even with gcc's __restrict extension.
> Maybe I'm just stupid.
.
Author: Jens Maurer <Jens.Maurer@gmx.net>
Date: Fri, 29 May 2015 10:29:49 +0200
On 05/29/2015 09:22 AM, Olaf van der Spek wrote:
> 2015-05-29 9:06 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>>> Would this one work for you?
>>> error_code parse(T&, string_view&, int base = 10);
>>
>> I prefer the iterator-style approach. The style is well-established
>> in the standard library, can be extended to more general iterators
>> if someone feels so inclined (not me), has fewer issues with over-
>
> Iterator-style is well-established indeed but range-style is more
> convenient IMO.
Possibly. I find reference parameters where read-modify-write
happens a lot less transparent in the calling code than pass-by-value.
For me, the main purpose of a parser is to advance the state
"where am I", producing parsed values and possibly an error code
while doing so. For my taste, hiding "where am I" in a
read-modify-write parameter is too obtuse if I can help it.
(Constant ranges, i.e. ranges whose extent is not modified,
are great, though. It's always been painful to pass
<expression>.begin(), <expression>.end()
to standard algorithms, where <expression> might be non-short.)
> Can't the string_view one be easily generalized too?
Which way? Have a range_view that takes a pair of arbitrary
iterators of type It? Sure. This feels even farther away
from established STL precedence.
>> eager aliasing assumptions that prevent compiler optimization,
>
> I still don't get the aliasing problem. Isn't that only an issue with writes?
> Wouldn't doing
> auto it = is.begin();
> auto last = is.end();
> be enough?
Consider a sequence of parses; let's take "int" for exposition:
int i1, i2;
string_view s( /* whatever */ );
parse(i1, s);
parse(i2, s);
The "string_view" contains "const char *" internally, which is allowed
to alias anything.
Let's assume the "parse" function is inlined. Ideally, string_view's
components should be kept in registers and never hit memory.
Now, the first "parse" call changes the value of the member of "s"
holding s.begin() (let's call it s.first). This write must hit
memory, because s.first might point to itself or to s.end under
the aliasing rules.
In the second parse call, the "auto it = is.begin()" and
"is.end()" must read from memory afresh, because those values
might have been changed by the write to s.first of the first
parse call.
End result: string_view's components are not kept in registers.
When I looked at the issue a while ago in an output() context,
my gcc wasn't clever enough in its points-to analysis to understand
that s.first does not point to the string_view itself. Points-to
analysis is probably fairly brittle anyway, i.e. has to pessimize
quite often. Maybe that has changed meanwhile.
When s.first and s.end are separate local variables, it's obvious
to the compiler that an externally-supplied value cannot possibly
point to them.
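The two interface shapes being compared can be put side by side. This is a sketch under assumptions: it uses C++17's std::string_view and std::from_chars (standardized after this thread) as the underlying primitive, and the names parse_view, parse_ptr, two_ints_view and two_ints_ptr are made up for illustration. It only shows where the "where am I" state lives in each style; whether a particular compiler keeps that state in registers is not something the code demonstrates.

```cpp
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Range style (hypothetical): the position lives inside `s`, which is
// read, modified and written back through the reference.
inline std::errc parse_view(int& v, std::string_view& s)
{
    const char* const first = s.data();          // copy members to locals
    const char* const last = first + s.size();
    const std::from_chars_result r = std::from_chars(first, last, v);
    if (r.ec == std::errc())
        s.remove_prefix(static_cast<std::size_t>(r.ptr - first));
    return r.ec;
}

// Pointer style (hypothetical): the position is passed in and returned
// by value, so the caller can keep it in plain local variables.
inline const char* parse_ptr(int& v, const char* first, const char* last,
                             std::errc& ec)
{
    const std::from_chars_result r = std::from_chars(first, last, v);
    ec = r.ec;
    return ec == std::errc() ? r.ptr : first;
}

// Two consecutive parses separated by a comma, range style.
inline bool two_ints_view(std::string_view s, int& a, int& b)
{
    if (parse_view(a, s) != std::errc()) return false;
    if (s.empty() || s.front() != ',') return false;
    s.remove_prefix(1);
    return parse_view(b, s) == std::errc();
}

// The same two parses, pointer style.
inline bool two_ints_ptr(const char* first, const char* last, int& a, int& b)
{
    std::errc ec{};
    const char* p = parse_ptr(a, first, last, ec);
    if (ec != std::errc() || p == last || *p != ',') return false;
    ++p;
    parse_ptr(b, p, last, ec);
    return ec == std::errc();
}
```

In the range version every call goes through the string_view object; in the pointer version the only state between calls is the local p.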
>> and can be made to look parallel with output.
>> Output doesn't work
>> with string_view at all, because its elements are const.
>
> True, but IMO we shouldn't punish input for that.
When we're done, we should have a set of elementary parse()
functions in the standard, and a corresponding set of output()
functions. Considering some in isolation is fine up to a point,
but we shouldn't lose track of the big picture.
Jens
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Fri, 29 May 2015 12:12:12 +0200
2015-05-29 10:29 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>> Can't the string_view one be easily generalized too?
>
> Which way? Have a range_view that takes a pair of arbitrary
> iterators of type It? Sure. This feels even farther away
> from established STL precedence.
No, I mean adding an overload taking two or three iterators..
I guess we'll include it anyway due to the aliasing concerns.
File read() and write() also update the file pointer in a hidden way..
is doing so really that bad?
If we get unified call syntax we even might be able to write is.parse()
> Let's assume the "parse" function is inlined. Ideally, string_view's
> components should be kept in registers and never hit memory.
> Now, the first "parse" call changes the value of the member of "s"
> holding s.begin() (let's call it s.first). This write must hit
> memory, because s.first might point to itself or to s.end under
> the aliasing rules.
>
> In the second parse call, the "auto it = is.begin()" and
> "is.end()" must read from memory afresh, because those values
> might have been changed by the write to s.first of the first
> parse call.
is.begin will almost surely have been changed.
> End result: string_view's components are not kept in registers.
True, but how big of a problem is one or two reads from L1 cache?
> When s.first and s.end are separate local variables, it's obvious
> to the compiler that an externally-supplied value cannot possibly
> point to them.
Hmm, I'm inclined to call this a quality of implementation issue.
>>> and can be made to look parallel with output.
>>> Output doesn't work
>>> with string_view at all, because its elements are const.
>>
>> True, but IMO we shouldn't punish input for that.
>
> When we're done, we should have a set of elementary parse()
> functions in the standard, and a corresponding set of output()
> functions. Considering some in isolation is fine up to a point,
> but we shouldn't lose track of the big picture.
Output / format functions are a whole different story. I'm not sure
it's worth it to worry about that for this proposal.
--
Olaf
.
Author: Jens Maurer <Jens.Maurer@gmx.net>
Date: Fri, 29 May 2015 13:06:06 +0200
On 05/29/2015 12:12 PM, Olaf van der Spek wrote:
> 2015-05-29 10:29 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>>> Can't the string_view one be easily generalized too?
>>
>> Which way? Have a range_view that takes a pair of arbitrary
>> iterators of type It? Sure. This feels even farther away
>> from established STL precedence.
>
> No, I mean adding an overload taking two or three iterators..
I'm looking for the most basic abstraction to be standardized.
We've standardized quite a few non-basic abstractions, e.g.
iostreams or num_get<>(), which leaves a performance gap between
"what the standard provides" and "what the user could write himself
when not needing the bells + whistles". I don't want to have a
gap discussion ever again in the area we're talking about.
I have no sustained objection to standardizing various additional
overloads in addition to standardizing the basic abstraction,
if people feel like it. (I'm not really in favor, because it
blows up the std.lib interface, which makes it a little harder to
learn each time we do that.)
> I guess we'll include it anyway due to the aliasing concerns.
>
> File read() and write() also update the file pointer in a hidden way..
> is doing so really that bad?
File I/O has hidden state, yes. Is that bad? Maybe. We also
have pread and pwrite for those that don't like the hidden state.
> If we get unified call syntax we even might be able to write is.parse()
Yes, if string_view is the first parameter (which it currently isn't).
>> Let's assume the "parse" function is inlined. Ideally, string_view's
>> components should be kept in registers and never hit memory.
>> Now, the first "parse" call changes the value of the member of "s"
>> holding s.begin() (let's call it s.first). This write must hit
>> memory, because s.first might point to itself or to s.end under
>> the aliasing rules.
>>
>> In the second parse call, the "auto it = is.begin()" and
>> "is.end()" must read from memory afresh, because those values
>> might have been changed by the write to s.first of the first
>> parse call.
>
> is.begin will almost surely have been changed.
Yes, sorry.
>> End result: string_view's components are not kept in registers.
>
> True, but how big of a problem is one or two reads from L1 cache?
That might not be the only effect. Memory writes to an arbitrary
location (from the viewpoint of the compiler) might inhibit
code motion / scheduling between the two calls to parse, too,
producing more pipeline stalls etc.
>> When s.first and s.end are separate local variables, it's obvious
>> to the compiler that an externally-supplied value cannot possibly
>> point to them.
>
> Hmm, I'm inclined to call this a quality of implementation issue.
As I said earlier, I failed to achieve the QoI for the "output"
case and a similar interface structure.
>>>> and can be made to look parallel with output.
>>>> Output doesn't work
>>>> with string_view at all, because its elements are const.
>>>
>>> True, but IMO we shouldn't punish input for that.
>>
>> When we're done, we should have a set of elementary parse()
>> functions in the standard, and a corresponding set of output()
>> functions. Considering some in isolation is fine up to a point,
>> but we shouldn't lose track of the big picture.
>
> Output / format functions are a whole different story. I'm not sure
> it's worth it to worry about that for this proposal.
My use case is a (say) JSON input / output library.
I don't want this in the standard right now, but I do want to have
the basic building blocks available so that I can write my own
in C++ whose performance blows everything else out of the water while
using standard library facilities for the basic building blocks.
The status quo is that I won't even try, because I can't get rid of
locale-dependent parsing / output formatting of numbers when using
the standard library.
Thanks,
Jens
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Fri, 29 May 2015 13:11:08 +0200
2015-05-29 13:06 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
> I'm looking for the most basic abstraction to be standardized.
> My use case is a (say) JSON input / output library.
> I don't want this in the standard right now, but I do want to have
> the basic building blocks available so that I can write my own
> in C++ whose performance blows everything else out of the water while
> using standard library facilities for the basic building blocks.
>
> The status quo is that I won't even try, because I can't get rid of
> locale-dependent parsing / output formatting of numbers when using
> the standard library.
I heard RapidJSON is quite.. quick.
That said, I'm also aiming for functions without unnecessary overhead.
--
Olaf
.
Author: Miro Knejp <miro.knejp@gmail.com>
Date: Fri, 29 May 2015 14:38:35 +0200
> On 29 May 2015, at 10:29 , Jens Maurer <Jens.Maurer@gmx.net> wrote:
>
> On 05/29/2015 09:22 AM, Olaf van der Spek wrote:
>> 2015-05-29 9:06 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>>>> Would this one work for you?
>>>> error_code parse(T&, string_view&, int base = 10);
>>>
>>> I prefer the iterator-style approach. The style is well-established
>>> in the standard library, can be extended to more general iterators
>>> if someone feels so inclined (not me), has fewer issues with over-
>>
>> Iterator-style is well-established indeed but range-style is more
>> convenient IMO.
>
> Possibly. I find reference parameters where read-modify-write
> happens a lot less transparent in the calling code than pass-by-value.
> For me, the main purpose of a parser is to advance the state
> "where am I", producing parsed values and possibly an error code
> while doing so. For my taste, hiding "where am I" in a
> read-modify-write parameter is too obtuse if I can help it.
>
> (Constant ranges, i.e. ranges whose extent is not modified,
> are great, though. It's always been painful to pass
> <expression>.begin(), <expression>.end()
> to standard algorithms, where <expression> might be non-short.)
>
>> Can't the string_view one be easily generalized too?
>
> Which way? Have a range_view that takes a pair of arbitrary
> iterators of type It? Sure. This feels even farther away
> from established STL precedence.
>
>>> eager aliasing assumptions that prevent compiler optimization,
>>
>> I still don't get the aliasing problem. Isn't that only an issue with writes?
>> Wouldn't doing
>> auto it = is.begin();
>> auto last = is.end();
>> be enough?
>
> Consider a sequence of parses; let's take "int" for exposition:
>
> int i1, i2;
> string_view s( /* whatever */ );
> parse(i1, s);
> parse(i2, s);
>
> The "string_view" contains "const char *" internally, which is allowed
> to alias anything.
>
> Let's assume the "parse" function is inlined. Ideally, string_view's
> components should be kept in registers and never hit memory.
> Now, the first "parse" call changes the value of the member of "s"
> holding s.begin() (let's call it s.first). This write must hit
> memory, because s.first might point to itself or to s.end under
> the aliasing rules.
>
> In the second parse call, the "auto it = is.begin()" and
> "is.end()" must read from memory afresh, because those values
> might have been changed by the write to s.first of the first
> parse call.
>
> End result: string_view's components are not kept in registers.
>
> When I looked at the issue a while ago in an output() context,
> my gcc wasn't clever enough in its points-to analysis to understand
> that s.first does not point to the string_view itself. Points-to
> analysis is probably fairly brittle anyway, i.e. has to pessimize
> quite often. Maybe that has changed meanwhile.
I think this argumentation is backwards. You are not doing a write
operation through the char pointer, you are replacing the pointer object
itself, a member of s. Even if the char pointer does alias the string_view
there is no modifying operation taking place on the dereferenced char
pointer that would change the string_view object itself. At least if the
full function definition is available to the translation unit. So keeping
with your assumption that the parse() calls are fully inlined the compiler
*can* keep s in registers because of read-only accesses of the char pointer.
.
Author: Jens Maurer <Jens.Maurer@gmx.net>
Date: Fri, 29 May 2015 16:47:18 +0200
On 05/29/2015 01:11 PM, Olaf van der Spek wrote:
> 2015-05-29 13:06 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>> I'm looking for the most basic abstraction to be standardized.
>
>> My use case is a (say) JSON input / output library.
>> I don't want this in the standard right now, but I do want to have
>> the basic building blocks available so that I can write my own
>> in C++ whose performance blows everything else out of the water while
>> using standard library facilities for the basic building blocks.
>>
>> The status quo is that I won't even try, because I can't get rid of
>> locale-dependent parsing / output formatting of numbers when using
>> the standard library.
>
> I heard RapidJSON is quite.. quick.
https://github.com/miloyip/rapidjson
This is a good example why we need fast low-level parsing
and output in the standard.
For floating-point, it does:
// This is a C++ header-only implementation of Grisu2 algorithm from the publication:
// Loitsch, Florian. "Printing floating-point numbers quickly and accurately with
// integers." ACM Sigplan Notices 45.6 (2010): 233-243.
For integer, it has its own
inline char* u32toa(uint32_t value, char* buffer);
and
inline char* u64toa(uint64_t value, char* buffer);
These are general-purpose functions, yet the author has seen fit
to duplicate the implementation (and not use strtod or similar)?
That can't be right.
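For reference, a function with the u32toa signature quoted above can be written in a few lines. This is a naive illustration only and is not RapidJSON's implementation. The return convention (one past the last character written, no terminator) and the requirement that the caller supplies at least 10 characters are assumptions made for this sketch.

```cpp
#include <cstdint>

// Naive sketch, NOT RapidJSON's code.
// Writes the decimal digits of `value` to `buffer` without a terminator
// and returns one past the last character written. The caller must
// provide room for at least 10 characters.
inline char* u32toa(std::uint32_t value, char* buffer)
{
    char tmp[10];                        // UINT32_MAX has 10 decimal digits
    char* p = tmp;
    do {
        *p++ = static_cast<char>('0' + value % 10);
        value /= 10;
    } while (value != 0);
    while (p != tmp) *buffer++ = *--p;   // digits were produced in reverse
    return buffer;
}
```

Nothing here depends on a locale, a format string, or an allocation, which is the kind of building block being asked for.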
> That said, I'm also aiming for functions without unnecessary overhead.
Good. :-)
Jens
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Fri, 29 May 2015 17:42:48 +0200
2015-05-29 16:47 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
> On 05/29/2015 01:11 PM, Olaf van der Spek wrote:
> For integer, it has its own
> inline char* u32toa(uint32_t value, char* buffer);
> and
> inline char* u64toa(uint64_t value, char* buffer);
>
> These are general-purpose functions, yet the author has seen fit
> to duplicate the implementation (and not use strtod or similar)?
> That can't be right.
That's output, not input..
I do agree we need better output functions too but again that's not in
scope for this proposal.
--
Olaf
.
Author: "'Jeffrey Yasskin' via ISO C++ Standard - Future Proposals" <std-proposals@isocpp.org>
Date: Fri, 29 May 2015 09:31:54 -0700
FWIW, I think Jens' arguments are going to be convincing to the
committee, so it's probably a good idea to follow them in the first
iteration of this paper.
On Fri, May 29, 2015 at 4:06 AM, Jens Maurer <Jens.Maurer@gmx.net> wrote:
> On 05/29/2015 12:12 PM, Olaf van der Spek wrote:
>> 2015-05-29 10:29 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>>>> Can't the string_view one be easily generalized too?
>>>
>>> Which way? Have a range_view that takes a pair of arbitrary
>>> iterators of type It? Sure. This feels even farther away
>>> from established STL precedence.
>>
>> No, I mean adding an overload taking two or three iterators..
>
> I'm looking for the most basic abstraction to be standardized.
>
> We've standardized quite a few non-basic abstractions, e.g.
> iostreams or num_get<>(), which leaves a performance gap between
> "what the standard provides" and "what the user could write himself
> when not needing the bells + whistles". I don't want to have a
> gap discussion ever again in the area we're talking about.
>
> I have no sustained objection to standardizing various additional
> overloads in addition to standardizing the basic abstraction,
> if people feel like it. (I'm not really in favor, because it
> blows up the std.lib interface, which makes it a little harder to
> learn each time we do that.)
>
>> I guess we'll include it anyway due to the aliasing concerns.
>>
>> File read() and write() also update the file pointer in a hidden way..
>> is doing so really that bad?
>
> File I/O has hidden state, yes. Is that bad? Maybe. We also
> have pread and pwrite for those that don't like the hidden state.
>
>> If we get unified call syntax we even might be able to write is.parse()
>
> Yes, if string_view is the first parameter (which it currently isn't).
>
>>> Let's assume the "parse" function is inlined. Ideally, string_view's
>>> components should be kept in registers and never hit memory.
>>> Now, the first "parse" call changes the value of the member of "s"
>>> holding s.begin() (let's call it s.first). This write must hit
>>> memory, because s.first might point to itself or to s.end under
>>> the aliasing rules.
>>>
>>> In the second parse call, the "auto it = is.begin()" and
>>> "is.end()" must read from memory afresh, because those values
>>> might have been changed by the write to s.first of the first
>>> parse call.
>>
>> is.begin will almost surely have been changed.
>
> Yes, sorry.
>
>>> End result: string_view's components are not kept in registers.
>>
>> True, but how big of a problem is one or two reads from L1 cache?
>
> That might not be the only effect. Memory writes to an arbitrary
> location (from the viewpoint of the compiler) might inhibit
> code motion / scheduling between the two calls to parse, too,
> producing more pipeline stalls etc.
>
>>> When s.first and s.end are separate local variables, it's obvious
>>> to the compiler that an externally-supplied value cannot possibly
>>> point to them.
>>
>> Hmm, I'm inclined to call this a quality of implementation issue.
>
> As I said earlier, I failed to achieve the QoI for the "output"
> case and a similar interface structure.
>
>>>>> and can be made to look parallel with output.
>>>>> Output doesn't work
>>>>> with string_view at all, because its elements are const.
>>>>
>>>> True, but IMO we shouldn't punish input for that.
>>>
>>> When we're done, we should have a set of elementary parse()
>>> functions in the standard, and a corresponding set of output()
>>> functions. Considering some in isolation is fine up to a point,
>>> but we shouldn't lose track of the big picture.
>>
>> Output / format functions are a whole different story. I'm not sure
>> it's worth it to worry about that for this proposal.
>
> My use case is a (say) JSON input / output library.
> I don't want this in the standard right now, but I do want to have
> the basic building blocks available so that I can write my own
> in C++ whose performance blows everything else out of the water while
> using standard library facilities for the basic building blocks.
>
> The status quo is that I won't even try, because I can't get rid of
> locale-dependent parsing / output formatting of numbers when using
> the standard library.
>
> Thanks,
> Jens
>
.
Author: Thiago Macieira <thiago@macieira.org>
Date: Fri, 29 May 2015 13:15:43 -0700
On Friday 29 May 2015 17:42:48 Olaf van der Spek wrote:
> 2015-05-29 16:47 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
> > On 05/29/2015 01:11 PM, Olaf van der Spek wrote:
> > For integer, it has its own
> > inline char* u32toa(uint32_t value, char* buffer);
> > and
> > inline char* u64toa(uint64_t value, char* buffer);
> >
> > These are general-purpose functions, yet the author has seen fit
> > to duplicate the implementation (and not use strtod or similar)?
> > That can't be right.
>
> That's output, not input..
> I do agree we need better output functions too but again that's not in
> scope for this proposal.
Indeed, but we probably need it even more badly than parsing numbers.
We have strtol/strtod from C99 and POSIX to parse numbers and we even have
strtol_l from POSIX to parse it independent of locale. But the string
conversion still requires sprintf in C or ostringstream in C++. That's a hell
of an overhead...
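To make the comparison concrete, here is a sketch of the printf-family route next to a pointer-pair alternative. It assumes C++17: std::to_chars did not exist when this thread was written and is used here only as an example of the interface shape under discussion. The function names to_text_snprintf and to_text_to_chars are hypothetical.

```cpp
#include <charconv>
#include <cstddef>
#include <cstdio>
#include <limits>
#include <string>
#include <system_error>

// Room for every digit of an int plus a sign and a terminator.
constexpr std::size_t kIntBuf = std::numeric_limits<int>::digits10 + 3;

// printf-family conversion: interprets a format string at run time.
inline std::string to_text_snprintf(int value)
{
    char buf[kIntBuf];
    const int n = std::snprintf(buf, sizeof buf, "%d", value);
    if (n <= 0 || n >= static_cast<int>(sizeof buf)) return std::string();
    return std::string(buf, static_cast<std::size_t>(n));
}

// std::to_chars: writes into [first, last) with no format string,
// no locale lookup and no terminator.
inline std::string to_text_to_chars(int value)
{
    char buf[kIntBuf];
    const std::to_chars_result r = std::to_chars(buf, buf + sizeof buf, value);
    return r.ec == std::errc() ? std::string(buf, r.ptr) : std::string();
}
```

Both helpers return a std::string only to keep the example short; the conversions themselves work on a caller-supplied character buffer.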
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Fri, 29 May 2015 20:30:23 -0700 (PDT)
Output is a much harder problem because you don't know the size of the
resulting string required to store the textual representation.
The easiest way is to just dynamically allocate like to_string() but of
course this is not an optimal interface because you have to allocate a
separate buffer for every conversion.
template <typename T>
string serialize(const T& val);
The slightly less convenient way is passing string or vector<char> or some
other "container" via out parameter and have the method resize it as
needed. This is more efficient because you can reuse the same buffer for
multiple conversions (see std::getline()). This is easy to use but it
creates a hard dependency on std::string or whatever is used as the buffer
class type. An interface like this could be templated to allow any
container supporting push_back().
template <typename T>
void serialize(std::string& buf, const T& val);
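A sketch of this out-parameter variant for a single type, assuming C++17 and using std::to_chars for the digit conversion. The restriction to int and the choice to append to buf (rather than replace its contents) are assumptions made for the example.

```cpp
#include <charconv>
#include <limits>
#include <string>
#include <system_error>

// Hypothetical int-only version of serialize(std::string&, const T&).
// Appends the decimal text of `val` to `buf`, so the caller can reuse
// the string's capacity across many conversions.
inline void serialize(std::string& buf, int val)
{
    // Room for every digit of an int plus a sign: to_chars cannot run out.
    char tmp[std::numeric_limits<int>::digits10 + 3];
    const std::to_chars_result r = std::to_chars(tmp, tmp + sizeof tmp, val);
    if (r.ec == std::errc())
        buf.append(tmp, r.ptr);
}
```

A caller would typically keep one std::string alive, call clear() between records, and let the retained capacity absorb later conversions.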
The most low level interface is to provide a fixed size buffer (via
pointers) but then what happens if you run out of space? Truncate and
report an error? How do you know how much space is actually needed? Can you
restart the operation from the exact point where it failed or do you have
to restart the serialization algorithm from scratch again?
Pushing the buffer size question back onto the client allows for the most
generic interface possible. The client could respond to a full buffer in
several ways, depending on the situation: truncating, allocating more space
and continuing, or writing the current buffer to a stream and then
continuing again from the start. Such an interface could allow class
authors to write out-of-line class serialization kernels that would work
efficiently for all use cases (memory, files, network, etc.). For example,
a higher-level interface which serializes to files could run this kernel
function directly on the internal file buffer and then flush it to disk
when full.
template <typename T>
char* serialize(char* b, char* e, const T& val, error_code& ec);
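A sketch of the fixed-buffer kernel and one possible client policy, assuming C++17 and std::to_chars. It is int-only, and both the convention of returning b on failure and the helper name append_number are assumptions. The policy shown answers the "restart" question in the simplest way: on a full buffer the client grows its storage and runs the conversion again from scratch.

```cpp
#include <charconv>
#include <cstddef>
#include <system_error>
#include <vector>

// Hypothetical fixed-buffer kernel: writes `val` into [b, e).
// On success returns one past the last character written. If the text
// does not fit, sets ec and returns b; the buffer contents are then
// unspecified.
inline char* serialize(char* b, char* e, int val, std::error_code& ec)
{
    const std::to_chars_result r = std::to_chars(b, e, val);
    ec = std::make_error_code(r.ec);   // value_too_large if out of space
    return ec ? b : r.ptr;
}

// One client policy (hypothetical): append to a vector, doubling the
// free space and retrying from scratch whenever the kernel reports a
// full buffer.
inline void append_number(std::vector<char>& out, int val)
{
    const std::size_t used = out.size();
    std::size_t room = 4;
    for (;;) {
        out.resize(used + room);
        std::error_code ec;
        char* end = serialize(out.data() + used, out.data() + out.size(),
                              val, ec);
        if (!ec) {
            out.resize(static_cast<std::size_t>(end - out.data()));
            return;
        }
        room *= 2;
    }
}
```

Other policies (truncate, flush to a file and reuse the same buffer) would use the same kernel unchanged; only the loop around it differs.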
low any container supporting push_back().</div><div><br></div><div><div cla=
ss=3D"prettyprint" style=3D"border: 1px solid rgb(187, 187, 187); word-wrap=
: break-word; background-color: rgb(250, 250, 250);"><code class=3D"prettyp=
rint"><div class=3D"subprettyprint"><span style=3D"color: #008;" class=3D"s=
tyled-by-prettify">template</span><span style=3D"color: #000;" class=3D"sty=
led-by-prettify"> </span><span style=3D"color: #660;" class=3D"styled-by-pr=
ettify"><</span><span style=3D"color: #008;" class=3D"styled-by-prettify=
">typename</span><span style=3D"color: #000;" class=3D"styled-by-prettify">=
T</span><span style=3D"color: #660;" class=3D"styled-by-prettify">></sp=
an><span style=3D"color: #000;" class=3D"styled-by-prettify"><br></span><sp=
an style=3D"color: #008;" class=3D"styled-by-prettify">void</span><span sty=
le=3D"color: #000;" class=3D"styled-by-prettify"> serialize</span><span sty=
le=3D"color: #660;" class=3D"styled-by-prettify">(</span><span style=3D"col=
or: #000;" class=3D"styled-by-prettify">std</span><span style=3D"color: #66=
0;" class=3D"styled-by-prettify">::</span><span style=3D"color: #008;" clas=
s=3D"styled-by-prettify">string</span><span style=3D"color: #660;" class=3D=
"styled-by-prettify">&</span><font color=3D"#000000"><span style=3D"col=
or: #000;" class=3D"styled-by-prettify"> buf</span><span style=3D"color: #6=
60;" class=3D"styled-by-prettify">,</span><span style=3D"color: #000;" clas=
s=3D"styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styl=
ed-by-prettify">const</span><span style=3D"color: #000;" class=3D"styled-by=
-prettify"> T</span><span style=3D"color: #660;" class=3D"styled-by-prettif=
y">&</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> v=
al</span><span style=3D"color: #660;" class=3D"styled-by-prettify">);</span=
></font></div></code></div><br></div><div><div>The most low level interface=
is to provide a fixed size buffer (via pointers) but then what happens if =
you run out of space? Truncate and report an error? How do you know how muc=
h space is actually needed? Can you restart the operation from the exact po=
int where it failed or do you have to restart the serialization algorithm f=
rom scratch again?</div><div><br></div><div>Pushing the buffer size questio=
n back onto the client allows for the most generic interface possible. The =
client's response to a full buffer could be many depending on the situation=
.. Example behaviors include truncating, allocating more space and continuin=
g, writing the current buffer to a stream and then continuing again from th=
e start.. Such an interface could allow class authors to write out of line =
class serialization kernels that would work efficiently for all uses cases =
(memory, files, network, etc...). For example, a higher level interface whi=
ch serializes to files could run this kernel function directly on the inter=
nal file buffer and then flush it to disk when full.<br></div><div><br></di=
v><div><div class=3D"prettyprint" style=3D"border: 1px solid rgb(187, 187, =
187); word-wrap: break-word; background-color: rgb(250, 250, 250);"><code c=
lass=3D"prettyprint"><div class=3D"subprettyprint"><span style=3D"color: #0=
08;" class=3D"styled-by-prettify">template</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify"><</span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">typename</span><span style=3D"color: #000;" class=3D"styl=
ed-by-prettify"> T</span><span style=3D"color: #660;" class=3D"styled-by-pr=
ettify">></span><span style=3D"color: #000;" class=3D"styled-by-prettify=
"><br></span><span style=3D"color: #008;" class=3D"styled-by-prettify">char=
</span><span style=3D"color: #660;" class=3D"styled-by-prettify">*</span><f=
ont color=3D"#000000"><span style=3D"color: #000;" class=3D"styled-by-prett=
ify"> serialize</span><span style=3D"color: #660;" class=3D"styled-by-prett=
ify">(</span><span style=3D"color: #008;" class=3D"styled-by-prettify">char=
</span><span style=3D"color: #660;" class=3D"styled-by-prettify">*</span><s=
pan style=3D"color: #000;" class=3D"styled-by-prettify"> b</span><span styl=
e=3D"color: #660;" class=3D"styled-by-prettify">,</span><span style=3D"colo=
r: #000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #008;"=
class=3D"styled-by-prettify">char</span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">*</span><span style=3D"color: #000;" class=3D"style=
d-by-prettify">e</span><span style=3D"color: #660;" class=3D"styled-by-pret=
tify">,</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> </=
span><span style=3D"color: #008;" class=3D"styled-by-prettify">const</span>=
<span style=3D"color: #000;" class=3D"styled-by-prettify"> T</span><span st=
yle=3D"color: #660;" class=3D"styled-by-prettify">&</span><span style=
=3D"color: #000;" class=3D"styled-by-prettify"> val</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">,</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> error_code</span><span style=3D"color: #66=
0;" class=3D"styled-by-prettify">&</span><span style=3D"color: #000;" c=
lass=3D"styled-by-prettify"> ec</span><span style=3D"color: #660;" class=3D=
"styled-by-prettify">);</span></font></div></code></div></div></div></div>
<p></p>
-- <br />
<br />
--- <br />
You received this message because you are subscribed to the Google Groups &=
quot;ISO C++ Standard - Future Proposals" group.<br />
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to <a href=3D"mailto:std-proposals+unsubscribe@isocpp.org">std-proposa=
ls+unsubscribe@isocpp.org</a>.<br />
To post to this group, send email to <a href=3D"mailto:std-proposals@isocpp=
..org">std-proposals@isocpp.org</a>.<br />
Visit this group at <a href=3D"http://groups.google.com/a/isocpp.org/group/=
std-proposals/">http://groups.google.com/a/isocpp.org/group/std-proposals/<=
/a>.<br />
------=_Part_1275_2020500138.1432956623835--
------=_Part_1274_2069023366.1432956623835--
.
Author: Bjorn Reese <breese@mail1.stofanet.dk>
Date: Sat, 30 May 2015 11:04:29 +0200
Raw View
On 05/29/2015 04:47 PM, Jens Maurer wrote:
> These are general-purpose functions, yet the author has seen fit
> to duplicate the implementation (and not use strtod or similar)?
> That can't be right.
Adherence to a specification is one of the main reasons for
implementing your own conversion functions for textual protocols (of
which JSON is just one example). These protocols usually define exactly
how, say, floating-point numbers must be represented, and these
representations may not correspond to those chosen by C++.
One of the requirements mentioned in this thread is the absence of
locale. This means that it becomes difficult to use the proposed parsing
functions for a wider set of protocols, because there is variation in
how they represent numbers. For example, XSLT uses thousands
separators, while JSON does not. As another example, NaN and infinity are
represented as "null" in JSON.
If you investigate JSON or XML parsers you will find that they even
implement their own ctypes. For instance, whitespace in JSON does not
match the C locale std::isspace().
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Sat, 30 May 2015 13:35:45 +0200
Raw View
2015-05-29 18:31 GMT+02:00 'Jeffrey Yasskin' via ISO C++ Standard -
Future Proposals <std-proposals@isocpp.org>:
> FWIW, I think Jens' arguments are going to be convincing to the
> committee, so it's probably a good idea to follow them in the first
> iteration of this paper.
The aliasing issue (which shouldn't be an issue) or the iterator-based
interface?
.
Author: Jens Maurer <Jens.Maurer@gmx.net>
Date: Sat, 30 May 2015 21:31:45 +0200
Raw View
On 05/30/2015 11:04 AM, Bjorn Reese wrote:
> On 05/29/2015 04:47 PM, Jens Maurer wrote:
>
>> These are general-purpose functions, yet the author has seen fit
>> to duplicate the implementation (and not use strtod or similar)?
>> That can't be right.
>
> Adherence to specification is one of the most main reasons for
> implementing your own conversion functions for textual protocols (of
> which JSON is just one example.) These protocols usually define exactly
> how, say, floating-point numbers must be represented, and these
> representations may not correspond to those chosen by C++.
I haven't seen a textual protocol where thousand separators
would be mandatory.
For floating-point parsing, doing a pre-parse and removing the
thousand separators and switching the decimal point is an option;
another option would be to make these parameters of the standard
parse functions for floating-point. I'm not seeing a large
use for that.
> One of the requirements mentioned in this thread is the absence of
> locale. This means that it becomes difficult to use the proposed parsing
> functions for a wider set of protocols, because there is variation in
> how they represent numbers.
From what I've seen so far, if a protocol chooses decimal text
representations, the representation of integers is the
"obvious" one.
For floating-point, people seem to want exact round-trip capability
with minimal text used. C++ doesn't guarantee this right now, but
it seems reasonable to require this from new parser/output
functions. The algorithms for these are non-trivial, but
well-known.
http://www.cs.tufts.edu/~nr/cs257/archive/florian-loitsch/printf.pdf
http://www.cesura17.net/~will/Professional/Research/Papers/howtoread.pdf
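To make the round-trip point concrete, here is a small sketch using only the existing C library functions. It assumes IEEE 754 doubles and a correctly rounding snprintf/strtod; the helper names are made up for this example. Printing max_digits10 (17) significant digits round-trips, but the text is not minimal, which is the gap the algorithms in the papers above address.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <limits>
#include <string>

// Hypothetical helper: formats with max_digits10 significant digits,
// which is enough to round-trip a double but is usually not the shortest text.
inline std::string format_round_trip(double value) {
    char buf[64];
    const int n = std::snprintf(buf, sizeof buf, "%.*g",
                                std::numeric_limits<double>::max_digits10,
                                value);
    if (n < 0) {
        return std::string();
    }
    return std::string(buf, static_cast<std::size_t>(n));
}

// Hypothetical helper: true if text -> double gives back the same value.
// Note that strtod honours the current C locale, unlike the functions
// discussed in this thread.
inline bool round_trips(double value) {
    const std::string text = format_round_trip(value);
    return std::strtod(text.c_str(), nullptr) == value;
}
```

With these assumptions, format_round_trip(0.1) yields "0.10000000000000001" rather than the shorter "0.1", even though both read back as the same double.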
> For example, XSLT uses thousands
> separators,
I can't find support for this claim.
http://www.w3.org/TR/xslt#section-Expressions
says XSLT is using XPath expressions, and
http://www.w3.org/TR/xpath/#NT-Number
doesn't seem to allow thousand-separators.
> while JSON does not. Another example, NaN and infinity is
> represented as "null" in JSON.
Good point about NaN and infinity. We should probably offer
parsers for these separate from the "main" floating-point
parser to allow for more flexibility for the caller.
Since JSON can't represent NaN and infinity (which one is
it if you get a "null"?), that's a separate challenge for
the JSON parser.
> If you investigate JSON or XML parsers you will find that they even
> implement their own ctypes. For instance, a whitespace in JSON does not
> match the C locale std::isspace().
Agreed. A JSON parser is still a JSON parser and is probably bad
at parsing XSLT. I do not anticipate standard-provided number
parsers to skip whitespace (deviating from strtod's behavior).
Depending on how much of a "complete" solution we want to offer
in the standard, a function such as
parse_and_skip("set-of-chars-to-skip")
might be helpful. (For null-terminated strings, that function
seems to be called strspn, btw.)
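A string_view counterpart of strspn could look roughly like the sketch below. The name skip_chars and the json_whitespace constant are invented for this example; the four characters are the ones the JSON grammar treats as whitespace, which is a smaller set than std::isspace() accepts in the C locale.

```cpp
#include <algorithm>
#include <string_view>

// Whitespace as defined by the JSON grammar: space, tab, LF, CR.
constexpr std::string_view json_whitespace = " \t\n\r";

// Hypothetical helper: returns the input with every leading character
// that appears in chars_to_skip removed. Works on non-null-terminated input.
inline std::string_view skip_chars(std::string_view input,
                                   std::string_view chars_to_skip) {
    const auto pos = input.find_first_not_of(chars_to_skip);
    // npos means the whole input consists of skippable characters.
    input.remove_prefix(std::min(pos, input.size()));
    return input;
}
```

Keeping this separate from the number parser lets each protocol pick its own skip set, or none at all.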
Jens
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Fri, 12 Jun 2015 06:51:55 -0700 (PDT)
Raw View
Some more Qs:
Should (unsigned) short be supported?
Should (unsigned) char be supported?
Bool?
Should octal input be supported if base = 0?
.
Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Fri, 12 Jun 2015 10:03:30 -0400
Raw View
On 2015-06-12 09:51, Olaf van der Spek wrote:
> Should octal input be supported if base = 0?
IMHO, yes, if by "base = 0" you mean "detect base from prefix in input
string". Particularly, we should support at least what strtol does.
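For reference, this is what strtol does today with base 0. The wrapper and struct names are only for this example; the behaviour shown is that of the standard function, including the "09" case, where parsing stops after the leading zero.

```cpp
#include <cstddef>
#include <cstdlib>

// Example-only result type: parsed value plus number of characters consumed.
struct strtol_result {
    long value;
    std::size_t consumed;
};

// Thin wrapper over std::strtol with base 0 (detect base from the prefix).
inline strtol_result parse_auto_base(const char* text) {
    char* end = nullptr;
    const long value = std::strtol(text, &end, 0);
    return {value, static_cast<std::size_t>(end - text)};
}

// parse_auto_base("0x1F") -> value 31, consumed 4 (hex prefix)
// parse_auto_base("017")  -> value 15, consumed 3 (octal prefix)
// parse_auto_base("42")   -> value 42, consumed 2 (decimal)
// parse_auto_base("09")   -> value 0,  consumed 1 ('9' is not an octal digit)
```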
--
Matthew
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Fri, 12 Jun 2015 08:49:20 -0700 (PDT)
Raw View
On Friday, June 12, 2015 at 9:51:55 AM UTC-4, Olaf van der Spek wrote:
>
> Some more Qs:
> Should (unsigned) short be supported?
> Should (unsigned) char be supported?
>
I would support all integral types. I also believe unsigned types should be
supported, because if you really want that high-order bit without triggering
an overflow error you can't use a signed type. I would also make it an
error to specify a negative literal (i.e. a '-' prefix) if T is an
unsigned integral type.
> Bool?
>
This is tricky because valid strings could be (0, 1), (true, false), (True,
False), (T, F), etc. If the parsing function is named something like
str_to_num(), then it makes sense to support bool with "0" and "1". If it's
more generic, like parse(), then the question of valid inputs becomes more
ambiguous.
>
> Should octal input be supported if base = 0?
>
I would at minimum support all of the prefixes supported by the core
language's integer literals. That is, hex (0x), decimal (no prefix), octal
(0), and binary (0b).
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Fri, 12 Jun 2015 19:14:14 +0200
Raw View
2015-06-12 17:49 GMT+02:00 Matthew Fioravante <fmatthew5876@gmail.com>:
>
>
> On Friday, June 12, 2015 at 9:51:55 AM UTC-4, Olaf van der Spek wrote:
>>
>> Some more Qs:
>> Should (unsigned) short be supported?
>>
>> Should (unsigned) char be supported?
>
>
> I would support all integral types. I also believe unsigned types should be
> supported because if you really want that high order bit without triggering
> an overflow error you can't use a signed type.
Of course
> I would also make it an error
> to try to specify a negative literal (i.e. '-' prefix) if T is unsigned
> integral.
What about "-0"?
"+1"?
>>
>> Bool?
>
>
> This is tricky because valid strings could be (0, 1), (true, false), (True,
> False), (T,F), etc.... If the parsing function is named something like
> str_to_num() then it makes sense to support bool with "0" and "1". If its
> more generic like parse(), then the question of valid inputs becomes more
> ambiguous.
True
There's also the question of what to do with for example -1 and 2. Map
to true or return invalid input?
>> Should octal input be supported if base = 0?
>
>
> I would at minimum support all of the literals supported by the core
> language string literals. That is, hex (0x), decimal (), octal (0), and
> binary (0b).
Does strtol support 0b?
0 as prefix is problematic as 09 for example is probably not intended
as an octal number.
--
Olaf
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Fri, 12 Jun 2015 10:58:10 -0700 (PDT)
Raw View
------=_Part_1299_1999294137.1434131890825
Content-Type: multipart/alternative;
boundary="----=_Part_1300_459935950.1434131890825"
------=_Part_1300_459935950.1434131890825
Content-Type: text/plain; charset=UTF-8
On Friday, June 12, 2015 at 1:14:17 PM UTC-4, Olaf van der Spek wrote:
>
> 2015-06-12 17:49 GMT+02:00 Matthew Fioravante <fmatth...@gmail.com>:
> >
> >
> > On Friday, June 12, 2015 at 9:51:55 AM UTC-4, Olaf van der Spek wrote:
> >>
> >> Some more Qs:
> >> Should (unsigned) short be supported?
> >>
> >> Should (unsigned) char be supported?
> >
> >
> > I would support all integral types. I also believe unsigned types should be
> > supported because if you really want that high order bit without triggering
> > an overflow error you can't use a signed type.
>
> Of course
>
> > I would also make it an error
> > to try to specify a negative literal (i.e. '-' prefix) if T is unsigned
> > integral.
>
> What about "-0"?
>
Good question. Having an exception for zero seems odd. It could depend on
how the error condition is defined. If "-235" results in a kind of "out of
range" error then one could argue that "-0" is in range so it should be
valid. On the other hand, if "-235" is considered a "parsing" error because
'-' is not in the grammar, then I guess "-0" should be rejected as well.
Another argument is that for integers "-0" is not an actual value but more
like taking 0 and negating it with operator-(int). In other words, this is
an expression and not a raw literal integer value. These functions only
parse raw numbers, not mathematical expressions, so "-0" is invalid.
I'd be fine either way. This is borderline bikeshed material but should
definitely be brought up in a proposal.
> "+1"?
>
If "+" prefix is supported for signed, I would support it for unsigned
also.
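For comparison, here is a sketch of a strict parser built on std::from_chars from C++17's <charconv>, which did not exist when this thread was written. The parse_strict name and the optional return type are assumptions for this example. from_chars happens to implement the "grammar" reading for unsigned types: a '-' is only recognized for signed types, a leading '+' is never accepted, and leading whitespace is not skipped.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper, not the interface proposed in the thread.
// Succeeds only if the entire input is a valid number that fits in T.
template <typename T>
std::optional<T> parse_strict(std::string_view s, int base = 10) {
    T value{};
    const char* first = s.data();
    const char* last = s.data() + s.size();
    const std::from_chars_result r = std::from_chars(first, last, value, base);
    if (r.ec != std::errc{} || r.ptr != last) {
        return std::nullopt;  // bad character, out of range, or trailing input
    }
    return value;
}
```

Under this sketch parse_strict<unsigned>("-0") and parse_strict<unsigned>("+1") both fail, while parse_strict<int>("-0") yields 0. Whether that is the right answer for a proposal is exactly the open question discussed above.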
> >>
> >> Bool?
> >
> >
> > This is tricky because valid strings could be (0, 1), (true, false), (True,
> > False), (T,F), etc.... If the parsing function is named something like
> > str_to_num() then it makes sense to support bool with "0" and "1". If its
> > more generic like parse(), then the question of valid inputs becomes more
> > ambiguous.
>
> True
> There's also the question of what to do with for example -1 and 2. Map
> to true or return invalid input?
>
What's the valid range? 32 bits? 64 bits? sizeof(bool) * CHAR_BIT bits?
I would say invalid.
If you want to support parsings like -1 and 2 into bool then parse the
string into an int and then convert the int to bool yourself. I think we
should leave type conversions up to the user and be very strict with
parsing only valid values of the given type.
The parser itself should not do type conversions. This is also why I think
unsigned parsing should not allow negative values.
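The split between a strict parser and a user-side conversion could be sketched as follows. Both function names are invented for this example, the accepted spellings ("0", "1", "true", "false") are just one possible choice, and the int parsing uses std::from_chars from C++17.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical strict parser: only spellings of valid bool values succeed.
inline std::optional<bool> parse_bool_strict(std::string_view s) {
    if (s == "0" || s == "false") return false;
    if (s == "1" || s == "true") return true;
    return std::nullopt;  // "2", "-1", "True", ... are rejected
}

// Hypothetical user-side wrapper: parse as int first, then convert,
// so the widening policy lives in user code rather than in the parser.
inline std::optional<bool> int_text_to_bool(std::string_view s) {
    int value = 0;
    const char* last = s.data() + s.size();
    const std::from_chars_result r = std::from_chars(s.data(), last, value);
    if (r.ec != std::errc{} || r.ptr != last) {
        return std::nullopt;
    }
    return value != 0;
}
```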
Since std::is_integral<bool>::value == true, one could argue that we should
treat bools just like ints. That is, the parse overload for bool has a base
parameter (which actually does nothing since the only valid values are 0
and 1 in all bases). A base of "0" (auto-detect) could support all of the
same prefixes as well as the strings "true" and "false" since those are
literals in C++.
// Ignore errors for this example
template <typename T> T parse(string_view s, int base = 10);

// Explicit base: "0" and "1" mean the same thing in every base
for (int i = 2; i < 16; ++i) {
    assert(!parse<bool>("0", i));
    assert(parse<bool>("1", i));
}
// base 0 (auto-detect), hex prefix
assert(!parse<bool>("0x0", 0));
assert(parse<bool>("0x1", 0));
// decimal, no prefix
assert(!parse<bool>("0", 0));
assert(parse<bool>("1", 0));
// octal prefix
assert(!parse<bool>("00", 0));
assert(parse<bool>("01", 0));
// binary prefix
assert(!parse<bool>("0b0", 0));
assert(parse<bool>("0b1", 0));
// C++ true and false literals
assert(!parse<bool>("false", 0));
assert(parse<bool>("true", 0));
>
> >> Should octal input be supported if base = 0?
> >
> >
> > I would at minimum support all of the literals supported by the core
> > language string literals. That is, hex (0x), decimal (), octal (0), and
> > binary (0b).
>
> Does strtol support 0b?
>
http://en.cppreference.com/w/cpp/string/byte/strtol
No, but neither did C or C++ when the interface was defined. Binary
literals are useful and should be included. We are already breaking
compatibility by removing support for leading white space. I don't see any
strong reason why we have to conform to strtol(). I'd rather use the
integer literal prefixes supported by the core language as a guide.
Compatibility with the core language in this respect makes the interface
easier to understand for novices because they learn one set of numeric
prefixes and it works the same way everywhere. Eventually strtol() and its
ilk will go to the dustbin of history. People will not be referring back to
strtol() to figure out how to use this new interface.
> 0 as prefix is problematic as 09 for example is probably not intended
> as an octal number.
>
If the user wants to accept 09 as a decimal number, they should call
parse<int>(s, 10).
The base=0 mode is a simple helper for the most common use case. If the
user for some reason has unusual requirements, like supporting only hex and
decimal but not octal and binary, then they can write a wrapper which
parses the prefix.
If you want to give the user maximum control, add a third defaulted
argument which is a bitmask of the prefixes to recognize. This would also
allow users to turn off prefix support when parsing with a non-zero base
(see below).
constexpr unsigned hex_prefix = 1u << 16;
constexpr unsigned dec_prefix = 1u << 10;
constexpr unsigned oct_prefix = 1u << 8;
constexpr unsigned bin_prefix = 1u << 2;
// The default must include bit 16, so 0xFFFF would silently drop hex_prefix.
constexpr unsigned all_prefixes =
    hex_prefix | dec_prefix | oct_prefix | bin_prefix;
template <typename T>
T parse(string_view s, int base = 10,
        unsigned prefixes_to_check = all_prefixes);
const auto hex_or_dec = hex_prefix | dec_prefix;
parse<int>("0x9", 0, hex_or_dec);    // OK == 9
parse<int>("9", 0, hex_or_dec);      // OK == 9
parse<int>("09", 0, hex_or_dec);     // OK == 9 (parsed as decimal)
parse<int>("0b1001", 0, hex_or_dec); // Error, bad input character "b"
parse<int>("F", 16);                 // OK == 15
parse<int>("0xF", 16);               // OK == 15
parse<int>("F", 16, 0);              // OK == 15
parse<int>("0xF", 16, 0);            // Error, bad input character "x"
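One way the auto-detect (base 0) half of this idea might be implemented is sketched below. The flag values are copied from the message; the function name parse_prefixed, the unsigned-only result and the optional return type are assumptions made for this example, and the digits are parsed with std::from_chars from C++17. The explicit-base cases from the message are not covered.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Flag values as given in the message above.
constexpr unsigned hex_prefix = 1u << 16;
constexpr unsigned dec_prefix = 1u << 10;
constexpr unsigned oct_prefix = 1u << 8;
constexpr unsigned bin_prefix = 1u << 2;

// Hypothetical helper: picks the base from the prefix, considering only the
// prefixes enabled in `allowed`, then requires the rest to be all digits.
inline std::optional<unsigned> parse_prefixed(std::string_view s,
                                              unsigned allowed) {
    const bool zero_lead = s.size() > 1 && s[0] == '0';
    int base = 0;
    if ((allowed & hex_prefix) && s.size() > 2 && zero_lead &&
        (s[1] == 'x' || s[1] == 'X')) {
        base = 16;
        s.remove_prefix(2);
    } else if ((allowed & bin_prefix) && s.size() > 2 && zero_lead &&
               (s[1] == 'b' || s[1] == 'B')) {
        base = 2;
        s.remove_prefix(2);
    } else if ((allowed & oct_prefix) && zero_lead) {
        base = 8;
        s.remove_prefix(1);
    } else if (allowed & dec_prefix) {
        base = 10;
    } else {
        return std::nullopt;  // no enabled interpretation matches
    }

    unsigned value = 0;
    const char* last = s.data() + s.size();
    const std::from_chars_result r =
        std::from_chars(s.data(), last, value, base);
    if (r.ec != std::errc{} || r.ptr != last) {
        return std::nullopt;
    }
    return value;
}
```

With hex_prefix | dec_prefix enabled, "09" parses as decimal 9 and "0b1001" is rejected, matching the first four examples in the message; with every prefix enabled, "09" is rejected because '9' is not an octal digit.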
------=_Part_1300_459935950.1434131890825
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
<br><br>On Friday, June 12, 2015 at 1:14:17 PM UTC-4, Olaf van der Spek wro=
te:<blockquote class=3D"gmail_quote" style=3D"margin: 0pt 0pt 0pt 0.8ex; bo=
rder-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">2015-06-12 17:=
49 GMT+02:00 Matthew Fioravante <<a href=3D"javascript:" target=3D"_blan=
k" gdf-obfuscated-mailto=3D"jiLGAF4gK8YJ" rel=3D"nofollow" onmousedown=3D"t=
his.href=3D'javascript:';return true;" onclick=3D"this.href=3D'javascript:'=
;return true;">fmatth...@gmail.com</a>>:
<br>>
<br>>
<br>> On Friday, June 12, 2015 at 9:51:55 AM UTC-4, Olaf van der Spek wr=
ote:
<br>>>
<br>>> Some more Qs:
<br>>> Should (unsigned) short be supported?
<br>>>
<br>>> Should (unsigned) char be supported?
<br>>
<br>>
<br>> I would support all integral types. I also believe unsigned types =
should be
<br>> supported because if you really want that high order bit without t=
riggering
<br>> an overflow error you can't use a signed type.
<br>
<br>Of course
<br>
<br>> I would also make it an error
<br>> to try to specify a negative literal (i.e. '-' prefix) if T is uns=
igned
<br>> integral.
<br>
<br>What about "-0"?
<br></blockquote><div><br>Good question. Having an exception for zero seems=
odd. It could depend on how the error condition is defined. If "-235" resu=
lts in a kind of "out of range" error then one could argue that "-0" is in =
range so it should be valid. On the other hand if "-235" is considered a "p=
arsing" error because '-' is not in the grammar, then I guess "-0" should b=
e rejected as well. <br><br>Another argument is that for integers "-0" is n=
ot an actual value but more like taking 0 and negating it with operator-(in=
t). In otherwords, this is an expression and now a raw literal integer valu=
e. These functions only parse raw numbers, not mathematical expressions so =
"-0" is invalid.<br><br>I'd be fine either way. This is borderline bikeshed=
materal but should definately be brought up in a proposal.<br><br></div><b=
lockquote class=3D"gmail_quote" style=3D"margin: 0pt 0pt 0pt 0.8ex; border-=
left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">"+1"?
<br></blockquote><div><br>If "+" prefix is supported for signed, I would su=
pport it for unsigned also. <br><br></div><blockquote class=3D"gmail_quote"=
style=3D"margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 2=
04); padding-left: 1ex;">
<br>>>
<br>>> Bool?
<br>>
<br>>
<br>> This is tricky because valid strings could be (0, 1), (true, false=
), (True,
<br>> False), (T,F), etc.... If the parsing function is named something =
like
<br>> str_to_num() then it makes sense to support bool with "0" and "1".=
If its
<br>> more generic like parse(), then the question of valid inputs becom=
es more
<br>> ambiguous.
<br>
<br>True
<br>There's also the question of what to do with for example -1 and 2. Map
<br>to true or return invalid input?
<br></blockquote><div><br>What's the valid range? 32 bits? 64 bits? sizeof(=
bool) * CHAR_BIT bits? <br><br>I would say invalid. <br><br>If you want to =
parse strings like -1 and 2 into bool, then parse the string into an int an=
d convert the int to bool yourself. I think we should leave type conversion=
s up to the user and be very strict, parsing only valid values of the given=
 type. <br><br>The parser itself should not do type conversions. This is al=
so why I think unsigned parsing should not allow negative values.<br><br>Si=
nce std::is_integral<bool>::value =3D=3D true, one could argue that we shou=
ld treat bools just like ints. That is, the parse overload for bool has a b=
ase parameter (which actually does nothing since the only valid values are =
0 and 1 in all bases). A base of "0" (auto-detect) could support all of the=
 same prefixes as well as the strings "true" and "false" since those are li=
terals in C++.<br><br><div class=3D"prettyprint" style=3D"backgroun=
d-color: rgb(250, 250, 250); border: 1px solid rgb(187, 187, 187); word-wra=
p: break-word;"><code class=3D"prettyprint"></code><div class=3D"subprettyp=
rint"><span style=3D"color: rgb(0, 0, 136);" class=3D"styled-by-prettify">/=
/Ignore errors for this example<br>template <typename T> T parse(string_vie=
w s, int base =3D 10);<br><br>for</span><span style=3D"color: rgb(102, 102,=
 0);" class=
=3D"styled-by-prettify">(</span><span style=3D"color: rgb(0, 0, 136);" clas=
s=3D"styled-by-prettify">int</span><span style=3D"color: rgb(0, 0, 0);" cla=
ss=3D"styled-by-prettify"> i </span><span style=3D"color: rgb(102, 102, 0);=
" class=3D"styled-by-prettify">=3D</span><span style=3D"color: rgb(0, 0, 0)=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: rgb(0, 102, 1=
02);" class=3D"styled-by-prettify">2</span><span style=3D"color: rgb(102, 1=
02, 0);" class=3D"styled-by-prettify">;</span><span style=3D"color: rgb(0, =
0, 0);" class=3D"styled-by-prettify"> i </span><span style=3D"color: rgb(10=
2, 102, 0);" class=3D"styled-by-prettify"><</span><span style=3D"color: =
rgb(0, 0, 0);" class=3D"styled-by-prettify"> </span><span style=3D"color: r=
gb(0, 102, 102);" class=3D"styled-by-prettify">16</span><span style=3D"colo=
r: rgb(102, 102, 0);" class=3D"styled-by-prettify">;</span><span style=3D"c=
olor: rgb(0, 0, 0);" class=3D"styled-by-prettify"> </span><span style=3D"co=
lor: rgb(102, 102, 0);" class=3D"styled-by-prettify">++</span><span style=
=3D"color: rgb(0, 0, 0);" class=3D"styled-by-prettify">i</span><span style=
=3D"color: rgb(102, 102, 0);" class=3D"styled-by-prettify">)</span><span st=
yle=3D"color: rgb(0, 0, 0);" class=3D"styled-by-prettify"> </span><span sty=
le=3D"color: rgb(102, 102, 0);" class=3D"styled-by-prettify">{</span><span =
style=3D"color: rgb(0, 0, 0);" class=3D"styled-by-prettify"><br> </sp=
an><span style=3D"color: rgb(0, 0, 136);" class=3D"styled-by-prettify">asse=
rt</span><span style=3D"color: rgb(102, 102, 0);" class=3D"styled-by-pretti=
fy">(</span><span style=3D"color: rgb(0, 0, 0);" class=3D"styled-by-prettif=
y">!parse</span><span style=3D"color: rgb(0, 136, 0);" class=3D"styled-by-p=
rettify"><bool></span><span style=3D"color: rgb(102, 102, 0);" class=
=3D"styled-by-prettify">(</span><span style=3D"color: rgb(0, 136, 0);" clas=
s=3D"styled-by-prettify">"0"</span><span style=3D"color: rgb(102, 102, 0);"=
class=3D"styled-by-prettify">,</span><span style=3D"color: rgb(0, 0, 0);" =
class=3D"styled-by-prettify"> i</span><span style=3D"color: rgb(102, 102, 0=
);" class=3D"styled-by-prettify">)</span><span style=3D"color: rgb(0, 0, 0)=
;" class=3D"styled-by-prettify"></span><span style=3D"color: rgb(0, 102, 10=
2);" class=3D"styled-by-prettify"></span><span style=3D"color: rgb(102, 102=
, 0);" class=3D"styled-by-prettify">);</span><span style=3D"color: rgb(0, 0=
, 0);" class=3D"styled-by-prettify"><br> </span><span style=3D"color:=
rgb(0, 0, 136);" class=3D"styled-by-prettify">assert</span><span style=3D"=
color: rgb(102, 102, 0);" class=3D"styled-by-prettify">(</span><span style=
=3D"color: rgb(0, 0, 0);" class=3D"styled-by-prettify">parse</span><span st=
yle=3D"color: rgb(0, 136, 0);" class=3D"styled-by-prettify"><bool></s=
pan><span style=3D"color: rgb(102, 102, 0);" class=3D"styled-by-prettify">(=
</span><span style=3D"color: rgb(0, 136, 0);" class=3D"styled-by-prettify">=
"1"</span><span style=3D"color: rgb(102, 102, 0);" class=3D"styled-by-prett=
ify">,</span><span style=3D"color: rgb(0, 0, 0);" class=3D"styled-by-pretti=
fy"> i</span><span style=3D"color: rgb(102, 102, 0);" class=3D"styled-by-pr=
ettify">)</span><span style=3D"color: rgb(0, 102, 102);" class=3D"styled-by=
-prettify"></span><span style=3D"color: rgb(102, 102, 0);" class=3D"styled-=
by-prettify">);</span><span style=3D"color: rgb(0, 0, 0);" class=3D"styled-=
by-prettify"><br></span><span style=3D"color: rgb(102, 102, 0);" class=3D"s=
tyled-by-prettify">}</span><span style=3D"color: rgb(0, 0, 0);" class=3D"st=
yled-by-prettify"><br><br>//hex prefix<br></span><code class=3D"prettyprint=
"><span style=3D"color: rgb(0, 0, 136);" class=3D"styled-by-prettify">asser=
t</span><span style=3D"color: rgb(102, 102, 0);" class=3D"styled-by-prettif=
y">(</span><span style=3D"color: rgb(0, 0, 0);" class=3D"styled-by-prettify=
">!parse</span><span style=3D"color: rgb(0, 136, 0);" class=3D"styled-by-pr=
ettify"><bool></span><span style=3D"color: rgb(102, 102, 0);" class=
=3D"styled-by-prettify">(</span><span style=3D"color: rgb(0, 136, 0);" clas=
s=3D"styled-by-prettify">"0x0"</span><span style=3D"color: rgb(102, 102, 0)=
;" class=3D"styled-by-prettify">,</span><span style=3D"color: rgb(0, 0, 0);=
" class=3D"styled-by-prettify"> </span><span style=3D"color: rgb(0, 102, 10=
2);" class=3D"styled-by-prettify">0</span><span style=3D"color: rgb(102, 10=
2, 0);" class=3D"styled-by-prettify">)</span><span style=3D"color: rgb(0, 0=
, 0);" class=3D"styled-by-prettify"></span><span style=3D"color: rgb(102, 1=
02, 0);" class=3D"styled-by-prettify">);</span><span style=3D"color: rgb(0,=
0, 0);" class=3D"styled-by-prettify"></span><code class=3D"prettyprint"><s=
pan style=3D"color: rgb(0, 0, 0);" class=3D"styled-by-prettify"><br></span>=
<span style=3D"color: rgb(0, 0, 136);" class=3D"styled-by-prettify">assert<=
/span><span style=3D"color: rgb(102, 102, 0);" class=3D"styled-by-prettify"=
>(</span><span style=3D"color: rgb(0, 0, 0);" class=3D"styled-by-prettify">=
parse</span><span style=3D"color: rgb(0, 136, 0);" class=3D"styled-by-prett=
ify"><bool></span><span style=3D"color: rgb(102, 102, 0);" class=3D"s=
tyled-by-prettify">(</span><span style=3D"color: rgb(0, 136, 0);" class=3D"=
styled-by-prettify">"0x1"</span><span style=3D"color: rgb(102, 102, 0);" cl=
ass=3D"styled-by-prettify">,</span><span style=3D"color: rgb(0, 0, 0);" cla=
ss=3D"styled-by-prettify"> </span><span style=3D"color: rgb(0, 102, 102);" =
class=3D"styled-by-prettify">0</span><span style=3D"color: rgb(102, 102, 0)=
;" class=3D"styled-by-prettify">)</span><span style=3D"color: rgb(0, 102, 1=
02);" class=3D"styled-by-prettify"></span><span style=3D"color: rgb(102, 10=
2, 0);" class=3D"styled-by-prettify">);</span><span style=3D"color: rgb(0, =
0, 0);" class=3D"styled-by-prettify"></span></code></code><br>//decimal no =
prefix<br><span style=3D"color: rgb(0, 0, 136);" class=3D"styled-by-prettif=
y">assert</span><span style=3D"color: rgb(102, 102, 0);" class=3D"styled-by=
-prettify">(</span><span style=3D"color: rgb(0, 0, 0);" class=3D"styled-by-=
prettify">!parse</span><span style=3D"color: rgb(0, 136, 0);" class=3D"styl=
ed-by-prettify"><bool></span><span style=3D"color: rgb(102, 102, 0);"=
class=3D"styled-by-prettify">(</span><span style=3D"color: rgb(0, 136, 0);=
" class=3D"styled-by-prettify">"0"</span><span style=3D"color: rgb(102, 102=
, 0);" class=3D"styled-by-prettify">,</span><span style=3D"color: rgb(0, 0,=
0);" class=3D"styled-by-prettify"> </span><span style=3D"color: rgb(0, 102=
, 102);" class=3D"styled-by-prettify">0</span><span style=3D"color: rgb(102=
, 102, 0);" class=3D"styled-by-prettify">)</span><span style=3D"color: rgb(=
0, 0, 0);" class=3D"styled-by-prettify"></span><span style=3D"color: rgb(10=
2, 102, 0);" class=3D"styled-by-prettify">);</span><span style=3D"color: rg=
b(0, 0, 0);" class=3D"styled-by-prettify"></span><code class=3D"prettyprint=
"><span style=3D"color: rgb(0, 0, 0);" class=3D"styled-by-prettify"><br></s=
pan><span style=3D"color: rgb(0, 0, 136);" class=3D"styled-by-prettify">ass=
ert</span><span style=3D"color: rgb(102, 102, 0);" class=3D"styled-by-prett=
ify">(</span><span style=3D"color: rgb(0, 0, 0);" class=3D"styled-by-pretti=
fy">parse</span><span style=3D"color: rgb(0, 136, 0);" class=3D"styled-by-p=
rettify"><bool></span><span style=3D"color: rgb(102, 102, 0);" class=
=3D"styled-by-prettify">(</span><span style=3D"color: rgb(0, 136, 0);" clas=
s=3D"styled-by-prettify">"1"</span><span style=3D"color: rgb(102, 102, 0);"=
class=3D"styled-by-prettify">,</span><span style=3D"color: rgb(0, 0, 0);" =
class=3D"styled-by-prettify"> </span><span style=3D"color: rgb(0, 102, 102)=
;" class=3D"styled-by-prettify">0</span><span style=3D"color: rgb(102, 102,=
0);" class=3D"styled-by-prettify">)</span><span style=3D"color: rgb(0, 102=
, 102);" class=3D"styled-by-prettify"></span><span style=3D"color: rgb(102,=
102, 0);" class=3D"styled-by-prettify">);</span><span style=3D"color: rgb(=
0, 0, 0);" class=3D"styled-by-prettify"><br>//octal prefix<br></span></code=
><code class=3D"prettyprint"><span style=3D"color: rgb(0, 0, 136);" class=
=3D"styled-by-prettify">assert</span><span style=3D"color: rgb(102, 102, 0)=
;" class=3D"styled-by-prettify">(</span><span style=3D"color: rgb(0, 0, 0);=
" class=3D"styled-by-prettify">!parse</span><span style=3D"color: rgb(0, 13=
6, 0);" class=3D"styled-by-prettify"><bool></span><span style=3D"colo=
r: rgb(102, 102, 0);" class=3D"styled-by-prettify">(</span><span style=3D"c=
olor: rgb(0, 136, 0);" class=3D"styled-by-prettify">"00"</span><span style=
=3D"color: rgb(102, 102, 0);" class=3D"styled-by-prettify">,</span><span st=
yle=3D"color: rgb(0, 0, 0);" class=3D"styled-by-prettify"> </span><span sty=
le=3D"color: rgb(0, 102, 102);" class=3D"styled-by-prettify">0</span><span =
style=3D"color: rgb(102, 102, 0);" class=3D"styled-by-prettify">)</span><sp=
an style=3D"color: rgb(0, 0, 0);" class=3D"styled-by-prettify"></span><span=
style=3D"color: rgb(102, 102, 0);" class=3D"styled-by-prettify">);</span><=
span style=3D"color: rgb(0, 0, 0);" class=3D"styled-by-prettify"></span><co=
de class=3D"prettyprint"><span style=3D"color: rgb(0, 0, 0);" class=3D"styl=
ed-by-prettify"><br></span><span style=3D"color: rgb(0, 0, 136);" class=3D"=
styled-by-prettify">assert</span><span style=3D"color: rgb(102, 102, 0);" c=
lass=3D"styled-by-prettify">(</span><span style=3D"color: rgb(0, 0, 0);" cl=
ass=3D"styled-by-prettify">parse</span><span style=3D"color: rgb(0, 136, 0)=
;" class=3D"styled-by-prettify"><bool></span><span style=3D"color: rg=
b(102, 102, 0);" class=3D"styled-by-prettify">(</span><span style=3D"color:=
rgb(0, 136, 0);" class=3D"styled-by-prettify">"01"</span><span style=3D"co=
lor: rgb(102, 102, 0);" class=3D"styled-by-prettify">,</span><span style=3D=
"color: rgb(0, 0, 0);" class=3D"styled-by-prettify"> </span><span style=3D"=
color: rgb(0, 102, 102);" class=3D"styled-by-prettify">0</span><span style=
=3D"color: rgb(102, 102, 0);" class=3D"styled-by-prettify">)</span><span st=
yle=3D"color: rgb(0, 102, 102);" class=3D"styled-by-prettify"></span><span =
style=3D"color: rgb(102, 102, 0);" class=3D"styled-by-prettify">);</span><s=
pan style=3D"color: rgb(0, 0, 0);" class=3D"styled-by-prettify"><br>//binar=
y prefix<br></span></code></code><code class=3D"prettyprint"><code class=3D=
"prettyprint"><span style=3D"color: rgb(0, 0, 136);" class=3D"styled-by-pre=
ttify">assert</span><span style=3D"color: rgb(102, 102, 0);" class=3D"style=
d-by-prettify">(</span><span style=3D"color: rgb(0, 0, 0);" class=3D"styled=
-by-prettify">!parse</span><span style=3D"color: rgb(0, 136, 0);" class=3D"=
styled-by-prettify"><bool></span><span style=3D"color: rgb(102, 102, =
0);" class=3D"styled-by-prettify">(</span><span style=3D"color: rgb(0, 136,=
0);" class=3D"styled-by-prettify">"0b0"</span><span style=3D"color: rgb(10=
2, 102, 0);" class=3D"styled-by-prettify">,</span><span style=3D"color: rgb=
(0, 0, 0);" class=3D"styled-by-prettify"> </span><span style=3D"color: rgb(=
0, 102, 102);" class=3D"styled-by-prettify">0</span><span style=3D"color: r=
gb(102, 102, 0);" class=3D"styled-by-prettify">)</span><span style=3D"color=
: rgb(0, 0, 0);" class=3D"styled-by-prettify"></span><span style=3D"color: =
rgb(102, 102, 0);" class=3D"styled-by-prettify">);</span><span style=3D"col=
or: rgb(0, 0, 0);" class=3D"styled-by-prettify"></span><code class=3D"prett=
yprint"><span style=3D"color: rgb(0, 0, 0);" class=3D"styled-by-prettify"><=
br></span><span style=3D"color: rgb(0, 0, 136);" class=3D"styled-by-prettif=
y">assert</span><span style=3D"color: rgb(102, 102, 0);" class=3D"styled-by=
-prettify">(</span><span style=3D"color: rgb(0, 0, 0);" class=3D"styled-by-=
prettify">parse</span><span style=3D"color: rgb(0, 136, 0);" class=3D"style=
d-by-prettify"><bool></span><span style=3D"color: rgb(102, 102, 0);" =
class=3D"styled-by-prettify">(</span><span style=3D"color: rgb(0, 136, 0);"=
class=3D"styled-by-prettify">"0b1"</span><span style=3D"color: rgb(102, 10=
2, 0);" class=3D"styled-by-prettify">,</span><span style=3D"color: rgb(0, 0=
, 0);" class=3D"styled-by-prettify"> </span><span style=3D"color: rgb(0, 10=
2, 102);" class=3D"styled-by-prettify">0</span><span style=3D"color: rgb(10=
2, 102, 0);" class=3D"styled-by-prettify">)</span><span style=3D"color: rgb=
(0, 102, 102);" class=3D"styled-by-prettify"></span><span style=3D"color: r=
gb(102, 102, 0);" class=3D"styled-by-prettify">);</span><span style=3D"colo=
r: rgb(0, 0, 0);" class=3D"styled-by-prettify"></span></code></code></code>=
<br>//C++ true and false literals<br><code class=3D"prettyprint"><code clas=
s=3D"prettyprint"><code class=3D"prettyprint"><span style=3D"color: rgb(0, =
0, 136);" class=3D"styled-by-prettify">assert</span><span style=3D"color: r=
gb(102, 102, 0);" class=3D"styled-by-prettify">(</span><span style=3D"color=
: rgb(0, 0, 0);" class=3D"styled-by-prettify">!parse</span><span style=3D"c=
olor: rgb(0, 136, 0);" class=3D"styled-by-prettify"><bool></span><spa=
n style=3D"color: rgb(102, 102, 0);" class=3D"styled-by-prettify">(</span><=
span style=3D"color: rgb(0, 136, 0);" class=3D"styled-by-prettify">"false"<=
/span><span style=3D"color: rgb(102, 102, 0);" class=3D"styled-by-prettify"=
>,</span><span style=3D"color: rgb(0, 0, 0);" class=3D"styled-by-prettify">=
</span><span style=3D"color: rgb(0, 102, 102);" class=3D"styled-by-prettif=
y">0</span><span style=3D"color: rgb(102, 102, 0);" class=3D"styled-by-pret=
tify">)</span><span style=3D"color: rgb(0, 0, 0);" class=3D"styled-by-prett=
ify"></span><span style=3D"color: rgb(102, 102, 0);" class=3D"styled-by-pre=
ttify">);</span><span style=3D"color: rgb(0, 0, 0);" class=3D"styled-by-pre=
ttify"></span><code class=3D"prettyprint"><span style=3D"color: rgb(0, 0, 0=
);" class=3D"styled-by-prettify"><br></span><span style=3D"color: rgb(0, 0,=
136);" class=3D"styled-by-prettify">assert</span><span style=3D"color: rgb=
(102, 102, 0);" class=3D"styled-by-prettify">(</span><span style=3D"color: =
rgb(0, 0, 0);" class=3D"styled-by-prettify">parse</span><span style=3D"colo=
r: rgb(0, 136, 0);" class=3D"styled-by-prettify"><bool></span><span s=
tyle=3D"color: rgb(102, 102, 0);" class=3D"styled-by-prettify">(</span><spa=
n style=3D"color: rgb(0, 136, 0);" class=3D"styled-by-prettify">"true"</spa=
n><span style=3D"color: rgb(102, 102, 0);" class=3D"styled-by-prettify">,</=
span><span style=3D"color: rgb(0, 0, 0);" class=3D"styled-by-prettify"> </s=
pan><span style=3D"color: rgb(0, 102, 102);" class=3D"styled-by-prettify">0=
</span><span style=3D"color: rgb(102, 102, 0);" class=3D"styled-by-prettify=
">)</span><span style=3D"color: rgb(0, 102, 102);" class=3D"styled-by-prett=
ify"></span><span style=3D"color: rgb(102, 102, 0);" class=3D"styled-by-pre=
ttify">);</span><span style=3D"color: rgb(0, 0, 0);" class=3D"styled-by-pre=
ttify"></span></code></code></code></code><br><br><br></div></div><br><br>&=
nbsp;<br></div><blockquote class=3D"gmail_quote" style=3D"margin: 0pt 0pt 0=
pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
<br>>> Should octal input be supported if base =3D 0?
<br>>
<br>>
<br>> I would at minimum support all of the literals supported by the co=
re
<br>> language string literals. That is, hex (0x), decimal (), octal (0)=
, and
<br>> binary (0b).
<br>
<br>Does strtol support 0b?<br></blockquote><div><br>http://en.cppreference.=
com/w/cpp/string/byte/strtol<br><br>No, but neither did C or C++ when the =
interface was defined. Binary literals are useful and should be included. W=
e are already breaking compatibility by removing support for leading white =
space. I don't see any strong reason why we have to conform to strtol(). I'=
d rather use the integer literal prefixes supported by the core language as=
a guide. Compatibility with the core language in this respect makes the in=
terface easier to understand for novices because they learn one set of nume=
ric prefixes and it works the same way everywhere. Eventually strtol() and =
its ilk will go to the dustbin of history. People will not be referring bac=
k to strtol() to figure out how to use this new interface.<br> <br></d=
iv><blockquote class=3D"gmail_quote" style=3D"margin: 0pt 0pt 0pt 0.8ex; bo=
rder-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">0 as prefix is=
problematic as 09 for example is probably not intended
<br>as an octal number.
<br></blockquote><div><br>If the user wants to accept 09 as a decimal numbe=
r, they should call parse<int>(s, 10). <br><br>The base=3D0 is a simp=
le helper utility for the most common use case. If the user for some reason =
has unusual requirements, like supporting only hex and decimal but not octal=
 and binary, then they can write a wrapper which parses the prefix. <br><br=
>If you want to give the user maximum control, make a third defaulted out a=
rgument which is a bitmask of what prefixes are requested. This would also =
allow users to turn off prefix support when parsing a non-zero base (see be=
low).<br><br><div class=3D"prettyprint" style=3D"background-color: rgb(250,=
250, 250); border: 1px solid rgb(187, 187, 187); word-wrap: break-word;"><=
code class=3D"prettyprint"></code><div class=3D"subprettyprint"><span style=
=3D"color: rgb(102, 0, 102);" class=3D"styled-by-prettify">//Bit N selects =
the prefix for base N<br>constexpr unsigned int hex_prefix =3D 1u << 16;<br=
>constexpr unsigned int dec_prefix =3D 1u << 10;<br>constexpr unsigned int =
oct_prefix =3D 1u << 8;<br>constexpr unsigned int bin_prefix =3D 1u << 2;<b=
r><br>template <typename T><br>T parse(string_view s, int base=3D10, unsign=
ed int prefixes_to_check=3D~0u);<br><br>auto hex_or_dec =3D hex_prefix | de=
c_prefix;<br><br>parse<int>("0x9", 0, hex_or_dec); //Ok =3D=3D 9<br>parse<i=
nt>("9" , 0, hex_or_dec); //Ok =3D=3D 9<br>parse<int>("09", 0, hex_or_dec);=
 //Ok =3D=3D 9 (parsed as decimal)<br>parse<int>("0b1001", 0, hex_or_dec); =
//Error bad input character "b"<br><br>parse<int>("F", 16); //Ok =3D=3D 15<=
br>parse<int>("0xF", 16); //Ok =3D=3D 15<br>parse<int>("F", 16, 0); //Ok =
=3D=3D 15<br>parse<int>("0xF", 16, 0); //Error, bad input character "x"<br>=
</span><span style=3D"color: rgb(10=
2, 102, 0);" class=3D"styled-by-prettify"></span></div></div><br><br></div>
------=_Part_1300_459935950.1434131890825--
------=_Part_1299_1999294137.1434131890825--
.
Author: Thiago Macieira <thiago@macieira.org>
Date: Fri, 12 Jun 2015 15:33:34 -0700
Raw View
On Friday 12 June 2015 10:58:10 Matthew Fioravante wrote:
> If you want to give the user maximum control, make a third defaulted out
> argument which is a bitmask of what prefixes are requested. This would also
> allow users to turn off prefix support when parsing a non-zero base (see
> below).
That requires a 64-bit parameter for the bitfield mask, as there are 35
options (bases 2 to 36). Not the end of the world, but will require two
parameter slots on 32-bit systems.
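
To put numbers on this, here is a minimal sketch. It assumes, as the
"hex_prefix = 1 << 16" style constants suggest, that bit N of the mask stands
for base N. The helper names base_bit and all_bases are invented for
illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical encoding: bit N of the mask selects base N (2..36).
constexpr std::uint64_t base_bit(int base) {
    return std::uint64_t{1} << base;
}

// Mask with every base from 2 to 36 enabled (needs C++14 constexpr loops).
constexpr std::uint64_t all_bases() {
    std::uint64_t mask = 0;
    for (int b = 2; b <= 36; ++b) mask |= base_bit(b);
    return mask;
}

// Bit 36 lies beyond the 32 bits of a typical unsigned int.
static_assert(base_bit(36) > 0xFFFFFFFFull, "base 36 needs more than 32 bits");
```

Under that encoding the highest bit used is bit 36, so a 32-bit mask cannot
represent every base.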
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358
.
Author: Magnus Fromreide <magfr@lysator.liu.se>
Date: Sat, 13 Jun 2015 00:53:34 +0200
Raw View
On Fri, Jun 12, 2015 at 10:58:10AM -0700, Matthew Fioravante wrote:
>
>
> If "+" prefix is supported for signed, I would support it for unsigned
> also.
I see a big if looming there.
Leading pluses make it impossible to know what input the parse routine got,
since they create two possible inputs that give the same output.
> > >> Bool?
> > >
> > >
> > > This is tricky because valid strings could be (0, 1), (true, false),
> > (True,
> > > False), (T,F), etc.... If the parsing function is named something like
> > > str_to_num() then it makes sense to support bool with "0" and "1". If
> > its
> > > more generic like parse(), then the question of valid inputs becomes
> > more
> > > ambiguous.
> >
> > True
> > There's also the question of what to do with for example -1 and 2. Map
> > to true or return invalid input?
> >
>
> Since std::is_integral<bool>::value == true, one could argue that we should treat
> bools just like ints. That is, the parse overload for bool has a base
> parameter (which actually does nothing since the only valid values are 0
> and 1 in all bases). A base of "0" (auto-detect) could support all of the
> same prefixes as well as the strings "true" and "false" since those are
> literals in C++.
Only as long as we make dead sure that nothing tries to drag in the localization
code; that risk is why I am mildly against support for true/false.
/MF
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Sat, 13 Jun 2015 14:14:48 +0200
Raw View
2015-06-13 0:33 GMT+02:00 Thiago Macieira <thiago@macieira.org>:
> On Friday 12 June 2015 10:58:10 Matthew Fioravante wrote:
>> If you want to give the user maximum control, make a third defaulted out
>> argument which is a bitmask of what prefixes are requested. This would also
>> allow users to turn off prefix support when parsing a non-zero base (see
>> below).
>
> That requires a 64-bit parameter for the bitfield mask, as there are 35
> options (bases 2 to 36). Not the end of the world, but will require two
> parameter slots on 32-bit systems.
Why's that? Most of the bases don't have a prefix so can only be
selected manually.
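
A rough sketch of that idea: only the bases that have a literal prefix get a
flag, so a handful of bits is enough. The enum and the detect_base function
are hypothetical names, not taken from any proposal text, and std::string
stands in for string_view to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical flags: one bit per base that has a literal prefix.
enum prefix_flags : unsigned {
    prefix_none = 0,
    prefix_bin  = 1u << 0,  // "0b" / "0B"
    prefix_oct  = 1u << 1,  // leading "0" followed by an octal digit
    prefix_hex  = 1u << 2,  // "0x" / "0X"
    prefix_all  = prefix_bin | prefix_oct | prefix_hex
};

// Returns the base implied by the start of `s`, honouring only the
// prefixes enabled in `allowed`; falls back to decimal otherwise.
// `skip` receives the number of prefix characters to drop.
inline int detect_base(const std::string& s, unsigned allowed, std::size_t& skip) {
    skip = 0;
    if (s.size() >= 2 && s[0] == '0') {
        const char c = s[1];
        if ((allowed & prefix_hex) && (c == 'x' || c == 'X')) { skip = 2; return 16; }
        if ((allowed & prefix_bin) && (c == 'b' || c == 'B')) { skip = 2; return 2; }
        if ((allowed & prefix_oct) && c >= '0' && c <= '7')   { skip = 1; return 8; }
    }
    return 10;
}
```

In this sketch "09" falls back to decimal even when the octal prefix is
enabled, because '9' is not an octal digit. That is one possible answer to
the "09" question, not the only one.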
--
Olaf
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Sat, 13 Jun 2015 06:38:53 -0700 (PDT)
Raw View
------=_Part_1921_1348136671.1434202733790
Content-Type: multipart/alternative;
boundary="----=_Part_1922_33070000.1434202733790"
------=_Part_1922_33070000.1434202733790
Content-Type: text/plain; charset=UTF-8
On Friday, June 12, 2015 at 6:53:38 PM UTC-4, Magnus Fromreide wrote:
>
> On Fri, Jun 12, 2015 at 10:58:10AM -0700, Matthew Fioravante wrote:
> >
> >
> > If "+" prefix is supported for signed, I would support it for unsigned
> > also.
>
> I see a big if looming there.
>
> Leading pluses make it impossible to know what input the parse routine got,
> since they create two possible inputs that give the same output.
>
Why is it important to know which input string produced a value? And anyway
you already have this problem because you don't know what base was used for
the input string.
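
Both points can be checked against the existing strtol, which accepts an
optional sign and, with base 0, the 0x and 0 prefixes. A small sketch (the
helper name parses_fully_to is made up for this example):

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: true if strtol consumes all of `text` in the
// given base and yields `expected`.
inline bool parses_fully_to(const char* text, int base, long expected) {
    char* end = nullptr;
    const long value = std::strtol(text, &end, base);
    return end == text + std::strlen(text) && value == expected;
}
```

With this helper, "1" and "+1" in base 10 both yield 1, and "16", "0x10" and
"020" in base 0 all yield 16, so the value alone never identifies the input
string.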
>
> > > >> Bool?
> > > >
> > > >
> > > > This is tricky because valid strings could be (0, 1), (true, false),
> > > (True,
> > > > False), (T,F), etc.... If the parsing function is named something
> like
> > > > str_to_num() then it makes sense to support bool with "0" and "1".
> If
> > > its
> > > > more generic like parse(), then the question of valid inputs becomes
> > > more
> > > > ambiguous.
> > >
> > > True
> > > There's also the question of what to do with for example -1 and 2. Map
> > > to true or return invalid input?
> > >
> >
> > Since std::is_integral<bool>::value == true, one could argue that we should
> treat
> > bools just like ints. That is, the parse overload for bool has a base
> > parameter (which actually does nothing since the only valid values are 0
> > and 1 in all bases). A base of "0" (auto-detect) could support all of
> the
> > same prefixes as well as the strings "true" and "false" since those are
> > literals in C++.
>
> As long as we make dead sure that nothing tries to drag in the
> localization
> code, and that is why I am mildly against support for true/false.
>
Agreed, if true/false means locales and all that, I'd rather skip it. But we
already have an 'x' and a 'b' for the hex and binary prefixes.
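
One way to accept true/false without touching locales is to compare raw bytes
against fixed ASCII spellings, the same way the 'x' and 'b' prefixes are
matched. A minimal sketch, assuming a hypothetical parse_bool with a plain
out-parameter for the result:

```cpp
#include <cassert>
#include <string>

// Hypothetical: accepts exactly "0", "1", "false", "true" (ASCII, case
// sensitive). No <locale> or std::boolalpha machinery is involved.
inline bool parse_bool(const std::string& s, bool& out) {
    if (s == "1" || s == "true")  { out = true;  return true; }
    if (s == "0" || s == "false") { out = false; return true; }
    return false;  // "2", "-1", "True", "T" and "" are all rejected
}
```

Rejecting "2" and "-1" here follows the strict reading argued for earlier in
the thread; mapping them to true would be a type conversion left to the
caller.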
------=_Part_1922_33070000.1434202733790--
------=_Part_1921_1348136671.1434202733790--
.
Author: Miro Knejp <miro.knejp@gmail.com>
Date: Sat, 13 Jun 2015 16:17:01 +0200
Raw View
This is a multi-part message in MIME format.
--------------030609000400030009080807
Content-Type: text/plain; charset=UTF-8; format=flowed
There should definitely be a way to disable the "+" prefix and treat it
as an error. Not all data or interchange formats allow a leading plus and
in some aviation protocols I know (but can't talk about) anything that
is not set in stone in the specification is an error. A number parser
where the plus sign cannot be disabled is useless in such situations as
it would not pass certification. Sure, I could check for the presence of
"+" manually first, but why do the same work twice?
The thread so far revolves heavily around parsing C-syntax numbers,
as seen in the prefix discussion. This is quite restrictive and should
only be an addition on top of the absolute basic number parsing. The
numeric (ASCII) decimal/hex digits are the same in every
programming/scripting language and textual file format; prefixes and
other extras are not. If these functions are to be the basic foundation
for all number parsing, so that nobody has to roll their own as is
currently the case, then all these extras must be optional.
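
A rough sketch of that layering, using hypothetical names (parse_options,
parse_unsigned): the core accepts nothing but digits of the requested base,
and a leading plus is an opt-in extra that is off by default. std::string
stands in for string_view, and the letter-digit arithmetic assumes an
ASCII-compatible character set.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>

// Hypothetical option set; every extra is off unless requested.
struct parse_options {
    bool allow_plus = false;  // accept a single leading '+'
};

// Parses all of `s` as an unsigned number in `base` (2..36).
// Returns false on empty input, stray characters, or overflow.
inline bool parse_unsigned(const std::string& s, unsigned long& out,
                           int base = 10, parse_options opt = {}) {
    if (base < 2 || base > 36) return false;
    std::size_t i = 0;
    if (opt.allow_plus && !s.empty() && s[0] == '+') i = 1;
    if (i == s.size()) return false;  // no digits at all

    const unsigned long max = std::numeric_limits<unsigned long>::max();
    unsigned long value = 0;
    for (; i < s.size(); ++i) {
        const char c = s[i];
        int digit;
        if (c >= '0' && c <= '9')      digit = c - '0';
        else if (c >= 'a' && c <= 'z') digit = c - 'a' + 10;  // ASCII assumed
        else if (c >= 'A' && c <= 'Z') digit = c - 'A' + 10;  // ASCII assumed
        else return false;             // includes '+', '-', spaces
        if (digit >= base) return false;
        if (value > (max - digit) / base) return false;  // would overflow
        value = value * base + digit;
    }
    out = value;
    return true;
}
```

With the defaults, "+123" is rejected without a separate pre-check; a caller
that wants the plus sign sets allow_plus, and prefix handling could be layered
on top in the same way.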
On 13.06.2015 at 15:38, Matthew Fioravante wrote:
>
>
> On Friday, June 12, 2015 at 6:53:38 PM UTC-4, Magnus Fromreide wrote:
>
> On Fri, Jun 12, 2015 at 10:58:10AM -0700, Matthew Fioravante wrote:
> >
> >
> > If "+" prefix is supported for signed, I would support it for
> unsigned
> > also.
>
> I see a big if looming there.
>
> Leading pluses make it impossible to know what input the parse
> routine got
> since it creates two possible inputs that give the same output.
>
>
> Why is it important to know which input string produced a value? And
> anyway you already have this problem because you don't know what base
> was used for the input string.
>
>
> > > >> Bool?
> > > >
> > > >
> > > > This is tricky because valid strings could be (0, 1), (true,
> false),
> > > (True,
> > > > False), (T,F), etc.... If the parsing function is named
> something like
> > > > str_to_num() then it makes sense to support bool with "0"
> and "1". If
> > > its
> > > > more generic like parse(), then the question of valid inputs
> becomes
> > > more
> > > > ambiguous.
> > >
> > > True
> > > There's also the question of what to do with for example -1
> and 2. Map
> > > to true or return invalid input?
> > >
> >
> > Since std::is_integral<bool>::value == true, one could argue that we
> should treat
> > bools just like ints. That is, the parse overload for bool has a
> base
> > parameter (which actually does nothing since the only valid
> values are 0
> > and 1 in all bases). A base of "0" (auto-detect) could support
> all of the
> > same prefixes as well as the strings "true" and "false" since
> those are
> > literals in C++.
>
> As long as we make dead sure that nothing tries to drag in the
> localization
> code, and that is why I am mildly against support for true/false.
>
>
> Agreed, if true/false means locales and all that, I'd rather skip it. But
> we already have an 'x' and a 'b' for hex and binary prefix.
.
Author: Jens Maurer <Jens.Maurer@gmx.net>
Date: Sun, 14 Jun 2015 23:10:25 +0200
Raw View
On 06/13/2015 04:17 PM, Miro Knejp wrote:
> The thread so far is heavily revolving around parsing C-syntax
> numbers as seen by the prefix discussion. This is quite restrictive
> and should only be an addition on top of the absolute basic number
> parsing. The numeric (ASCII) decimal/hex digits are the same in every
> programming/scripting language or textual file formats, prefixes and
> other stuff are not. If these functions are to be the basic
> foundation for all number parsing that makes it obsolete for everyone
> to roll their own as is currently the case then all these extras must
> be optional.
Yes, fully agreed.
Jens
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Mon, 15 Jun 2015 13:10:33 +0200
Raw View
2015-06-13 16:17 GMT+02:00 Miro Knejp <miro.knejp@gmail.com>:
> There should definitely be a way to disable the "+" prefix and treat it as
> error. Not all data or interchange formats allow a leading plus and in some
> aviation protocols I know (but can't talk about) anything that is not set in
> stone in the specification is an error. A number parser where the plus sign
> cannot be disabled is useless in such situations as it would not pass
> certification. Sure, I could check for the presence of "+" manually first,
> but why do the same work twice?
What about minus? Leading zeros?
Disallowing things is certainly necessary in some cases but I'm afraid
it makes the proposal more complex.
> The thread so far is heavily revolving around parsing C-syntax numbers as
> seen by the prefix discussion. This is quite restrictive and should only be
> an addition on top of the absolute basic number parsing. The numeric (ASCII)
> decimal/hex digits are the same in every programming/scripting language or
> textual file formats, prefixes and other stuff are not. If these functions
> are to be the basic foundation for all number parsing that makes it obsolete
> for everyone to roll their own as is currently the case then all these
> extras must be optional.
Auto-detection of base when base = 0 is only a small part of this function.
.
Author: Miro Knejp <miro.knejp@gmail.com>
Date: Mon, 15 Jun 2015 16:59:04 +0200
Raw View
On 15.06.2015 at 13:10, Olaf van der Spek wrote:
> 2015-06-13 16:17 GMT+02:00 Miro Knejp <miro.knejp@gmail.com>:
>> There should definitely be a way to disable the "+" prefix and treat it as
>> error. Not all data or interchange formats allow a leading plus and in some
>> aviation protocols I know (but can't talk about) anything that is not set in
>> stone in the specification is an error. A number parser where the plus sign
>> cannot be disabled is useless in such situations as it would not pass
>> certification. Sure, I could check for the presence of "+" manually first,
>> but why do the same work twice?
> What about minus? Leading zeros?
> Disallowing things is certainly necessary in some cases but I'm afraid
> it makes the proposal more complex.
In fact it's the opposite. A proposal that does nothing but raw number
parsing without any prefix support or other extras is probably the least
complexity you can have. Everything else can be added on top of that. I
can imagine having proposals that are layered on top of others.
>> The thread so far is heavily revolving around parsing C-syntax numbers as
>> seen by the prefix discussion. This is quite restrictive and should only be
>> an addition on top of the absolute basic number parsing. The numeric (ASCII)
>> decimal/hex digits are the same in every programming/scripting language or
>> textual file formats, prefixes and other stuff are not. If these functions
>> are to be the basic foundation for all number parsing that makes it obsolete
>> for everyone to roll their own as is currently the case then all these
>> extras must be optional.
> Auto-detection of base when base = 0 is only a small part of this function.
>
It's not just about bases, it's the assumption that everyone wants or
needs parsing functions that are centered around the C syntax for
numeric literals. I just want people to be aware that there are plenty
of places where such a syntax is not supported or allowed and forcing
these extras into the standard library just means people are going to
*yet again* roll their own implementation and the status quo remains.
There is a difference between providing the foundation on which to build
up more complex parsers, and completely locking people out of using
these facilities because the standard library functions make too many
assumptions about usage scenarios.
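The layering argued for here can be sketched as two functions: a foundation that accepts nothing but digits, and an optional layer that adds C-style prefixes on top of it. This sketch uses std::from_chars from C++17's <charconv>, which postdates this thread; parse_raw and parse_c_literal are hypothetical names chosen only for illustration.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Foundation layer: digits only. std::from_chars accepts no whitespace,
// no base prefix and no '+', and for unsigned types no '-' either.
// The whole input must be consumed.
inline std::optional<unsigned> parse_raw(std::string_view s, int base)
{
    unsigned value = 0;
    const char* last = s.data() + s.size();
    const auto result = std::from_chars(s.data(), last, value, base);
    if (result.ec != std::errc{} || result.ptr != last) return std::nullopt;
    return value;
}

// Optional layer: C-style base prefixes, built purely on top of parse_raw.
inline std::optional<unsigned> parse_c_literal(std::string_view s)
{
    const auto starts_with = [&](std::string_view p) {
        return s.substr(0, p.size()) == p;
    };
    if (starts_with("0x") || starts_with("0X")) return parse_raw(s.substr(2), 16);
    if (starts_with("0b") || starts_with("0B")) return parse_raw(s.substr(2), 2);
    if (s.size() > 1 && s.front() == '0') return parse_raw(s.substr(1), 8);
    return parse_raw(s, 10);
}
```

A format that forbids prefixes calls parse_raw directly; code that wants C literal syntax calls parse_c_literal. Neither caller pays for, or has to work around, the other's rules.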
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Mon, 15 Jun 2015 17:41:21 +0200
Raw View
2015-06-15 16:59 GMT+02:00 Miro Knejp <miro.knejp@gmail.com>:
>> What about minus? Leading zeros?
>> Disallowing things is certainly necessary in some cases but I'm afraid
>> it makes the proposal more complex.
>
> In fact it's the opposite. A proposal that does nothing but raw number
> parsing without any prefix support or other extras is probably the least
> complexity you can have. Everything else can be added on top of that. I can
> imagine having proposals that are layered on top of others.
A proposal / parser not supporting signed numbers is not complete enough IMO.
That said, what about a parse_unsigned variant that doesn't parse
signs (but is still defined for signed types too)?
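One possible shape for such a parse_unsigned, as a sketch only: it is written on top of std::from_chars (C++17, so later than this thread), and the optional-returning signature is an assumption rather than something the thread has settled on.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>
#include <type_traits>

// Parses digits only, with no sign of any kind, into any integer type T
// supported by std::from_chars. Signed T is allowed; the result is then
// simply never negative.
template <typename T>
std::optional<T> parse_unsigned(std::string_view s, int base = 10)
{
    static_assert(std::is_integral_v<T> && !std::is_same_v<T, bool>,
                  "T must be a non-bool integer type");
    // std::from_chars would accept '-' for signed T, so reject it up front.
    if (s.empty() || s.front() == '-') return std::nullopt;
    T value{};
    const char* last = s.data() + s.size();
    const auto result = std::from_chars(s.data(), last, value, base);
    if (result.ec != std::errc{} || result.ptr != last) return std::nullopt;
    return value;
}
```

For example, parse_unsigned<int>("42") yields 42, while "-42" and "+42" are both rejected.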
>> Auto-detection of base when base = 0 is only a small part of this
>> function.
>>
> It's not just about bases, it's the assumption that everyone wants or needs
> parsing functions that are centered around the C syntax for numeric
> literals.
What's C-specific besides bases?
--
Olaf
.
Author: Peter Koch Larsen <peter.koch.larsen@gmail.com>
Date: Tue, 16 Jun 2015 01:01:11 +0200
Raw View
You certainly have started a party!
To me it seems as if this thread has begun the construction of a Swiss
army knife, but a very heavy one.
Some of these suggestions would be better served by a tool such as
boost::spirit. A generic numeric parser should be fast and lightweight,
simply supporting conversion and appropriate error handling (which for me
would be an exception).
/Peter
On Mon, Jun 15, 2015 at 1:10 PM, Olaf van der Spek <olafvdspek@gmail.com> wrote:
> 2015-06-13 16:17 GMT+02:00 Miro Knejp <miro.knejp@gmail.com>:
>> There should definitely be a way to disable the "+" prefix and treat it as
>> error. Not all data or interchange formats allow a leading plus and in some
>> aviation protocols I know (but can't talk about) anything that is not set in
>> stone in the specification is an error. A number parser where the plus sign
>> cannot be disabled is useless in such situations as it would not pass
>> certification. Sure, I could check for the presence of "+" manually first,
>> but why do the same work twice?
>
> What about minus? Leading zeros?
>
.
Author: Miro Knejp <miro.knejp@gmail.com>
Date: Tue, 16 Jun 2015 05:10:32 +0200
Raw View
On 15.06.2015 at 17:41, Olaf van der Spek wrote:
> 2015-06-15 16:59 GMT+02:00 Miro Knejp <miro.knejp@gmail.com>:
>>> What about minus? Leading zeros?
>>> Disallowing things is certainly necessary in some cases but I'm afraid
>>> it makes the proposal more complex.
>> In fact it's the opposite. A proposal that does nothing but raw number
>> parsing without any prefix support or other extras is probably the least
>> complexity you can have. Everything else can be added on top of that. I can
>> imagine having proposals that are layered on top of others.
> A proposal / parser not supporting signed numbers is not complete enough IMO.
>
> That said, what about a parse_unsigned variant that doesn't parse
> signs (but is still defined for signed types too).
I didn't say don't support signed types, I said don't force in extras
which aren't necessary for the foundation. Always make those optional.
>>> Auto-detection of base when base = 0 is only a small part of this
>>> function.
>>>
>> It's not just about bases, it's the assumption that everyone wants or needs
>> parsing functions that are centered around the C syntax for numeric
>> literals.
> What's C-specific besides bases?
>
Prefixes, separators, suffixes. '+' and '-' are technically not part of
the literal but not accepting '-' for negative types is nonsense. There
is no requirement for accepting '+' in either case, it's redundant and
not correct in all use cases.
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Tue, 16 Jun 2015 08:22:52 +0200
Raw View
2015-06-16 5:10 GMT+02:00 Miro Knejp <miro.knejp@gmail.com>:
>> What's C-specific besides bases?
>>
> Prefixes, separators, suffixes. '+' and '-' are technically not part of the
> literal but not accepting '-' for negative types is nonsense. There is no
> requirement for accepting '+' in either case, it's redundant and not correct
> in all use cases.
Prefixes aren't allowed when base != 0
What separators? The dot etc for floating point numbers?
What suffixes?
> I didn't say don't support signed types, I said don't force in extras which aren't necessary for the foundation. Always make those optional.
Got a proposal for how to specify what is and what is not allowed?
--
Olaf
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Tue, 16 Jun 2015 17:47:55 -0700 (PDT)
Raw View
On Tuesday, June 16, 2015 at 2:22:55 AM UTC-4, Olaf van der Spek wrote:
> 2015-06-16 5:10 GMT+02:00 Miro Knejp <miro....@gmail.com>:
> >> What's C-specific besides bases?
> >>
> > Prefixes, separators, suffixes. '+' and '-' are technically not part of
> the
> > literal but not accepting '-' for negative types is nonsense. There is
> no
> > requirement for accepting '+' in either case, it's redundant and not
> correct
> > in all use cases.
>
> Prefixes aren't allowed when base != 0
>
FYI they are allowed for strtol(). Also see my earlier example about
prefixes again.
http://en.cppreference.com/w/cpp/string/byte/strtol
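A small check of the strtol behaviour referred to here: with an explicit base of 16, an optional "0x" or "0X" prefix is still consumed, as are leading whitespace and a sign. The helper strtol_base16 and its result struct are made up for this illustration.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper type: the parsed value plus the number of
// characters std::strtol consumed.
struct strtol_result {
    long value;
    std::size_t consumed;
};

// Calls std::strtol with a fixed base of 16 (not the auto-detecting base 0).
inline strtol_result strtol_base16(const char* text)
{
    char* end = nullptr;
    const long value = std::strtol(text, &end, 16);
    return {value, static_cast<std::size_t>(end - text)};
}
```

Both strtol_base16("1A") and strtol_base16("0x1A") produce the value 26, consuming 2 and 4 characters respectively.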
> What separators? The dot etc for floating point numbers?
> What suffixes?
>
>
> > I didn't say don't support signed types, I said don't force in extras
> which aren't necessary for the foundation. Always make those optional.
>
> Got a proposal for how to specify what is and what is not allowed?
>
>
So let's enumerate all of the variations in input for integers.
Required rules:
- The digits for the absolute integer value.
- A leading '-' for signed inputs.
Useful optional rules:
- A base prefix (e.g. 0x, 0, 0b), required for auto-detection and optionally
allowed for a fixed base.
- A leading '+', which is redundant.
- Digit separators (, or . or any character and length). This
can be useful because you basically have to reimplement your own parser
from scratch if you want to do this. Digit separators are also allowed in
C++ literals, so there is a compatibility argument.
Questionable rules:
- A leading '-' for unsigned values, causing a parse into signed followed
by a conversion to unsigned.
- 'true' and 'false' for auto-detected bool.
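The "leading '-' for unsigned values" rule is what std::strtoul already does today: the parsed value is negated in the unsigned return type. For contrast, the sketch below also shows std::from_chars (C++17, later than this thread), which accepts no sign for unsigned types. The wrapper names via_strtoul and via_from_chars are hypothetical.

```cpp
#include <charconv>
#include <cstdlib>
#include <optional>
#include <string_view>
#include <system_error>

// std::strtoul accepts a leading '-' and negates the result in the
// unsigned type, so "-1" becomes the largest unsigned long.
inline unsigned long via_strtoul(const char* text)
{
    return std::strtoul(text, nullptr, 10);
}

// std::from_chars recognizes '-' only for signed types, so "-1" is an
// error when parsing into an unsigned type.
inline std::optional<unsigned long> via_from_chars(std::string_view text)
{
    unsigned long value = 0;
    const char* last = text.data() + text.size();
    const auto result = std::from_chars(text.data(), last, value);
    if (result.ec != std::errc{} || result.ptr != last) return std::nullopt;
    return value;
}
```

Whether silently wrapping "-1" to the maximum value is ever what a caller wants is exactly why the rule is listed as questionable.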
I agree with the others who would like to use this function with only the
necessary rules in order to use it effectively in a high-level strict
parsing protocol. These cases are rare, however, and I think by default
doing the most obvious thing is best. Just like with std::atomic, the
default is the slowest but also the easiest to understand and use quickly.
I think we should also make base = 0 the default instead of base = 10.
Here is one interface for this. I'm still using string_view here to not get
back into the discussion of the return / out-param issue.
using prefix_set = std::bitset<36>;
using digit_separator_char_set = std::bitset<0x100>;

static constexpr auto auto_base = 0;
static constexpr auto all_prefixes = prefix_set().flip();
static constexpr auto no_prefixes = prefix_set();
static constexpr auto all_digit_separators = digit_separator_char_set().flip();
static constexpr auto no_digit_separators = digit_separator_char_set();

// Defaults: no digit separators, auto-detected base, all prefixes enabled,
// leading plus enabled.

// Overload taking a set of allowed digit separator characters.
template <typename Integral>
error_code parse(string_view& tail, string_view str, int base,
                 digit_separator_char_set digit_separators,
                 prefix_set prefixes_enabled = all_prefixes,
                 bool leading_plus_enabled = true);

// Overload taking a single digit separator character.
template <typename Integral>
error_code parse(string_view& tail, string_view str, int base,
                 char digit_separator,
                 prefix_set prefixes_enabled = all_prefixes,
                 bool leading_plus_enabled = true);

// Overload without digit separators; forwards to the first overload.
template <typename Integral>
inline error_code parse(string_view& tail, string_view str, int base = 0,
                        prefix_set prefixes_enabled = all_prefixes,
                        bool leading_plus_enabled = true) {
    return parse<Integral>(tail, str, base, no_digit_separators,
                           prefixes_enabled, leading_plus_enabled);
}
This is a lot of function arguments, but I imagine the implementation will
dispatch to different code paths depending on which options are turned
on, which means specific combinations of defaults can be served by
optimized overloads and inline dispatch. For
example, using SIMD on an array of digits may not be possible or efficient
if digit separators are supported.
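To make the cost of digit separators concrete, here is a minimal sketch of a decimal parser driven by a separator character set like the one above. It covers only that one option (no base, prefix or sign handling), and parse_with_separators, with its rule that a separator must sit between two digits, is an assumption for illustration rather than part of the proposed interface.

```cpp
#include <bitset>
#include <cstdint>
#include <limits>
#include <optional>
#include <string_view>

using digit_separator_char_set = std::bitset<0x100>;

// Parses all of `s` as an unsigned base-10 number. Characters contained in
// `separators` are skipped, but only when they sit between two digits.
// Returns std::nullopt on empty input, misplaced separators, other
// characters, or overflow.
inline std::optional<std::uint32_t> parse_with_separators(
    std::string_view s, const digit_separator_char_set& separators)
{
    std::uint64_t value = 0;
    bool prev_was_digit = false;
    for (char c : s) {
        if (c >= '0' && c <= '9') {
            value = value * 10 + static_cast<std::uint64_t>(c - '0');
            if (value > std::numeric_limits<std::uint32_t>::max()) return std::nullopt;
            prev_was_digit = true;
        } else if (prev_was_digit && separators.test(static_cast<unsigned char>(c))) {
            prev_was_digit = false;  // the next character must be a digit
        } else {
            return std::nullopt;
        }
    }
    // Rejects empty input and a trailing separator.
    if (!prev_was_digit) return std::nullopt;
    return static_cast<std::uint32_t>(value);
}
```

The per-character branch on the separator set is the kind of work a separator-free fast path could skip entirely, which is the motivation for dispatching on the options.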
------=_Part_1224_1329894501.1434502075920
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
<div dir=3D"ltr"><div><br></div><div>On Tuesday, June 16, 2015 at 2:22:55 A=
M UTC-4, Olaf van der Spek wrote:<br></div><blockquote class=3D"gmail_quote=
" style=3D"margin: 0;margin-left: 0.8ex;border-left: 1px #ccc solid;padding=
-left: 1ex;">2015-06-16 5:10 GMT+02:00 Miro Knejp <<a href=3D"javascript=
:" target=3D"_blank" gdf-obfuscated-mailto=3D"ofboCiPePAEJ" rel=3D"nofollow=
" onmousedown=3D"this.href=3D'javascript:';return true;" onclick=3D"this.hr=
ef=3D'javascript:';return true;">miro....@gmail.com</a>>:
<br>>> What's C-specific besides bases?
<br>>>
<br>> Prefixes, separators, suffixes. '+' and '-' are technically not pa=
rt of the
<br>> literal but not accepting '-' for negative types is nonsense. Ther=
e is no
<br>> requirement for accepting '+' in either case, it's redundant and n=
ot correct
<br>> in all use cases.
<br>
<br>Prefixes aren't allowed when base !=3D 0
<br></blockquote><div><br></div><div>FYI they are allowed for strtol(). Als=
o see my earlier example about prefixes again.</div><div><br></div><div>htt=
p://en.cppreference.com/w/cpp/string/byte/strtol<br></div><div> </div>=
<blockquote class=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8ex;bor=
der-left: 1px #ccc solid;padding-left: 1ex;">What separators? The dot etc f=
or floating point numbers?
<br>What suffixes?
<br>
<br>
<br>> I didn't say don't support signed types, I said don't force in ext=
ras which aren't necessary for the foundation. Always make those optional.
<br>
<br>Got a proposal for how to specify what is and what is not allowed?
<br><br></blockquote><div><br></div><div>So lets enumerate all of the varia=
tions in input for integers. </div><div><br></div><div>Required rules:=
</div><div>-The digits for the absolute integer value</div><div>-A leading =
'-' for signed inputs.</div><div> </div><div>Useful optional rules:</d=
iv><div>- A base prefix (e.g 0x, 0, 0b), required for auto-detection and op=
tionally allowed for a fixed base.</div><div>- A leading '+' which is redun=
dant</div><div><div>- Supporting digit separators (, or . or any character =
and length). This can be useful because you basically have to reimplement y=
our own parser from scratch if you want to do this. Digit separators are al=
so allowed in C++ literals so there is a compatibility argument.</div></div=
><div><br></div><div><br></div><div>Questionable rules:</div><div>- A leadi=
ng '-' for unsigned values, causing a parse into signed followed by a conve=
rsion to unsigned.</div><div>- 'true' and 'false' for auto detected bool.&n=
bsp;<br></div><div><br></div><div>I agree with the others who would like to=
use this function with only the necessary rules in order to use it effecti=
vely in a high level strict parsing protocol. These cases are rare however,=
and I think by default doing the most obvious thing is best. Just like wit=
h std::atomic, the default is the slowest but also the easiest to understan=
d and use quickly. </div><div><br></div><div>I think we should also ma=
ke base =3D 0 by default instead of base=3D10.<br></div><div><br></div><div=
>Here is one interface for this. I'm still using string_view here to not ge=
t back into the discussion of the return / out-param issue.</div><div><br><=
/div><div><div class=3D"prettyprint" style=3D"border: 1px solid rgb(187, 18=
7, 187); word-wrap: break-word; background-color: rgb(250, 250, 250);"><cod=
e class=3D"prettyprint"><div class=3D"subprettyprint"><span style=3D"color:=
#000;" class=3D"styled-by-prettify">using prefix_set =3D std::bitset<36=
>;<br></span><span style=3D"color: #008;" class=3D"styled-by-prettify">u=
sing</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> digit=
_separator_char_set </span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">=3D</span><span style=3D"color: #000;" class=3D"styled-by-prettif=
y"> std</span><span style=3D"color: #660;" class=3D"styled-by-prettify">::<=
/span><span style=3D"color: #000;" class=3D"styled-by-prettify">bitset</spa=
n><span style=3D"color: #660;" class=3D"styled-by-prettify"><</span><spa=
n style=3D"color: #066;" class=3D"styled-by-prettify">0x100</span><span sty=
le=3D"color: #660;" class=3D"styled-by-prettify">>;<br></span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"><br>static constexpr auto a=
uto_base =3D 0;<br>static constexpr auto all_prefixes =3D prefix_set().flip=
();<br>static constexpr auto no_prefixes =3D prefix_set();<br>static conste=
xpr auto all_digit_separators =3D digit_separator_char_set().flio();<br>sta=
tic constexpr auto no_digit_separators =3D digit_separator_char_set();<br><=
br>//Default is allow no digit separators, auto detect base, all prefixes e=
nabled, leading plus enabled, <br></span><span style=3D"color: #008;" =
class=3D"styled-by-prettify">template</span><span style=3D"color: #000;" cl=
ass=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=3D"st=
yled-by-prettify"><</span><span style=3D"color: #008;" class=3D"styled-b=
y-prettify">typename</span><span style=3D"color: #000;" class=3D"styled-by-=
prettify"> </span><span style=3D"color: #606;" class=3D"styled-by-prettify"=
>Integral</span><span style=3D"color: #660;" class=3D"styled-by-prettify">&=
gt;</span><span style=3D"color: #000;" class=3D"styled-by-prettify"><br>err=
or_code parse</span><span style=3D"color: #660;" class=3D"styled-by-prettif=
y">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">string=
_view</span><span style=3D"color: #660;" class=3D"styled-by-prettify">&=
</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> tail</spa=
n><span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span s=
tyle=3D"color: #000;" class=3D"styled-by-prettify"> string_view str</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> </span><span class=3D=
"styled-by-prettify" style=3D"font-family: Arial, Helvetica, sans-serif; co=
lor: rgb(0, 0, 136);">int</span><span class=3D"styled-by-prettify" style=3D=
"font-family: Arial, Helvetica, sans-serif; color: rgb(0, 0, 0);"> </span><=
span class=3D"styled-by-prettify" style=3D"font-family: Arial, Helvetica, s=
ans-serif; color: rgb(0, 0, 136);">base</span><span class=3D"styled-by-pret=
tify" style=3D"font-family: Arial, Helvetica, sans-serif; color: rgb(102, 1=
02, 0);">,</span><span class=3D"styled-by-prettify" style=3D"font-family: A=
rial, Helvetica, sans-serif; color: rgb(0, 0, 0);"> </span><span class=
=3D"styled-by-prettify" style=3D"font-family: Arial, Helvetica, sans-serif;=
color: rgb(0, 0, 0);">digit_separator_set digit_separators</span><font col=
or=3D"#000000" style=3D"font-family: Arial, Helvetica, sans-serif;"><span c=
lass=3D"styled-by-prettify" style=3D"color: rgb(102, 102, 0);">, </span></f=
ont><span class=3D"styled-by-prettify" style=3D"font-family: Arial, Helveti=
ca, sans-serif; color: rgb(0, 0, 0);">prefix_set prefixes_enabled</span><sp=
an class=3D"styled-by-prettify" style=3D"font-family: Arial, Helvetica, san=
s-serif; color: rgb(102, 102, 0);">=3D</span><span class=3D"styled-by-prett=
ify" style=3D"font-family: Arial, Helvetica, sans-serif; color: rgb(0, 0, 0=
);">all_prefixes</span><span class=3D"styled-by-prettify" style=3D"font-fam=
ily: Arial, Helvetica, sans-serif; color: rgb(102, 102, 0);">,</span><span =
class=3D"styled-by-prettify" style=3D"font-family: Arial, Helvetica, sans-s=
erif; color: rgb(0, 0, 0);"> </span><span class=3D"styled-by-prettify" styl=
e=3D"font-family: Arial, Helvetica, sans-serif; color: rgb(0, 0, 136);">boo=
l</span><span class=3D"styled-by-prettify" style=3D"font-family: Arial, Hel=
vetica, sans-serif; color: rgb(0, 0, 0);"> leading_plus_enabled </span><spa=
n class=3D"styled-by-prettify" style=3D"font-family: Arial, Helvetica, sans=
-serif; color: rgb(102, 102, 0);">=3D</span><span class=3D"styled-by-pretti=
fy" style=3D"font-family: Arial, Helvetica, sans-serif; color: rgb(0, 0, 0)=
;"> </span><span class=3D"styled-by-prettify" style=3D"font-family: Arial, =
Helvetica, sans-serif; color: rgb(0, 0, 136);">true</span><font color=3D"#0=
00000" style=3D"font-family: Arial, Helvetica, sans-serif;"><span style=3D"=
color: #660;" class=3D"styled-by-prettify">}</span></font><font color=3D"#0=
00000" style=3D"font-family: Arial, Helvetica, sans-serif;"><span style=3D"=
color: #660;" class=3D"styled-by-prettify">);</span></font></div><div class=
=3D"subprettyprint"><span style=3D"color: #008;" class=3D"styled-by-prettif=
y">template</span><span style=3D"color: #000;" class=3D"styled-by-prettify"=
> </span><span style=3D"color: #660;" class=3D"styled-by-prettify"><</sp=
an><span style=3D"color: #008;" class=3D"styled-by-prettify">typename</span=
><span style=3D"color: #000;" class=3D"styled-by-prettify"> </span><span st=
yle=3D"color: #606;" class=3D"styled-by-prettify">Integral</span><span styl=
e=3D"color: #660;" class=3D"styled-by-prettify">></span><span style=3D"c=
olor: #000;" class=3D"styled-by-prettify"><br>error_code parse</span><span =
style=3D"color: #660;" class=3D"styled-by-prettify">(</span><span style=3D"=
color: #000;" class=3D"styled-by-prettify">string_view</span><span style=3D=
"color: #660;" class=3D"styled-by-prettify">&</span><span style=3D"colo=
r: #000;" class=3D"styled-by-prettify"> tail</span><span style=3D"color: #6=
60;" class=3D"styled-by-prettify">,</span><span style=3D"color: #000;" clas=
s=3D"styled-by-prettify"> string_view str</span><span style=3D"color: #660;=
" class=3D"styled-by-prettify">,</span><span style=3D"color: #000;" class=
=3D"styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"style=
d-by-prettify">int</span><span style=3D"color: #000;" class=3D"styled-by-pr=
ettify"> </span><span style=3D"color: #008;" class=3D"styled-by-prettify">b=
ase</span><span style=3D"color: #660;" class=3D"styled-by-prettify">, =
</span><span class=3D"styled-by-prettify" style=3D"font-family: Arial, Helv=
etica, sans-serif; color: rgb(0, 0, 136);">char</span><span class=3D"styled=
-by-prettify" style=3D"font-family: Arial, Helvetica, sans-serif; color: rg=
b(0, 0, 0);"> digit_separator</span><span class=3D"styled-by-prettify"=
style=3D"font-family: Arial, Helvetica, sans-serif; color: rgb(0, 136, 0);=
">, </span><span class=3D"styled-by-prettify" style=3D"font-family: Ar=
ial, Helvetica, sans-serif; color: rgb(0, 0, 0);">prefix_set prefixes_enabl=
ed</span><span class=3D"styled-by-prettify" style=3D"font-family: Arial, He=
lvetica, sans-serif; color: rgb(102, 102, 0);">=3D</span><span class=3D"sty=
led-by-prettify" style=3D"font-family: Arial, Helvetica, sans-serif; color:=
rgb(0, 0, 0);">all_prefixes</span><span class=3D"styled-by-prettify" style=
=3D"font-family: Arial, Helvetica, sans-serif; color: rgb(102, 102, 0);">,<=
/span><span class=3D"styled-by-prettify" style=3D"font-family: Arial, Helve=
tica, sans-serif; color: rgb(0, 0, 0);"> </span><span class=3D"styled-by-pr=
ettify" style=3D"font-family: Arial, Helvetica, sans-serif; color: rgb(0, 0=
, 136);">bool</span><span class=3D"styled-by-prettify" style=3D"font-family=
: Arial, Helvetica, sans-serif; color: rgb(0, 0, 0);"> leading_plus_enabled=
</span><span class=3D"styled-by-prettify" style=3D"font-family: Arial, Hel=
vetica, sans-serif; color: rgb(102, 102, 0);">=3D</span><span class=3D"styl=
ed-by-prettify" style=3D"font-family: Arial, Helvetica, sans-serif; color: =
rgb(0, 0, 0);"> </span><span class=3D"styled-by-prettify" style=3D"font-fam=
ily: Arial, Helvetica, sans-serif; color: rgb(0, 0, 136);">true</span><font=
color=3D"#000000" style=3D"font-family: Arial, Helvetica, sans-serif;"><sp=
an style=3D"color: #660;" class=3D"styled-by-prettify">);<br></span></font>=
<span class=3D"styled-by-prettify" style=3D"font-family: Arial, Helvetica, =
sans-serif; color: rgb(0, 0, 136);">template</span><span class=3D"styled-by=
-prettify" style=3D"font-family: Arial, Helvetica, sans-serif; color: rgb(0=
, 0, 0);"> </span><span class=3D"styled-by-prettify" style=3D"font-fam=
ily: Arial, Helvetica, sans-serif; color: rgb(102, 102, 0);"><</span><sp=
an class=3D"styled-by-prettify" style=3D"font-family: Arial, Helvetica, san=
s-serif; color: rgb(0, 0, 136);">typename</span><span class=3D"styled-by-pr=
ettify" style=3D"font-family: Arial, Helvetica, sans-serif; color: rgb(0, 0=
, 0);"> </span><span class=3D"styled-by-prettify" style=3D"font-family=
: Arial, Helvetica, sans-serif; color: rgb(102, 0, 102);">Integral</span><s=
pan class=3D"styled-by-prettify" style=3D"font-family: Arial, Helvetica, sa=
ns-serif; color: rgb(102, 102, 0);">></span><font color=3D"#000000" styl=
e=3D"font-family: Arial, Helvetica, sans-serif;"><span style=3D"color: #660=
;" class=3D"styled-by-prettify"><br></span></font></div><span class=3D"styl=
ed-by-prettify" style=3D"color: rgb(0, 0, 0);">inline error_code parse</spa=
n><span class=3D"styled-by-prettify" style=3D"color: rgb(102, 102, 0);">(</=
span><span class=3D"styled-by-prettify" style=3D"color: rgb(0, 0, 0);">stri=
ng_view</span><span class=3D"styled-by-prettify" style=3D"color: rgb(102, 1=
02, 0);">&</span><span class=3D"styled-by-prettify" style=3D"color: rgb=
(0, 0, 0);"> tail</span><span class=3D"styled-by-prettify" style=3D"co=
lor: rgb(102, 102, 0);">,</span><span class=3D"styled-by-prettify" style=3D=
"color: rgb(0, 0, 0);"> string_view str</span><span class=3D"styled-by=
-prettify" style=3D"color: rgb(102, 102, 0);">,</span><span class=3D"styled=
-by-prettify" style=3D"color: rgb(0, 0, 0);"> </span><span class=3D"st=
yled-by-prettify" style=3D"color: rgb(0, 0, 136);">int</span><span class=3D=
"styled-by-prettify" style=3D"color: rgb(0, 0, 0);"> </span><span clas=
s=3D"styled-by-prettify" style=3D"color: rgb(0, 0, 136);">base=3D0</span><s=
pan class=3D"styled-by-prettify" style=3D"color: rgb(102, 102, 0);">,</span=
><span class=3D"styled-by-prettify" style=3D"font-family: Arial, Helvetica,=
sans-serif; color: rgb(0, 136, 0);"> </span><span class=3D"styled-by-=
prettify" style=3D"font-family: Arial, Helvetica, sans-serif; color: rgb(0,=
0, 0);">prefix_set prefixes_enabled</span><span class=3D"styled-by-prettif=
y" style=3D"font-family: Arial, Helvetica, sans-serif; color: rgb(102, 102,=
0);">=3D</span><span class=3D"styled-by-prettify" style=3D"font-family: Ar=
ial, Helvetica, sans-serif; color: rgb(0, 0, 0);">all_prefixes</span><span =
class=3D"styled-by-prettify" style=3D"font-family: Arial, Helvetica, sans-s=
erif; color: rgb(102, 102, 0);">,</span><span class=3D"styled-by-prettify" =
style=3D"font-family: Arial, Helvetica, sans-serif; color: rgb(0, 0, 0);">&=
nbsp;</span><span class=3D"styled-by-prettify" style=3D"font-family: Arial,=
Helvetica, sans-serif; color: rgb(0, 0, 136);">bool</span><span class=3D"s=
tyled-by-prettify" style=3D"font-family: Arial, Helvetica, sans-serif; colo=
r: rgb(0, 0, 0);"> leading_plus_enabled </span><span class=3D"sty=
led-by-prettify" style=3D"font-family: Arial, Helvetica, sans-serif; color:=
rgb(102, 102, 0);">=3D</span><span class=3D"styled-by-prettify" style=3D"f=
ont-family: Arial, Helvetica, sans-serif; color: rgb(0, 0, 0);"> </spa=
n><span class=3D"styled-by-prettify" style=3D"font-family: Arial, Helvetica=
, sans-serif; color: rgb(0, 0, 136);">true</span><font color=3D"#000000" st=
yle=3D"font-family: Arial, Helvetica, sans-serif;"><span class=3D"styled-by=
-prettify" style=3D"color: rgb(102, 102, 0);">) {<br> return parse(ta=
il, str, base, no_digit_separators, prefixes_enabled, leading_plus_enabled)=
;<br></span></font><div class=3D"subprettyprint"><font color=3D"#000000"><s=
pan style=3D"color: #000;" class=3D"styled-by-prettify">} </span></fon=
t></div></code></div><div><br></div>This is a lot of function arguments, bu=
t I imagine this will be implemented using different implementations based =
on whether or not options are turned on, which means implementations of spe=
cific combinations of defaults can be replaced by optimized overloads and i=
nline dispatch. For example, using SIMD on an array of digits may not be po=
ssible or efficient if digit separators are supported.<br><br></div></div>
------=_Part_1224_1329894501.1434502075920--
------=_Part_1223_796361515.1434502075920--
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Wed, 17 Jun 2015 13:25:09 +0200
Raw View
2015-06-17 2:47 GMT+02:00 Matthew Fioravante <fmatthew5876@gmail.com>:
>> Prefixes aren't allowed when base != 0
>
>
> FYI they are allowed for strtol(). Also see my earlier example about
> prefixes again.
I should have said when base isn't 0 or 16.
When base is 10 prefixes aren't allowed (AFAIK).
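The strtol() behaviour referred to here can be checked with a small sketch. `run_strtol`, `strtol_result` and `prefix_examples` are hypothetical names used only for illustration; they are not part of any proposal in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper (not part of any proposal): the value strtol returns
// plus the number of characters it consumed.
struct strtol_result {
    long value;
    std::size_t consumed;
};

inline strtol_result run_strtol(const char* s, int base) {
    char* end = nullptr;
    const long value = std::strtol(s, &end, base);
    return {value, static_cast<std::size_t>(end - s)};
}

inline void prefix_examples() {
    // base 16: the "0x" prefix is accepted and skipped.
    assert(run_strtol("0x1F", 16).value == 31);
    // base 0: the prefix selects base 16.
    assert(run_strtol("0x1F", 0).value == 31);
    // base 10: only the leading "0" is parsed; "x1F" is left unconsumed.
    assert(run_strtol("0x1F", 10).value == 0);
    assert(run_strtol("0x1F", 10).consumed == 1);
}
```

So with base 10 the prefix is not rejected as an error; parsing simply stops after the leading zero, which the caller only notices by looking at the end pointer.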
> I think we should also make base = 0 by default instead of base=10.
Why?
> This is a lot of function arguments, but I imagine this will be implemented
> using different implementations based on whether or not options are turned
> on, which means implementations of specific combinations of defaults can
> be replaced by optimized overloads and inline dispatch. For example,
> using SIMD on an array of digits may not be possible or efficient
> if digit separators are supported.
Having two or three bool arguments is unreadable.
I really think we should keep the proposal simple and worry about
disallowing stuff later.
Though a parse_unsigned variant would allow one to build a strict
parser oneself, and still be quite simple.
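A rough sketch of that idea, assuming a digits-only, base-10 `parse_unsigned` that must consume the whole input. The signatures, the use of std::optional, and the name `parse_strict_signed` are assumptions made for illustration, not an agreed interface.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>
#include <string_view>

// Hypothetical: accepts decimal digits only (no sign, no spaces, no prefix)
// and fails unless the whole input is consumed.
inline std::optional<std::uint64_t> parse_unsigned(std::string_view s) {
    if (s.empty()) return std::nullopt;
    const std::uint64_t max = std::numeric_limits<std::uint64_t>::max();
    std::uint64_t value = 0;
    for (char c : s) {
        if (c < '0' || c > '9') return std::nullopt;
        const std::uint64_t digit = static_cast<std::uint64_t>(c - '0');
        if (value > (max - digit) / 10) return std::nullopt;  // would overflow
        value = value * 10 + digit;
    }
    return value;
}

// Hypothetical strict signed parser built on top: an optional '-' and
// nothing else in front of the digits.
inline std::optional<std::int64_t> parse_strict_signed(std::string_view s) {
    const bool negative = !s.empty() && s.front() == '-';
    if (negative) s.remove_prefix(1);
    const auto magnitude = parse_unsigned(s);
    if (!magnitude) return std::nullopt;
    const std::int64_t int_max = std::numeric_limits<std::int64_t>::max();
    const std::int64_t int_min = std::numeric_limits<std::int64_t>::min();
    const std::uint64_t pos_limit = static_cast<std::uint64_t>(int_max);
    if (!negative) {
        if (*magnitude > pos_limit) return std::nullopt;
        return static_cast<std::int64_t>(*magnitude);
    }
    if (*magnitude == pos_limit + 1) return int_min;  // |INT64_MIN|
    if (*magnitude > pos_limit) return std::nullopt;
    return -static_cast<std::int64_t>(*magnitude);
}

inline void strict_examples() {
    assert(parse_strict_signed("-42") == -42);
    assert(!parse_strict_signed("+42"));  // '+' is not accepted
    assert(!parse_strict_signed(" 42"));  // no whitespace skipping
}
```

Anything looser (a leading '+', whitespace, prefixes) would then be layered on by the caller rather than switched off through flags.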
--
Olaf
.
Author: Miro Knejp <miro.knejp@gmail.com>
Date: Wed, 17 Jun 2015 16:31:38 +0200
Raw View
This is a multi-part message in MIME format.
--------------080907090201030006090807
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: quoted-printable
On 17.06.2015 at 02:47, Matthew Fioravante wrote:
>
> On Tuesday, June 16, 2015 at 2:22:55 AM UTC-4, Olaf van der Spek wrote:
>
> 2015-06-16 5:10 GMT+02:00 Miro Knejp <miro....@gmail.com>:
> >> What's C-specific besides bases?
> >>
> > Prefixes, separators, suffixes. '+' and '-' are technically not
> part of the
> > literal but not accepting '-' for negative types is nonsense.
> There is no
> > requirement for accepting '+' in either case, it's redundant and
> not correct
> > in all use cases.
>
> Prefixes aren't allowed when base != 0
>
>
> FYI they are allowed for strtol(). Also see my earlier example about
> prefixes again.
>
> http://en.cppreference.com/w/cpp/string/byte/strtol
>
> What separators? The dot etc for floating point numbers?
> What suffixes?
>
>
> > I didn't say don't support signed types, I said don't force in
> extras which aren't necessary for the foundation. Always make
> those optional.
>
> Got a proposal for how to specify what is and what is not allowed?
>
>
> So let's enumerate all of the variations in input for integers.
>
> Required rules:
> - The digits for the absolute integer value
> - A leading '-' for signed inputs.
> Useful optional rules:
> - A base prefix (e.g. 0x, 0, 0b), required for auto-detection and
> optionally allowed for a fixed base.
> - A leading '+', which is redundant
> - Supporting digit separators (, or . or any character and length).
> This can be useful because you basically have to reimplement your own
> parser from scratch if you want to do this. Digit separators are also
> allowed in C++ literals so there is a compatibility argument.
>
This seems reasonable.
>
> Questionable rules:
> - A leading '-' for unsigned values, causing a parse into signed
> followed by a conversion to unsigned.
> - 'true' and 'false' for auto detected bool.
I don't think this is a good idea. Accepting negative numbers for unsigned
types just seems very surprising. And I would say that 0/1 should
suffice for bool in the basic version. Maybe go as far as adding two
overloads, each accepting two chars or two strings for the true/false case.
>
> I agree with the others who would like to use this function with only
> the necessary rules in order to use it effectively in a high level
> strict parsing protocol. These cases are rare however,
How do you know it is rare? Do you have numbers to back this claim? We
are talking about an international standard that is used by virtually
every industry in existence in some way. “Nobody knows what most C++
programmers do.”
> and I think by default doing the most obvious thing is best. Just like
> with std::atomic, the default is the slowest but also the easiest to
> understand and use quickly.
>
> I think we should also make base = 0 by default instead of base=10.
This is a bad idea. Interpreting a number differently because it has a
leading zero is a surprise that should not be the default; one should
have to enable the octal prefix explicitly.
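The surprise is easy to reproduce with the existing std::stoi, which forwards the base to strtol. The wrapper names below are hypothetical and exist only to label the two behaviours.

```cpp
#include <cassert>
#include <string>

// Hypothetical wrappers, only to name the two behaviours being compared.
inline int parse_auto_base(const std::string& s) { return std::stoi(s, nullptr, 0); }
inline int parse_base_10(const std::string& s) { return std::stoi(s, nullptr, 10); }

inline void leading_zero_examples() {
    assert(parse_base_10("010") == 10);
    assert(parse_auto_base("010") == 8);  // the leading zero selects octal
    assert(parse_base_10("09") == 9);
    assert(parse_auto_base("09") == 0);   // '9' is not an octal digit; only "0" is parsed
}
```

Zero-padded fields such as "09" in a date are a typical input where auto-detection silently yields a different value.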
>
> Here is one interface for this. I'm still using string_view here to
> not get back into the discussion of the return / out-param issue.
>
> |
> using prefix_set = std::bitset<36>;
> using digit_separator_char_set = std::bitset<0x100>;
>
> static constexpr auto auto_base = 0;
> // const rather than constexpr: std::bitset::flip() is not constexpr before C++23
> static const auto all_prefixes = prefix_set().flip();
> static constexpr auto no_prefixes = prefix_set();
> static const auto all_digit_separators = digit_separator_char_set().flip();
> static constexpr auto no_digit_separators = digit_separator_char_set();
>
> // Default: no digit separators, auto-detect base, all prefixes enabled,
> // leading plus enabled.
> template <typename Integral>
> error_code parse(string_view& tail, string_view str, int base,
>                  digit_separator_char_set digit_separators,
>                  prefix_set prefixes_enabled = all_prefixes,
>                  bool leading_plus_enabled = true);
> template <typename Integral>
> error_code parse(string_view& tail, string_view str, int base,
>                  char digit_separator,
>                  prefix_set prefixes_enabled = all_prefixes,
>                  bool leading_plus_enabled = true);
> template <typename Integral>
> inline error_code parse(string_view& tail, string_view str, int base = 0,
>                         prefix_set prefixes_enabled = all_prefixes,
>                         bool leading_plus_enabled = true) {
>     // Integral cannot be deduced from the arguments, so name it explicitly.
>     return parse<Integral>(tail, str, base, no_digit_separators,
>                            prefixes_enabled, leading_plus_enabled);
> }
> |
>
Why 36 prefix bits? There are only 3 available choices: 0(x|X), 0, 0(b|B).
I wouldn't add multiple digit separator support. You should know what
character you accept as digit separator. Remember these calls are
locale-independent to be fast and simple. You most likely know the
format of your input and I doubt it has mixed digit separators.
Ignoring the input/output arguments, this should do the job:
... parse(..., int radix) // The default accepts '+' but no separators
... parse(..., int radix, no_plus_prefix_t)
... parse(..., int radix, char separator)
... parse(..., int radix, no_plus_prefix_t, char separator)
One *could* squeeze the plus prefix option into the radix param since
its only valid values are in [0;36] and there are plenty of bits left.
Give it a custom type and you have yourself a pretty fluent interface:
... parse(..., parse_options_t); // parse_options_t is a concrete type (no int or enum)
parse(..., radix(10) | no_plus_prefix | digit_separator(','))
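One way the fluent form could look, as a sketch only. The field layout of `parse_options_t`, the optional<long long> return type, and the rule that a separator may only appear between two digits are all assumptions; the bit-packing into the radix value mentioned above is replaced by a plain struct for readability.

```cpp
#include <cassert>
#include <limits>
#include <optional>
#include <string_view>

// Hypothetical option fragments.
struct radix { explicit constexpr radix(int v) : value(v) {} int value; };
struct digit_separator { explicit constexpr digit_separator(char c) : value(c) {} char value; };
struct no_plus_prefix_t {};
constexpr no_plus_prefix_t no_plus_prefix{};

// Hypothetical concrete options type; the defaults accept '+' and no separator.
struct parse_options_t {
    int base = 10;
    bool allow_plus = true;
    char separator = '\0';  // '\0' means "no separator"
    constexpr parse_options_t() = default;
    constexpr parse_options_t(radix r) : base(r.value) {}
    constexpr parse_options_t(no_plus_prefix_t) : allow_plus(false) {}
    constexpr parse_options_t(digit_separator s) : separator(s.value) {}
};

constexpr parse_options_t operator|(parse_options_t o, radix r) { o.base = r.value; return o; }
constexpr parse_options_t operator|(parse_options_t o, no_plus_prefix_t) { o.allow_plus = false; return o; }
constexpr parse_options_t operator|(parse_options_t o, digit_separator s) { o.separator = s.value; return o; }

// Assumes an ASCII-compatible encoding where letters are contiguous.
inline int digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'z') return c - 'a' + 10;
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
    return -1;
}

// Hypothetical parse: the whole input must be a number; no whitespace, no base prefix.
inline std::optional<long long> parse(std::string_view s, parse_options_t opt = {}) {
    if (opt.base < 2 || opt.base > 36) return std::nullopt;
    bool negative = false;
    if (!s.empty() && (s.front() == '-' || (s.front() == '+' && opt.allow_plus))) {
        negative = s.front() == '-';
        s.remove_prefix(1);
    }
    const long long min = std::numeric_limits<long long>::min();
    long long value = 0;  // accumulated as a non-positive number so the minimum fits
    bool prev_digit = false;
    for (char c : s) {
        if (opt.separator != '\0' && c == opt.separator) {
            if (!prev_digit) return std::nullopt;  // leading or doubled separator
            prev_digit = false;
            continue;
        }
        const int d = digit_value(c);
        if (d < 0 || d >= opt.base) return std::nullopt;
        if (value < (min + d) / opt.base) return std::nullopt;  // would overflow
        value = value * opt.base - d;
        prev_digit = true;
    }
    if (!prev_digit) return std::nullopt;  // empty input or trailing separator
    if (!negative) {
        if (value == min) return std::nullopt;
        value = -value;
    }
    return value;
}

inline void options_examples() {
    assert(parse("1,234,567", radix(10) | no_plus_prefix | digit_separator(',')) == 1234567);
    assert(!parse("+5", radix(10) | no_plus_prefix));
    assert(parse("+5", radix(10)) == 5);
}
```

Because every fragment is its own type, a call like parse(..., true, false) cannot be written by accident, which addresses the readability complaint about multiple bool arguments.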
--------------080907090201030006090807
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
<html>
<head>
<meta content=3D"text/html; charset=3Dutf-8" http-equiv=3D"Content-Type=
">
</head>
<body bgcolor=3D"#FFFFFF" text=3D"#000000">
<br>
<br>
<div class=3D"moz-cite-prefix">Am 17.06.2015 um 02:47 schrieb Matthew
Fioravante:<br>
</div>
<blockquote
cite=3D"mid:d3567976-4329-4f3f-9b3c-af11fd480e8c@isocpp.org"
type=3D"cite">
<div dir=3D"ltr">
<div><br>
</div>
<div>On Tuesday, June 16, 2015 at 2:22:55 AM UTC-4, Olaf van der
Spek wrote:<br>
</div>
<blockquote class=3D"gmail_quote" style=3D"margin: 0;margin-left:
0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">2015-06-16
5:10 GMT+02:00 Miro Knejp <<a moz-do-not-send=3D"true"
href=3D"javascript:" target=3D"_blank"
gdf-obfuscated-mailto=3D"ofboCiPePAEJ" rel=3D"nofollow"
onmousedown=3D"this.href=3D'javascript:';return true;"
onclick=3D"this.href=3D'javascript:';return true;">miro....@gma=
il.com</a>>:
<br>
>> What's C-specific besides bases?
<br>
>>
<br>
> Prefixes, separators, suffixes. '+' and '-' are
technically not part of the
<br>
> literal but not accepting '-' for negative types is
nonsense. There is no
<br>
> requirement for accepting '+' in either case, it's
redundant and not correct
<br>
> in all use cases.
<br>
<br>
Prefixes aren't allowed when base !=3D 0
<br>
</blockquote>
<div><br>
</div>
<div>FYI they are allowed for strtol(). Also see my earlier
example about prefixes again.</div>
<div><br>
</div>
<div><a class=3D"moz-txt-link-freetext" href=3D"http://en.cpprefere=
nce.com/w/cpp/string/byte/strtol">http://en.cppreference.com/w/cpp/string/b=
yte/strtol</a><br>
</div>
<div>=C2=A0</div>
<blockquote class=3D"gmail_quote" style=3D"margin: 0;margin-left:
0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">What
separators? The dot etc for floating point numbers?
<br>
What suffixes?
<br>
<br>
<br>
> I didn't say don't support signed types, I said don't
force in extras which aren't necessary for the foundation.
Always make those optional.
<br>
<br>
Got a proposal for how to specify what is and what is not
allowed?
<br>
<br>
</blockquote>
<div><br>
</div>
<div>So lets enumerate all of the variations in input for
integers.=C2=A0</div>
<div><br>
</div>
<div>Required rules:</div>
<div>-The digits for the absolute integer value</div>
<div>-A leading '-' for signed inputs.</div>
<div>=C2=A0</div>
<div>Useful optional rules:</div>
<div>- A base prefix (e.g 0x, 0, 0b), required for
auto-detection and optionally allowed for a fixed base.</div>
<div>- A leading '+' which is redundant</div>
<div>
<div>- Supporting digit separators (, or . or any character
and length). This can be useful because you basically have
to reimplement your own parser from scratch if you want to
do this. Digit separators are also allowed in C++ literals
so there is a compatibility argument.</div>
</div>
<div><br>
</div>
</div>
</blockquote>
This seems reasonable.<br>
<blockquote
cite=3D"mid:d3567976-4329-4f3f-9b3c-af11fd480e8c@isocpp.org"
type=3D"cite">
<div dir=3D"ltr">
<div><br>
</div>
<div>Questionable rules:</div>
<div>- A leading '-' for unsigned values, causing a parse into
signed followed by a conversion to unsigned.</div>
<div>- 'true' and 'false' for auto detected bool. <br>
</div>
</div>
</blockquote>
Don't think this is a good idea. Accepting negative numbers for
unsigned types just seems very surprising. And I would say that 0/1
should suffice for bool in the basic version. Maybe go as far as add
two overloads each accepting two chars or two strings for the
true/false case.<br>
<blockquote
cite=3D"mid:d3567976-4329-4f3f-9b3c-af11fd480e8c@isocpp.org"
type=3D"cite">
<div dir=3D"ltr">
<div><br>
</div>
<div>I agree with the others who would like to use this function
with only the necessary rules in order to use it effectively
in a high level strict parsing protocol. These cases are rare
however, </div>
</div>
</blockquote>
How do you know it is rare? Do you have numbers to back this claim?
We are talking about an international standard that is used by
virtually every industry in existence in some way. =E2=80=9CNobody know=
s
what most C++ programmers do.=E2=80=9D<br>
<blockquote
cite=3D"mid:d3567976-4329-4f3f-9b3c-af11fd480e8c@isocpp.org"
type=3D"cite">
<div dir=3D"ltr">
<div>and I think by default doing the most obvious thing is
best. Just like with std::atomic, the default is the slowest
but also the easiest to understand and use quickly.=C2=A0</div>
<div><br>
</div>
<div>I think we should also make base =3D 0 by default instead of
base=3D10.<br>
</div>
</div>
</blockquote>
This is a bad idea. Having a number be interpreted differently
because it has a leading zero should not be the default surprise
unless one has to explicitly enable the octal prefix.<br>
<blockquote
cite=3D"mid:d3567976-4329-4f3f-9b3c-af11fd480e8c@isocpp.org"
type=3D"cite">
<div dir=3D"ltr">
<div><br>
</div>
<div>Here is one interface for this. I'm still using string_view
here to not get back into the discussion of the return /
out-param issue.</div>
<div><br>
</div>
<div>
<div class=3D"prettyprint" style=3D"border: 1px solid rgb(187,
187, 187); word-wrap: break-word; background-color: rgb(250,
250, 250);"><code class=3D"prettyprint">
<div class=3D"subprettyprint"><span style=3D"color: #000;"
class=3D"styled-by-prettify">using prefix_set =3D
std::bitset<36>;<br>
</span><span style=3D"color: #008;"
class=3D"styled-by-prettify">using</span><span
style=3D"color: #000;" class=3D"styled-by-prettify">
digit_separator_char_set </span><span style=3D"color:
#660;" class=3D"styled-by-prettify">=3D</span><span
style=3D"color: #000;" class=3D"styled-by-prettify"> std<=
/span><span
style=3D"color: #660;" class=3D"styled-by-prettify">::</s=
pan><span
style=3D"color: #000;" class=3D"styled-by-prettify">bitse=
t</span><span
style=3D"color: #660;" class=3D"styled-by-prettify"><<=
/span><span
style=3D"color: #066;" class=3D"styled-by-prettify">0x100=
</span><span
style=3D"color: #660;" class=3D"styled-by-prettify">>;=
<br>
</span><span style=3D"color: #000;"
class=3D"styled-by-prettify"><br>
static constexpr auto auto_base =3D 0;<br>
static constexpr auto all_prefixes =3D
prefix_set().flip();<br>
static constexpr auto no_prefixes =3D prefix_set();<br>
static constexpr auto all_digit_separators =3D
digit_separator_char_set().flio();<br>
static constexpr auto no_digit_separators =3D
digit_separator_char_set();<br>
<br>
//Default is allow no digit separators, auto detect
base, all prefixes enabled, leading plus enabled,=C2=A0<b=
r>
</span><span style=3D"color: #008;"
class=3D"styled-by-prettify">template</span><span
style=3D"color: #000;" class=3D"styled-by-prettify"> </sp=
an><span
style=3D"color: #660;" class=3D"styled-by-prettify"><<=
/span><span
style=3D"color: #008;" class=3D"styled-by-prettify">typen=
ame</span><span
style=3D"color: #000;" class=3D"styled-by-prettify"> </sp=
an><span
style=3D"color: #606;" class=3D"styled-by-prettify">Integ=
ral</span><span
style=3D"color: #660;" class=3D"styled-by-prettify">><=
/span><span
style=3D"color: #000;" class=3D"styled-by-prettify"><br>
error_code parse</span><span style=3D"color: #660;"
class=3D"styled-by-prettify">(</span><span style=3D"color=
:
#000;" class=3D"styled-by-prettify">string_view</span><sp=
an
style=3D"color: #660;" class=3D"styled-by-prettify">&=
</span><span
style=3D"color: #000;" class=3D"styled-by-prettify"> tail=
</span><span
style=3D"color: #660;" class=3D"styled-by-prettify">,</sp=
an><span
style=3D"color: #000;" class=3D"styled-by-prettify">
string_view str</span><span style=3D"color: #660;"
class=3D"styled-by-prettify">,</span><span style=3D"color=
:
#000;" class=3D"styled-by-prettify">=C2=A0</span><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 136);">int</span>=
<span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);"> </span><spa=
n
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 136);">base</span=
><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(102, 102, 0);">,</span>=
<span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);">=C2=A0</span=
><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);">digit_separa=
tor_set
digit_separators</span><font style=3D"font-family:
Arial, Helvetica, sans-serif;" color=3D"#000000"><span
class=3D"styled-by-prettify" style=3D"color: rgb(102,
102, 0);">, </span></font><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);">prefix_set
prefixes_enabled</span><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(102, 102, 0);">=3D</spa=
n><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);">all_prefixes=
</span><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(102, 102, 0);">,</span>=
<span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);"> </span><spa=
n
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 136);">bool</span=
><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);">
leading_plus_enabled </span><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(102, 102, 0);">=3D</spa=
n><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);"> </span><spa=
n
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 136);">true</span=
><font
style=3D"font-family: Arial, Helvetica, sans-serif;"
color=3D"#000000"><span style=3D"color: #660;"
class=3D"styled-by-prettify">}</span></font><font
style=3D"font-family: Arial, Helvetica, sans-serif;"
color=3D"#000000"><span style=3D"color: #660;"
class=3D"styled-by-prettify">);</span></font></div>
<div class=3D"subprettyprint"><span style=3D"color: #008;"
class=3D"styled-by-prettify">template</span><span
style=3D"color: #000;" class=3D"styled-by-prettify"> </sp=
an><span
style=3D"color: #660;" class=3D"styled-by-prettify"><<=
/span><span
style=3D"color: #008;" class=3D"styled-by-prettify">typen=
ame</span><span
style=3D"color: #000;" class=3D"styled-by-prettify"> </sp=
an><span
style=3D"color: #606;" class=3D"styled-by-prettify">Integ=
ral</span><span
style=3D"color: #660;" class=3D"styled-by-prettify">><=
/span><span
style=3D"color: #000;" class=3D"styled-by-prettify"><br>
error_code parse</span><span style=3D"color: #660;"
class=3D"styled-by-prettify">(</span><span style=3D"color=
:
#000;" class=3D"styled-by-prettify">string_view</span><sp=
an
style=3D"color: #660;" class=3D"styled-by-prettify">&=
</span><span
style=3D"color: #000;" class=3D"styled-by-prettify"> tail=
</span><span
style=3D"color: #660;" class=3D"styled-by-prettify">,</sp=
an><span
style=3D"color: #000;" class=3D"styled-by-prettify">
string_view str</span><span style=3D"color: #660;"
class=3D"styled-by-prettify">,</span><span style=3D"color=
:
#000;" class=3D"styled-by-prettify"> </span><span
style=3D"color: #008;" class=3D"styled-by-prettify">int</=
span><span
style=3D"color: #000;" class=3D"styled-by-prettify"> </sp=
an><span
style=3D"color: #008;" class=3D"styled-by-prettify">base<=
/span><span
style=3D"color: #660;" class=3D"styled-by-prettify">,=C2=
=A0</span><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 136);">char</span=
><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);">=C2=A0digit_=
separator</span><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 136, 0);">,=C2=A0</s=
pan><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);">prefix_set
prefixes_enabled</span><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(102, 102, 0);">=3D</spa=
n><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);">all_prefixes=
</span><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(102, 102, 0);">,</span>=
<span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);"> </span><spa=
n
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 136);">bool</span=
><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);">
leading_plus_enabled </span><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(102, 102, 0);">=3D</spa=
n><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);"> </span><spa=
n
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 136);">true</span=
><font
style=3D"font-family: Arial, Helvetica, sans-serif;"
color=3D"#000000"><span style=3D"color: #660;"
class=3D"styled-by-prettify">);<br>
</span></font><span class=3D"styled-by-prettify"
style=3D"font-family: Arial, Helvetica, sans-serif;
color: rgb(0, 0, 136);">template</span><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);">=C2=A0</span=
><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(102, 102, 0);"><</sp=
an><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 136);">typename</=
span><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);">=C2=A0</span=
><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(102, 0, 102);">Integral=
</span><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(102, 102, 0);">></sp=
an><font
style=3D"font-family: Arial, Helvetica, sans-serif;"
color=3D"#000000"><span style=3D"color: #660;"
class=3D"styled-by-prettify"><br>
</span></font></div>
<span class=3D"styled-by-prettify" style=3D"color: rgb(0, 0,
0);">inline error_code parse</span><span
class=3D"styled-by-prettify" style=3D"color: rgb(102, 102,
0);">(</span><span class=3D"styled-by-prettify"
style=3D"color: rgb(0, 0, 0);">string_view</span><span
class=3D"styled-by-prettify" style=3D"color: rgb(102, 102,
0);">&</span><span class=3D"styled-by-prettify"
style=3D"color: rgb(0, 0, 0);">=C2=A0tail</span><span
class=3D"styled-by-prettify" style=3D"color: rgb(102, 102,
0);">,</span><span class=3D"styled-by-prettify"
style=3D"color: rgb(0, 0, 0);">=C2=A0string_view str</span>=
<span
class=3D"styled-by-prettify" style=3D"color: rgb(102, 102,
0);">,</span><span class=3D"styled-by-prettify"
style=3D"color: rgb(0, 0, 0);">=C2=A0</span><span
class=3D"styled-by-prettify" style=3D"color: rgb(0, 0,
136);">int</span><span class=3D"styled-by-prettify"
style=3D"color: rgb(0, 0, 0);">=C2=A0</span><span
class=3D"styled-by-prettify" style=3D"color: rgb(0, 0,
136);">base=3D0</span><span class=3D"styled-by-prettify"
style=3D"color: rgb(102, 102, 0);">,</span><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 136, 0);">=C2=A0</span=
><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);">prefix_set
prefixes_enabled</span><span class=3D"styled-by-prettify"
style=3D"font-family: Arial, Helvetica, sans-serif; color:
rgb(102, 102, 0);">=3D</span><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);">all_prefixes</=
span><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(102, 102, 0);">,</span><s=
pan
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);">=C2=A0</span><=
span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 136);">bool</span><=
span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);">=C2=A0leading_=
plus_enabled=C2=A0</span><span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(102, 102, 0);">=3D</span>=
<span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 0);">=C2=A0</span><=
span
class=3D"styled-by-prettify" style=3D"font-family: Arial,
Helvetica, sans-serif; color: rgb(0, 0, 136);">true</span><=
font
style=3D"font-family: Arial, Helvetica, sans-serif;"
color=3D"#000000"><span class=3D"styled-by-prettify"
style=3D"color: rgb(102, 102, 0);">) {<br>
=C2=A0 return parse(tail, str, base, no_digit_separators,
prefixes_enabled, leading_plus_enabled);<br>
</span></font>
<div class=3D"subprettyprint"><font color=3D"#000000"><span
style=3D"color: #000;" class=3D"styled-by-prettify">}=
=C2=A0</span></font></div>
</code></div>
<div><br>
</div>
</div>
</div>
</blockquote>
Why 36 prefix bits? There are only 3 available choices: 0(x|X), 0,
0(b|B). <br>
<br>
I wouldn't add multiple digit separator support. You should know
what character you accept as digit separator. Remember these calls
are locale-independent to be fast and simple. You most likely know
the format of your input and I doubt it has mixed digit separators.<br>
<br>
Ignoring the input/output arguments this should do the job<br>
... parse(..., int radix) // The default accepts '+' but no
separators<br>
... parse(..., int radix, no_plus_prefix_t)<br>
... parse(..., int radix, char separator)<br>
... parse(..., int radix, no_plus_prefix_t, char separator)<br>
<br>
One *could* squeeze the plus prefix option into the radix param
since its only valid values are in [0;36] and there are plenty of
bits left. Give it a custom type and you have yourself a pretty
fluent interface:<br>
<br>
... parse(..., parse_options_t); // parse_options_t is a concrete
type (no int or enum)<br>
<br>
parse(..., radix(10) | no_plus_prefix | digit_separator(','))<br>
<br>
<br>
</body>
</html>
--------------080907090201030006090807--
.
Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Wed, 17 Jun 2015 10:48:54 -0400
On 2015-06-17 07:25, Olaf van der Spek wrote:
> I really think we should keep the proposal simple and worry about
> disallowing stuff later.
>
> Though a parse_unsigned variant would allow one to build a strict
> parser oneself, and still be quite simple.
+1, that's what I was thinking as well... have an integer parser that
*only* parses digits (no prefix, *no sign allowed*). Use an unsigned
type so as to preserve maximum range. Maybe even base-10 only (i.e. have
both a base-10 optimized version and a user base - with no prefix
detection - version).
I'd like to also see standardized a more fully functional parser on top
of that, but I can relate to the points raised about needing to support
the simplest and fastest case. Probably this would detect and take note
of leading prefix characters (sign, base indicator) and call the
appropriate lower level function, then apply a sign transform if needed.
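
A rough sketch of that layering, assuming C++17 (std::string_view, std::optional). The names parse_unsigned and parse_signed are hypothetical, not a proposed API; this version is base 10 only and leaves out base-indicator detection:

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>
#include <string_view>

// Hypothetical low-level primitive: decimal digits only, no sign, no prefix.
// On success the parsed digits are removed from the front of `in`.
// Returns nullopt (leaving `in` untouched) if there are no digits or on overflow.
inline std::optional<std::uint64_t> parse_unsigned(std::string_view& in) {
    constexpr std::uint64_t max = std::numeric_limits<std::uint64_t>::max();
    std::uint64_t value = 0;
    std::size_t i = 0;
    for (; i < in.size() && in[i] >= '0' && in[i] <= '9'; ++i) {
        const std::uint64_t digit = static_cast<std::uint64_t>(in[i] - '0');
        if (value > (max - digit) / 10) return std::nullopt;  // value * 10 + digit would overflow
        value = value * 10 + digit;
    }
    if (i == 0) return std::nullopt;  // no digits at all
    in.remove_prefix(i);
    return value;
}

// Hypothetical higher-level function: notes an optional leading sign, calls
// the low-level parser, then applies the sign transform with a range check.
inline std::optional<std::int64_t> parse_signed(std::string_view& in) {
    std::string_view rest = in;
    bool negative = false;
    if (!rest.empty() && (rest.front() == '-' || rest.front() == '+')) {
        negative = rest.front() == '-';
        rest.remove_prefix(1);
    }
    const std::optional<std::uint64_t> magnitude = parse_unsigned(rest);
    if (!magnitude) return std::nullopt;
    constexpr std::uint64_t max_positive =
        static_cast<std::uint64_t>(std::numeric_limits<std::int64_t>::max());
    if (*magnitude > max_positive + (negative ? 1u : 0u)) return std::nullopt;
    in = rest;
    if (!negative) return static_cast<std::int64_t>(*magnitude);
    if (*magnitude == max_positive + 1) return std::numeric_limits<std::int64_t>::min();
    return -static_cast<std::int64_t>(*magnitude);
}
```

A strict parser can call parse_unsigned directly and reject anything else itself, while casual callers go through parse_signed.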
--
Matthew
.
Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Wed, 17 Jun 2015 10:53:30 -0400
On 2015-06-17 10:31, Miro Knejp wrote:
> I would say that 0/1 should suffice for bool in the basic version.
> Maybe go as far as add two overloads each accepting two chars or two
> strings for the true/false case.
Accepting regexes would be better; that way you can say that true is
e.g. "1|y(es)?|t(rue)?" (case insensitive), rather than being limited to
exactly one accepted string. (Having faster overloads that take char or
string literals is fine too, but if we have those, I would also have a
regex overload.)
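
For illustration only, such a regex overload might look roughly like this with std::regex. The name parse_bool and its signature are hypothetical, and the pattern for false used in the usage note is my assumption, mirroring the one for true:

```cpp
#include <optional>
#include <regex>
#include <string_view>

// Hypothetical helper: the whole input must match one of the two
// caller-supplied patterns; nullopt means neither pattern matched.
inline std::optional<bool> parse_bool(std::string_view text,
                                      const std::regex& true_pattern,
                                      const std::regex& false_pattern) {
    if (std::regex_match(text.begin(), text.end(), true_pattern)) return true;
    if (std::regex_match(text.begin(), text.end(), false_pattern)) return false;
    return std::nullopt;
}
```

A caller would build the patterns once, e.g. std::regex("1|y(es)?|t(rue)?", std::regex::icase) and std::regex("0|n(o)?|f(alse)?", std::regex::icase), and reuse them across calls.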
--
Matthew
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Wed, 17 Jun 2015 08:20:11 -0700 (PDT)
------=_Part_299_1824107234.1434554411459
Content-Type: multipart/alternative;
boundary="----=_Part_300_293737266.1434554411459"
------=_Part_300_293737266.1434554411459
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
On Wednesday, June 17, 2015 at 7:25:11 AM UTC-4, Olaf van der Spek wrote:
>
> 2015-06-17 2:47 GMT+02:00 Matthew Fioravante <fmatth...@gmail.com>:
> >> Prefixes aren't allowed when base != 0
> >
> >
> > FYI they are allowed for strtol(). Also see my earlier example about
> > prefixes again.
>
> I should have said when base isn't 0 or 16.
>
Also 8 and 2.

> When base is 10 prefixes aren't allowed (AFAIK).
>
Yes, but prefixes don't make sense there. There is no prefix for base 10, or
for any base other than 2, 8 and 16.
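
To make the strtol() behavior concrete, here is a small sketch. The wrapper parse_with_strtol is a hypothetical name of mine; only std::strtol itself is real. Base 2 is left out on purpose, since whether strtol accepts a 0b prefix depends on the C library version:

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical wrapper around std::strtol: returns the parsed value and
// reports how many characters of `text` were consumed.
inline long parse_with_strtol(const char* text, int base, std::size_t& consumed) {
    char* end = nullptr;
    const long value = std::strtol(text, &end, base);
    consumed = static_cast<std::size_t>(end - text);
    return value;
}
```

With base 16 the input "0x1A" is consumed entirely and yields 26; with base 10 only the leading "0" is consumed, so parsing stops at the 'x'.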
>
> > I think we should also make base = 0 by default instead of base=10.
>
> Why?
>
I was thinking the default behavior should enable all numeric formats. But
I'm fine with default = 10 or whatever.

On Wednesday, June 17, 2015 at 10:31:04 AM UTC-4, Miro Knejp wrote:
>
> On 17.06.2015 at 02:47, Matthew Fioravante wrote:
>
> On Tuesday, June 16, 2015 at 2:22:55 AM UTC-4, Olaf van der Spek wrote:
>
>> 2015-06-16 5:10 GMT+02:00 Miro Knejp <miro....@gmail.com>:
>> >> What's C-specific besides bases?
>> >>
>> > Prefixes, separators, suffixes. '+' and '-' are technically not part of
>> > the literal but not accepting '-' for negative types is nonsense. There
>> > is no requirement for accepting '+' in either case, it's redundant and
>> > not correct in all use cases.
>>
>> Prefixes aren't allowed when base != 0
>>
>
> FYI they are allowed for strtol(). Also see my earlier example about
> prefixes again.
>
> http://en.cppreference.com/w/cpp/string/byte/strtol
>
>> What separators? The dot etc for floating point numbers?
>> What suffixes?
>>
>>
>> > I didn't say don't support signed types, I said don't force in extras
>> > which aren't necessary for the foundation. Always make those optional.
>>
>> Got a proposal for how to specify what is and what is not allowed?
>>
> So let's enumerate all of the variations in input for integers.
>
> Required rules:
> - The digits for the absolute integer value
> - A leading '-' for signed inputs.
>
> Useful optional rules:
> - A base prefix (e.g. 0x, 0, 0b), required for auto-detection and
>   optionally allowed for a fixed base.
> - A leading '+' which is redundant
> - Supporting digit separators (, or . or any character and length). This
>   can be useful because you basically have to reimplement your own parser
>   from scratch if you want to do this. Digit separators are also allowed
>   in C++ literals so there is a compatibility argument.
>
> This seems reasonable.
>
> Questionable rules:
> - A leading '-' for unsigned values, causing a parse into signed followed
>   by a conversion to unsigned.
> - 'true' and 'false' for auto detected bool.
>
> Don't think this is a good idea. Accepting negative numbers for unsigned
> types just seems very surprising.
>
I agree with you here but I'm still mentioning it for completeness. Any
proposal should probably discuss this rule and justify why it was deemed a
bad idea.

> And I would say that 0/1 should suffice for bool in the basic version.
> Maybe go as far as add two overloads each accepting two chars or two
> strings for the true/false case.
>
0 and 1 is fine. I really don't care too much about true/false but it could
be nice to have.

>
> I agree with the others who would like to use this function with only
> the necessary rules in order to use it effectively in a high level strict
> parsing protocol. These cases are rare however,
>
> How do you know it is rare? Do you have numbers to back this claim? We are
> talking about an international standard that is used by virtually every
> industry in existence in some way. “Nobody knows what most C++ programmers
> do.”
>
I can't speak for other people but 99% of the time I'm doing this I'm not
so strict on limiting the input grammar to some external specification.
I'll take whatever the standard function does and that's usually good
enough. The defaults should be aimed at something easy to use standalone by
someone wanting to develop something quickly. The defaults should not be
"surprising" to novices. Users with strict requirements have to learn the
parameters and specify them carefully to match the desired behavior.

> and I think by default doing the most obvious thing is best. Just like
> with std::atomic, the default is the slowest but also the easiest to
> understand and use quickly.
>
> I think we should also make base = 0 by default instead of base=10.
>
> This is a bad idea. Having a number be interpreted differently because it
> has a leading zero should not be the default surprise unless one has to
> explicitly enable the octal prefix.
>
But if you use a leading 0 for a C++ integer literal you get the same
behavior. That being said, I would bet leading 0's in input text are more
often meant to be parsed as decimal, not octal, so you may be right that
default = 10 is best. The 0 prefix for octal is unfortunate but it's been
around forever and is known by the whole world, so changing it is a
non-starter.

It looks like other languages use base 10 as the default as well. For
example, in Python int("0234") == 234.
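
The difference is easy to see with strtol() itself. In this sketch the two helper names are hypothetical; base 0 asks strtol to auto-detect the base from the prefix, which is where the leading-zero surprise comes from:

```cpp
#include <cstdlib>

// An integer literal with a leading 0 is octal in C++: 2*64 + 3*8 + 4.
static_assert(0234 == 156, "leading 0 makes the literal octal");

// Hypothetical helper: base 0 lets strtol auto-detect the base from the prefix.
inline long parse_auto_base(const char* text) {
    return std::strtol(text, nullptr, 0);
}

// Hypothetical helper: base 10 treats a leading 0 as an ordinary digit.
inline long parse_decimal(const char* text) {
    return std::strtol(text, nullptr, 10);
}
```

So parse_auto_base("0234") gives 156, matching the literal, while parse_decimal("0234") gives 234, matching the Python result.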
>
> Here is one interface for this. I'm still using string_view here to not
> get back into the discussion of the return / out-param issue.
>
> using prefix_set = std::bitset<36>;
> using digit_separator_char_set = std::bitset<0x100>;
>
> static constexpr auto auto_base = 0;
> // bitset::flip() is not constexpr before C++23, hence const here
> static const auto all_prefixes = prefix_set().flip();
> static constexpr auto no_prefixes = prefix_set();
> static const auto all_digit_separators = digit_separator_char_set().flip();
> static constexpr auto no_digit_separators = digit_separator_char_set();
>
> // Default: no digit separators, auto-detect base, all prefixes enabled,
> // leading plus enabled.
> template <typename Integral>
> error_code parse(string_view& tail, string_view str, int base,
>     digit_separator_char_set digit_separators,
>     prefix_set prefixes_enabled = all_prefixes,
>     bool leading_plus_enabled = true);
> template <typename Integral>
> error_code parse(string_view& tail, string_view str, int base,
>     char digit_separator,
>     prefix_set prefixes_enabled = all_prefixes,
>     bool leading_plus_enabled = true);
> template <typename Integral>
> inline error_code parse(string_view& tail, string_view str, int base = 0,
>     prefix_set prefixes_enabled = all_prefixes,
>     bool leading_plus_enabled = true) {
>   // Integral cannot be deduced from the arguments, so pass it explicitly
>   return parse<Integral>(tail, str, base, no_digit_separators,
>       prefixes_enabled, leading_plus_enabled);
> }
>
> Why 36 prefix bits? There are only 3 available choices: 0(x|X), 0,
> 0(b|B).
>
> I wouldn't add multiple digit separator support. You should know what
> character you accept as digit separator. Remember these calls are
> locale-independent to be fast and simple. You most likely know the format
> of your input and I doubt it has mixed digit separators.
>
Yes, looking back at it I agree with you. Multiple separators just add
too much complexity.

>
> Ignoring the input/output arguments this should do the job
> ... parse(..., int radix) // The default accepts '+' but no separators
> ... parse(..., int radix, no_plus_prefix_t)
> ... parse(..., int radix, char separator)
> ... parse(..., int radix, no_plus_prefix_t, char separator)
>
> One *could* squeeze the plus prefix option into the radix param since its
> only valid values are in [0;36] and there are plenty of bits left. Give it
> a custom type and you have yourself a pretty fluent interface:
>
> ... parse(..., parse_options_t); // parse_options_t is a concrete type (no
> int or enum)
>
> parse(..., radix(10) | no_plus_prefix | digit_separator(','))
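
For what it's worth, the fluent options idea quoted above could be sketched along these lines. The names radix, no_plus_prefix, digit_separator and parse_options_t come from the quoted mail; the member names, the bit constants and the merge rule of operator| are assumptions of mine, and the parse function itself is left out:

```cpp
// Hypothetical sketch of a concrete options type (no int or enum).
struct parse_options_t {
    int radix_value = 10;         // numeric base, expected to be in [0;36]
    bool plus_allowed = true;     // whether a leading '+' is accepted
    char separator = '\0';        // digit separator, '\0' meaning none
    unsigned explicitly_set = 0;  // which fields the caller set explicitly
};

constexpr unsigned radix_bit = 1u;
constexpr unsigned plus_bit = 2u;
constexpr unsigned separator_bit = 4u;

constexpr parse_options_t radix(int r) {
    return parse_options_t{r, true, '\0', radix_bit};
}

constexpr parse_options_t digit_separator(char c) {
    return parse_options_t{10, true, c, separator_bit};
}

constexpr parse_options_t no_plus_prefix{10, false, '\0', plus_bit};

// Combines two option sets: fields explicitly set on the right-hand side
// override the left-hand side, everything else is kept as it was.
constexpr parse_options_t operator|(parse_options_t a, parse_options_t b) {
    if (b.explicitly_set & radix_bit) a.radix_value = b.radix_value;
    if (b.explicitly_set & plus_bit) a.plus_allowed = b.plus_allowed;
    if (b.explicitly_set & separator_bit) a.separator = b.separator;
    a.explicitly_set |= b.explicitly_set;
    return a;
}
```

Tracking which fields were set explicitly keeps the order of the operands irrelevant as long as no option is given twice.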
------=_Part_300_293737266.1434554411459
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
<div dir=3D"ltr"><br><br>On Wednesday, June 17, 2015 at 7:25:11 AM UTC-4, O=
laf van der Spek wrote:<blockquote class=3D"gmail_quote" style=3D"margin: 0=
;margin-left: 0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">2015-06=
-17 2:47 GMT+02:00 Matthew Fioravante <<a target=3D"_blank" rel=3D"nofol=
low">fmatth...@gmail.com</a>>:
<br>>> Prefixes aren't allowed when base !=3D 0
<br>>
<br>>
<br>> FYI they are allowed for strtol(). Also see my earlier example abo=
ut
<br>> prefixes again.
<br>
<br>I should have said when base isn't 0 or 16.
<br></blockquote><div><br>Also 8 and 2.<br> <br></div><blockquote clas=
s=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8ex;border-left: 1px #c=
cc solid;padding-left: 1ex;">When base is 10 prefixes aren't allowed (AFAIK=
).
<br></blockquote><div><br>Yes but prefixes don't make sense here. There is =
no prefix for 10 or anything else not 2,8,16.<br></div><blockquote class=3D=
"gmail_quote" style=3D"margin: 0;margin-left: 0.8ex;border-left: 1px #ccc s=
olid;padding-left: 1ex;">
<br>> I think we should also make base =3D 0 by default instead of base=
=3D10.
<br>
<br>Why?
<br></blockquote><div><br>I was thinking the default behavior to enable all=
numeric formats. But I'm fine with default =3D 10 or whatever.<br>&n=
bsp;<br></div><br>On Wednesday, June 17, 2015 at 10:31:04 AM UTC-4, Miro Kn=
ejp wrote:<blockquote class=3D"gmail_quote" style=3D"margin: 0;margin-left:=
0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">
=20
=20
=20
<div bgcolor=3D"#FFFFFF" text=3D"#000000">
<br>
<br>
<div>Am 17.06.2015 um 02:47 schrieb Matthew
Fioravante:<br>
</div>
<blockquote type=3D"cite">
<div dir=3D"ltr">
<div><br>
</div>
<div>On Tuesday, June 16, 2015 at 2:22:55 AM UTC-4, Olaf van der
Spek wrote:<br>
</div>
<blockquote class=3D"gmail_quote" style=3D"margin:0;margin-left:0.8=
ex;border-left:1px #ccc solid;padding-left:1ex">2015-06-16
5:10 GMT+02:00 Miro Knejp <<a rel=3D"nofollow">miro....@gmail.=
com</a>>:
<br>
>> What's C-specific besides bases?
<br>
>>
<br>
> Prefixes, separators, suffixes. '+' and '-' are
technically not part of the
<br>
> literal but not accepting '-' for negative types is
nonsense. There is no
<br>
> requirement for accepting '+' in either case, it's
redundant and not correct
<br>
> in all use cases.
<br>
<br>
Prefixes aren't allowed when base !=3D 0
<br>
</blockquote>
<div><br>
</div>
<div>FYI they are allowed for strtol(). Also see my earlier
example about prefixes again.</div>
<div><br>
</div>
<div><a href=3D"http://en.cppreference.com/w/cpp/string/byte/strtol=
" target=3D"_blank" rel=3D"nofollow" onmousedown=3D"this.href=3D'http://www=
..google.com/url?q\75http%3A%2F%2Fen.cppreference.com%2Fw%2Fcpp%2Fstring%2Fb=
yte%2Fstrtol\46sa\75D\46sntz\0751\46usg\75AFQjCNFQpAHDrq2OUNmm4Dy8Mmc1NQoGG=
A';return true;" onclick=3D"this.href=3D'http://www.google.com/url?q\75http=
%3A%2F%2Fen.cppreference.com%2Fw%2Fcpp%2Fstring%2Fbyte%2Fstrtol\46sa\75D\46=
sntz\0751\46usg\75AFQjCNFQpAHDrq2OUNmm4Dy8Mmc1NQoGGA';return true;">http://=
en.cppreference.com/w/<wbr>cpp/string/byte/strtol</a><br>
</div>
<div> </div>
<blockquote class=3D"gmail_quote" style=3D"margin:0;margin-left:0.8=
ex;border-left:1px #ccc solid;padding-left:1ex">What
separators? The dot etc for floating point numbers?
<br>
What suffixes?
<br>
<br>
<br>
> I didn't say don't support signed types, I said don't
force in extras which aren't necessary for the foundation.
Always make those optional.
<br>
<br>
Got a proposal for how to specify what is and what is not
allowed?
<br>
<br>
</blockquote>
<div><br>
</div>
<div>So lets enumerate all of the variations in input for
integers. </div>
<div><br>
</div>
<div>Required rules:</div>
<div>-The digits for the absolute integer value</div>
<div>-A leading '-' for signed inputs.</div>
<div> </div>
<div>Useful optional rules:</div>
<div>- A base prefix (e.g 0x, 0, 0b), required for
auto-detection and optionally allowed for a fixed base.</div>
<div>- A leading '+' which is redundant</div>
<div>
<div>- Supporting digit separators (, or . or any character
and length). This can be useful because you basically have
to reimplement your own parser from scratch if you want to
do this. Digit separators are also allowed in C++ literals
so there is a compatibility argument.</div>
</div>
<div><br>
</div>
</div>
</blockquote>
This seems reasonable.<br>
<blockquote type=3D"cite">
<div dir=3D"ltr">
<div><br>
</div>
<div>Questionable rules:</div>
<div>- A leading '-' for unsigned values, causing a parse into
signed followed by a conversion to unsigned.</div>
<div>- 'true' and 'false' for auto detected bool. <br>
</div>
</div>
</blockquote>
Don't think this is a good idea. Accepting negative numbers for
unsigned types just seems very surprising.</div></blockquote><div><br>I=
agree with you here but I'm still mentioning it for completeness. Any prop=
osal should probably discuss this rule and justify why it was deemed a bad =
idea.<br> </div><blockquote class=3D"gmail_quote" style=3D"margin: 0;m=
argin-left: 0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;"><div bgco=
lor=3D"#FFFFFF" text=3D"#000000"> And I would say that 0/1
should suffice for bool in the basic version. Maybe go as far as add
two overloads each accepting two chars or two strings for the
true/false case.<br></div></blockquote><div><br>0 and 1 is fine. I real=
ly don't care too much about true/false but it could be nice to have.<br>&n=
bsp;</div><blockquote class=3D"gmail_quote" style=3D"margin: 0;margin-left:=
0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;"><div bgcolor=3D"#FFF=
FFF" text=3D"#000000">
<blockquote type=3D"cite">
<div dir=3D"ltr">
<div><br>
</div>
<div>I agree with the others who would like to use this function
with only the necessary rules in order to use it effectively
in a high level strict parsing protocol. These cases are rare
however, </div>
</div>
</blockquote>
How do you know it is rare? Do you have numbers to back this claim?
We are talking about an international standard that is used by
virtually every industry in existence in some way. =E2=80=9CNobody know=
s
what most C++ programmers do.=E2=80=9D<br></div></blockquote><div><br>I=
can't speak for other people but 99% of the time I'm doing this I'm not so=
strict on limiting the input grammar to some external specification. I'll =
take whatever the standard function does and that's usually good enough. Th=
e defaults should be aimed at something easy to use standalone by someone w=
anting to develop something quickly. The defaults should not be "surprising=
" to novices. Users with strict requirements have to learn the parameters a=
nd specify them carefully to match the desired behavior.<br> </div><bl=
ockquote class=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8ex;border=
-left: 1px #ccc solid;padding-left: 1ex;"><div bgcolor=3D"#FFFFFF" text=3D"=
#000000">
<blockquote type=3D"cite">
<div dir=3D"ltr">
<div>and I think by default doing the most obvious thing is
best. Just like with std::atomic, the default is the slowest
but also the easiest to understand and use quickly. </div>
<div><br>
</div>
<div>I think we should also make base =3D 0 by default instead of
base=3D10.<br>
</div>
</div>
</blockquote>
This is a bad idea. Having a number be interpreted differently
because it has a leading zero should not be the default surprise
unless one has to explicitly enable the octal prefix.<br></div></blockq=
uote><div><br>But if you use a leading 0 for a C++ integer literal you get =
the same behavior. That being said, I would bet leading 0's in input text t=
ypically are more often meant to be parsed as decimal, not octal so you may=
be right that default =3D10 is best. The 0 prefix for octal is unfortunate =
but its been around forever and known by the whole world so changing it is =
a non-starter.<br><br>It looks like other languages use base=3D10 default a=
s well. For example, in python int("0234") =3D=3D 234.<br></div><blockquote=
class=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8ex;border-left: 1=
px #ccc solid;padding-left: 1ex;"><div bgcolor=3D"#FFFFFF" text=3D"#000000"=
>
<blockquote type=3D"cite">
<div dir=3D"ltr">
<div><br>
</div>
<div>Here is one interface for this. I'm still using string_view
here to not get back into the discussion of the return /
out-param issue.</div>
<div><br>
</div>
<div>
<div style=3D"border:1px solid rgb(187,187,187);word-wrap:break-w=
ord;background-color:rgb(250,250,250)"><code>
<div><span style=3D"color:#000">using prefix_set =3D
std::bitset<36>;<br>
</span><span style=3D"color:#008">using</span><span style=
=3D"color:#000">
digit_separator_char_set </span><span style=3D"color:#660=
">=3D</span><span style=3D"color:#000"> std</span><span style=3D"color:#660=
">::</span><span style=3D"color:#000">bitset</span><span style=3D"color:#66=
0"><</span><span style=3D"color:#066">0x100</span><span style=3D"color:#=
660">>;<br>
</span><span style=3D"color:#000"><br>
static constexpr auto auto_base =3D 0;<br>
static constexpr auto all_prefixes =3D
prefix_set().flip();<br>
static constexpr auto no_prefixes =3D prefix_set();<br>
static constexpr auto all_digit_separators =3D
digit_separator_char_set().<wbr>flio();<br>
static constexpr auto no_digit_separators =3D
digit_separator_char_set();<br>
<br>
//Default is allow no digit separators, auto detect
base, all prefixes enabled, leading plus enabled, <b=
r>
</span><span style=3D"color:#008">template</span><span styl=
e=3D"color:#000"> </span><span style=3D"color:#660"><</span><span style=
=3D"color:#008">typename</span><span style=3D"color:#000"> </span><span sty=
le=3D"color:#606">Integral</span><span style=3D"color:#660">></span><spa=
n style=3D"color:#000"><br>
error_code parse</span><span style=3D"color:#660">(</span=
><span style=3D"color:#000">string_view</span><span style=3D"color:#660">&a=
mp;</span><span style=3D"color:#000"> tail</span><span style=3D"color:#660"=
>,</span><span style=3D"color:#000">
string_view str</span><span style=3D"color:#660">,</span>=
<span style=3D"color:#000"> </span><span style=3D"font-family:Arial,He=
lvetica,sans-serif;color:rgb(0,0,136)">int</span><span style=3D"font-family=
:Arial,Helvetica,sans-serif;color:rgb(0,0,0)"> </span><span style=3D"font-f=
amily:Arial,Helvetica,sans-serif;color:rgb(0,0,136)">base</span><span style=
=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(102,102,0)">,</span><s=
pan style=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(0,0,0)"> =
;</span><span style=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(0,0=
,0)">digit_separator_set
digit_separators</span><font style=3D"font-family:Arial,H=
elvetica,sans-serif" color=3D"#000000"><span style=3D"color:rgb(102,102,0)"=
>, </span></font><span style=3D"font-family:Arial,Helvetica,sans-serif;colo=
r:rgb(0,0,0)">prefix_set
prefixes_enabled</span><span style=3D"font-family:Arial,H=
elvetica,sans-serif;color:rgb(102,102,0)">=3D</span><span style=3D"font-fam=
ily:Arial,Helvetica,sans-serif;color:rgb(0,0,0)">all_prefixes</span><span s=
tyle=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(102,102,0)">,</spa=
n><span style=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(0,0,0)"> =
</span><span style=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(0,0,=
136)">bool</span><span style=3D"font-family:Arial,Helvetica,sans-serif;colo=
r:rgb(0,0,0)">
leading_plus_enabled </span><span style=3D"font-family:Ar=
ial,Helvetica,sans-serif;color:rgb(102,102,0)">=3D</span><span style=3D"fon=
t-family:Arial,Helvetica,sans-serif;color:rgb(0,0,0)"> </span><span style=
=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(0,0,136)">true</span><=
font style=3D"font-family:Arial,Helvetica,sans-serif" color=3D"#000000"><sp=
an style=3D"color:#660">}</span></font><font style=3D"font-family:Arial,Hel=
vetica,sans-serif" color=3D"#000000"><span style=3D"color:#660">);</span></=
font></div>
<div><span style=3D"color:#008">template</span><span style=3D=
"color:#000"> </span><span style=3D"color:#660"><</span><span style=3D"c=
olor:#008">typename</span><span style=3D"color:#000"> </span><span style=3D=
"color:#606">Integral</span><span style=3D"color:#660">></span><span sty=
le=3D"color:#000"><br>
error_code parse</span><span style=3D"color:#660">(</span=
><span style=3D"color:#000">string_view</span><span style=3D"color:#660">&a=
mp;</span><span style=3D"color:#000"> tail</span><span style=3D"color:#660"=
>,</span><span style=3D"color:#000">
string_view str</span><span style=3D"color:#660">,</span>=
<span style=3D"color:#000"> </span><span style=3D"color:#008">int</span><sp=
an style=3D"color:#000"> </span><span style=3D"color:#008">base</span><span=
style=3D"color:#660">, </span><span style=3D"font-family:Arial,Helvet=
ica,sans-serif;color:rgb(0,0,136)">char</span><span style=3D"font-family:Ar=
ial,Helvetica,sans-serif;color:rgb(0,0,0)"> digit_separator</span><spa=
n style=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(0,136,0)">,&nbs=
p;</span><span style=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(0,=
0,0)">pr<wbr>efix_set
prefixes_enabled</span><span style=3D"font-family:Arial,H=
elvetica,sans-serif;color:rgb(102,102,0)">=3D</span><span style=3D"font-fam=
ily:Arial,Helvetica,sans-serif;color:rgb(0,0,0)">all_prefixes</span><span s=
tyle=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(102,102,0)">,</spa=
n><span style=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(0,0,0)"> =
</span><span style=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(0,0,=
136)">bool</span><span style=3D"font-family:Arial,Helvetica,sans-serif;colo=
r:rgb(0,0,0)">
leading_plus_enabled </span><span style=3D"font-family:Ar=
ial,Helvetica,sans-serif;color:rgb(102,102,0)">=3D</span><span style=3D"fon=
t-family:Arial,Helvetica,sans-serif;color:rgb(0,0,0)"> </span><span style=
=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(0,0,136)">true</span><=
font style=3D"font-family:Arial,Helvetica,sans-serif" color=3D"#000000"><sp=
an style=3D"color:#660">);<br>
</span></font><span style=3D"font-family:Arial,Helvetica,=
sans-serif;color:rgb(0,0,136)">template</span><span style=3D"font-family:Ar=
ial,Helvetica,sans-serif;color:rgb(0,0,0)"> </span><span style=3D"font=
-family:Arial,Helvetica,sans-serif;color:rgb(102,102,0)"><</span><span s=
tyle=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(0,0,136)">typename=
</span><span style=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(0,0,=
0)"> </span><span style=3D"font-family:Arial,Helvetica,sans-serif;colo=
r:rgb(102,0,102)">Integral</span><span style=3D"font-family:Arial,Helvetica=
,sans-serif;color:rgb(102,102,0)">></span><font style=3D"font-family:Ari=
al,Helvetica,sans-serif" color=3D"#000000"><span style=3D"color:#660"><br>
</span></font></div>
<span style=3D"color:rgb(0,0,0)">inline error_code parse</spa=
n><span style=3D"color:rgb(102,102,0)">(</span><span style=3D"color:rgb(0,0=
,0)">string_view</span><span style=3D"color:rgb(102,102,0)">&</span><sp=
an style=3D"color:rgb(0,0,0)"> tail</span><span style=3D"color:rgb(102=
,102,0)">,</span><span style=3D"color:rgb(0,0,0)"> <wbr>string_view st=
r</span><span style=3D"color:rgb(102,102,0)">,</span><span style=3D"color:r=
gb(0,0,0)"> </span><span style=3D"color:rgb(0,0,136)">int</span><span =
style=3D"color:rgb(0,0,0)"> </span><span style=3D"color:rgb(0,0,136)">=
base=3D0</span><span style=3D"color:rgb(102,102,0)">,</span><span style=3D"=
font-family:Arial,Helvetica,sans-serif;color:rgb(0,136,0)"> </span><sp=
an style=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(0,0,0)">prefix=
_set
prefixes_enabled</span><span style=3D"font-family:Arial,Hel=
vetica,sans-serif;color:rgb(102,102,0)">=3D</span><span style=3D"font-famil=
y:Arial,Helvetica,sans-serif;color:rgb(0,0,0)">all_prefixes</span><span sty=
le=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(102,102,0)">,</span>=
<span style=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(0,0,0)"><wb=
r> </span><span style=3D"font-family:Arial,Helvetica,sans-serif;color:=
rgb(0,0,136)">bool</span><span style=3D"font-family:Arial,Helvetica,sans-se=
rif;color:rgb(0,0,0)"> leading_plus_enabled </span><span style=3D=
"font-family:Arial,Helvetica,sans-serif;color:rgb(102,102,0)">=3D</span><sp=
an style=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(0,0,0)"> =
</span><span style=3D"font-family:Arial,Helvetica,sans-serif;color:rgb(0,0,=
136)">t<wbr>rue</span><font style=3D"font-family:Arial,Helvetica,sans-serif=
" color=3D"#000000"><span style=3D"color:rgb(102,102,0)">) {<br>
return parse(tail, str, base, no_digit_separators,
prefixes_enabled, leading_plus_enabled);<br>
</span></font>
<div><font color=3D"#000000"><span style=3D"color:#000">}&nbs=
p;</span></font></div>
</code></div>
<div><br>
</div>
</div>
</div>
</blockquote>
    Why 36 prefix bits? There are only 3 available choices: 0(x|X), 0,
0(b|B). <br>
<br>
I wouldn't add multiple digit separator support. You should know
what character you accept as digit separator. Remember these calls
are locale-independent to be fast and simple. You most likely know
the format of your input and I doubt it has mixed digit separators.<br>=
</div></blockquote><div><br>Yes, looking back at it, I agree with you. Multiple =
separators just add too much complexity.<br> </div><blockquote=
class=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8ex;border-left: 1=
px #ccc solid;padding-left: 1ex;"><div bgcolor=3D"#FFFFFF" text=3D"#000000"=
>
<br>
Ignoring the input/output arguments this should do the job<br>
... parse(..., int radix) // The default accepts '+' but no
separators<br>
... parse(..., int radix, no_plus_prefix_t)<br>
... parse(..., int radix, char separator)<br>
... parse(..., int radix, no_plus_prefix_t, char separator)<br>
<br>
One *could* squeeze the plus prefix option into the radix param
since its only valid values are in [0;36] and there are plenty of
bits left. Give it a custom type and you have yourself a pretty
fluent interface:<br>
<br>
... parse(..., parse_options_t); // parse_options_t is a concrete
type (no int or enum)<br>
<br>
parse(..., radix(10) | no_plus_prefix | digit_separator(','))<br>
<br>
<br>
</div>
</blockquote></div>
------=_Part_300_293737266.1434554411459--
------=_Part_299_1824107234.1434554411459--
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Wed, 17 Jun 2015 08:22:34 -0700 (PDT)
Raw View
On Wednesday, June 17, 2015 at 10:55:06 AM UTC-4, Matthew Woehlke wrote:
>
> On 2015-06-17 10:31, Miro Knejp wrote:
> > I would say that 0/1 should suffice for bool in the basic version.
> > Maybe go as far as add two overloads each accepting two chars or two
> > strings for the true/false case.
>
> Accepting regex's would be better; that way you can say that true is
> e.g. "1|y(es)?|t(rue)?" (case insensitive), rather than being limited to
> exactly one accepted string. (Having faster overloads that take char or
> string literals is fine too, but if we have those, I would also have a
> regex overload.)
>
I think that's an entirely different proposal. Adding regex support just for
the small corner case of bool bloats the scope of this too much.
.
Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Wed, 17 Jun 2015 11:38:41 -0400
Raw View
On 2015-06-17 11:22, Matthew Fioravante wrote:
> On Wednesday, June 17, 2015 at 10:55:06 AM UTC-4, Matthew Woehlke wrote:
>> On 2015-06-17 10:31, Miro Knejp wrote:
>>> I would say that 0/1 should suffice for bool in the basic version.
>>> Maybe go as far as add two overloads each accepting two chars or two
>>> strings for the true/false case.
>>
>> Accepting regex's would be better; that way you can say that true is
>> e.g. "1|y(es)?|t(rue)?" (case insensitive), rather than being limited to
>> exactly one accepted string. (Having faster overloads that take char or
>> string literals is fine too, but if we have those, I would also have a
>> regex overload.)
>
> I think thats an entirely different proposal. Adding regex support just for
> the small corner case of bool bloats the scope of this too much.
Fair enough. On that note, however, I'm inclined to feel the same way
about even having a bool parser. Do we *really* need that? It doesn't
seem nearly as valuable as a numeric parser.
--
Matthew
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Wed, 17 Jun 2015 18:06:40 +0200
Raw View
2015-06-17 17:20 GMT+02:00 Matthew Fioravante <fmatthew5876@gmail.com>:
>
>
> On Wednesday, June 17, 2015 at 7:25:11 AM UTC-4, Olaf van der Spek wrote:
>>
>> 2015-06-17 2:47 GMT+02:00 Matthew Fioravante <fmatth...@gmail.com>:
>> >> Prefixes aren't allowed when base != 0
>> >
>> >
>> > FYI they are allowed for strtol(). Also see my earlier example about
>> > prefixes again.
>>
>> I should have said when base isn't 0 or 16.
>
>
> Also 8 and 2.
Was there a proposal to accept 0b for strtol?
When base = 8, the leading 0 in 0777 doesn't have to be a prefix per se.
>>
>> When base is 10 prefixes aren't allowed (AFAIK).
>
>
> Yes but prefixes don't make sense here. There is no prefix for 10 or
> anything else not 2,8,16.
I know, hence my question to Miro what he needed. If he's only using
base = 10 then disallowing prefixes isn't an issue (for him).
>> Don't think this is a good idea. Accepting negative numbers for unsigned
>> types just seems very surprising.
>
>
> I agree with you here but I'm still mentioning it for completeness. Any
> proposal should probably discuss this rule and justify why it was deemed a
> bad idea.
Returning out-of-range for -1 when parsing an unsigned type seems like
a no-brainer.
> that default =10 is best. The 0 prefix for octal is unfortunate but its been
> around forever and known by the whole world so changing it is a non-starter.
IMO it should be deprecated (in C++) and excluded when base = 0.
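
For reference, the existing strtol/strtoul behaviour being debated here can be checked directly. This is only a sketch of what the C library already does, not a proposed interface; `parse_long` and `strtol_prefix_examples` are names made up for the illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdlib>

// Thin wrapper around strtol that also reports how many characters were consumed.
inline long parse_long(const char* s, int base, std::size_t& consumed) {
    char* end = nullptr;
    const long value = std::strtol(s, &end, base);
    consumed = static_cast<std::size_t>(end - s);
    return value;
}

inline void strtol_prefix_examples() {
    std::size_t n = 0;

    // With base 16, an optional "0x"/"0X" prefix is accepted and skipped.
    assert(parse_long("0x1f", 16, n) == 31 && n == 4);
    assert(parse_long("1f", 16, n) == 31 && n == 2);

    // With base 8, the leading 0 in "0777" is just another octal digit.
    assert(parse_long("0777", 8, n) == 511 && n == 4);

    // With base 10 there is no prefix: "0" is parsed and 'x' stops the parse.
    assert(parse_long("0x1f", 10, n) == 0 && n == 1);

    // strtoul accepts a minus sign and negates in the unsigned type,
    // so "-1" silently becomes ULONG_MAX rather than an error.
    assert(std::strtoul("-1", nullptr, 10) == ULONG_MAX);
}
```

The last assertion is the surprising case for unsigned types that the new functions would presumably reject.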
.
Author: Miro Knejp <miro.knejp@gmail.com>
Date: Thu, 18 Jun 2015 00:49:36 +0200
Raw View
Am 17.06.2015 um 18:06 schrieb Olaf van der Spek:
> 2015-06-17 17:20 GMT+02:00 Matthew Fioravante <fmatthew5876@gmail.com>:
>>
>> On Wednesday, June 17, 2015 at 7:25:11 AM UTC-4, Olaf van der Spek wrote:
>>> 2015-06-17 2:47 GMT+02:00 Matthew Fioravante <fmatth...@gmail.com>:
>>>>> Prefixes aren't allowed when base != 0
>>>>
>>>> FYI they are allowed for strtol(). Also see my earlier example about
>>>> prefixes again.
>>> I should have said when base isn't 0 or 16.
>>
>> Also 8 and 2.
> Was there a proposal to accept 0b for strtol?
> When base = 8, the leading 0 in 0777 doesn't have to be a prefix per se.
If an explicit base is specified any prefix parsing should be disabled.
>
>>> When base is 10 prefixes aren't allowed (AFAIK).
>>
>> Yes but prefixes don't make sense here. There is no prefix for 10 or
>> anything else not 2,8,16.
> I know, hence my question to Miro what he needed. If he's only using
> base = 10 then disallowing prefixes isn't an issue (for him).
That's totally fine. Matthew did a good job at summing up the essentials
versus the optionals.
>
>>> Don't think this is a good idea. Accepting negative numbers for unsigned
>>> types just seems very surprising.
>>
>> I agree with you here but I'm still mentioning it for completeness. Any
>> proposal should probably discuss this rule and justify why it was deemed a
>> bad idea.
> Returning out-of-range for -1 when parsing an unsigned type seems like
> a no-brainer.
I'd argue that in that case the parsing of unsigned types should fail on
the minus sign the same way it would for any other invalid character
without consuming any input.
>
>> that default =10 is best. The 0 prefix for octal is unfortunate but its been
>> around forever and known by the whole world so changing it is a non-starter.
> IMO it should be deprecated (in C++) and excluded when base = 0.
>
I agree but I also see that getting this through might be difficult. I
don't think "the whole world" consists only of C/C++ programmers, and
there are also *users* of programs who may not know anything about
programming, or that adding a leading zero completely changes how an
input is interpreted. Even if the user is tech savvy they might not know
the program is written in C++ with its own arcane special rules. In my
experience the leading zero is nothing but a trap to catch and torture
the unknowing.
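
The leading-zero trap is easy to reproduce with today's strtol when base is 0. A minimal sketch; `parse_auto` and `parse_decimal` are hypothetical helper names:

```cpp
#include <cassert>
#include <cstdlib>

// base == 0: strtol auto-detects the base from the prefix.
inline long parse_auto(const char* s) { return std::strtol(s, nullptr, 0); }

// base == 10: digits are always read as decimal.
inline long parse_decimal(const char* s) { return std::strtol(s, nullptr, 10); }

inline void leading_zero_examples() {
    assert(parse_decimal("010") == 10);  // what a non-programmer expects
    assert(parse_auto("010") == 8);      // leading 0 switches to octal
    assert(parse_auto("0x10") == 16);    // "0x" switches to hex
    assert(parse_auto("10") == 10);      // no prefix: decimal
    assert(parse_auto("09") == 0);       // '9' is not octal: only "0" is parsed
}
```

The "09" case is arguably the worst one: the input is neither rejected nor read as nine.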
.
Author: Thiago Macieira <thiago@macieira.org>
Date: Wed, 17 Jun 2015 17:31:25 -0700
Raw View
On Wednesday 17 June 2015 11:38:41 Matthew Woehlke wrote:
> On 2015-06-17 11:22, Matthew Fioravante wrote:
> > On Wednesday, June 17, 2015 at 10:55:06 AM UTC-4, Matthew Woehlke wrote:
> >> On 2015-06-17 10:31, Miro Knejp wrote:
> >>> I would say that 0/1 should suffice for bool in the basic version.
> >>> Maybe go as far as add two overloads each accepting two chars or two
> >>> strings for the true/false case.
> >>
> >> Accepting regex's would be better; that way you can say that true is
> >> e.g. "1|y(es)?|t(rue)?" (case insensitive), rather than being limited to
> >> exactly one accepted string. (Having faster overloads that take char or
> >> string literals is fine too, but if we have those, I would also have a
> >> regex overload.)
> >
> > I think thats an entirely different proposal. Adding regex support just
> > for
> > the small corner case of bool bloats the scope of this too much.
>
> Fair enough. On that note, however, I'm inclined to feel the same way
> about even having a bool parser. Do we *really* need that? It doesn't
> seem nearly as valuable as a numeric parser.
We should have a bool parser if and only if std::is_integral<bool>::value. It
should parse *numbers* from std::numeric_limits<bool>::min() to
std::numeric_limits<bool>::max().
I imagine that the front-end template interface would be something like:

// skip bikeshedding about input, output and error
extern std::expected<uint64_t, code>
parse_internal(const char *begin, const char *end,
               uint64_t zero, uint64_t max, int base);
extern std::expected<int64_t, code>
parse_internal(const char *begin, const char *end,
               int64_t min, int64_t max, int base);

template <typename T>
typename std::enable_if<std::is_integral<T>::value,
                        std::expected<T, code>>::type
parse_number(std::string_view str, int base)
{
    // Pick the 64-bit type with the same signedness as T; this selects
    // the matching parse_internal overload.
    typedef typename std::conditional<std::is_unsigned<T>::value,
                                      uint64_t, int64_t>::type Int64;
    Int64 min = std::numeric_limits<T>::min();
    Int64 max = std::numeric_limits<T>::max();
    // string_view iterators are not guaranteed to be const char*.
    return parse_internal(str.data(), str.data() + str.size(), min, max, base);
}
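
A compilable version of that idea might look like the sketch below. Everything beyond the shape of `parse_internal` and `parse_number` is an assumption made for the illustration: `parse_result` and `parse_error` stand in for `std::expected<T, code>`, the whole input must be digits (no partial consumption), digits are assumed to be ASCII, and base is assumed to be in [2, 36].

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string_view>
#include <type_traits>

enum class parse_error { none, invalid, out_of_range };

// Hypothetical stand-in for std::expected<T, code>.
template <typename T>
struct parse_result {
    T value;
    parse_error error;
};

// Value of an ASCII digit in the given base, or -1 if it is not a digit.
inline int digit_value(char c, int base) {
    int d = -1;
    if (c >= '0' && c <= '9') d = c - '0';
    else if (c >= 'a' && c <= 'z') d = c - 'a' + 10;
    else if (c >= 'A' && c <= 'Z') d = c - 'A' + 10;
    return d < base ? d : -1;
}

// Unsigned overload: digits only, result must not exceed max.
inline parse_result<std::uint64_t>
parse_internal(const char* begin, const char* end,
               std::uint64_t /*zero*/, std::uint64_t max, int base) {
    if (begin == end) return {0, parse_error::invalid};
    const std::uint64_t ubase = static_cast<std::uint64_t>(base);
    std::uint64_t v = 0;
    for (; begin != end; ++begin) {
        const int d = digit_value(*begin, base);
        if (d < 0) return {0, parse_error::invalid};
        const std::uint64_t ud = static_cast<std::uint64_t>(d);
        // Overflow-safe form of: v * base + d > max
        if (ud > max || v > (max - ud) / ubase) return {0, parse_error::out_of_range};
        v = v * ubase + ud;
    }
    return {v, parse_error::none};
}

// Signed overload: optional '-', then the magnitude via the unsigned overload.
inline parse_result<std::int64_t>
parse_internal(const char* begin, const char* end,
               std::int64_t min, std::int64_t max, int base) {
    const bool negative = begin != end && *begin == '-';
    if (negative) ++begin;
    // Largest magnitude allowed: |min| when negative, max otherwise.
    const std::uint64_t limit =
        negative ? std::uint64_t(0) - static_cast<std::uint64_t>(min)
                 : static_cast<std::uint64_t>(max);
    const auto r = parse_internal(begin, end, std::uint64_t(0), limit, base);
    if (r.error != parse_error::none) return {0, r.error};
    const std::uint64_t bits = negative ? std::uint64_t(0) - r.value : r.value;
    return {static_cast<std::int64_t>(bits), parse_error::none};
}

template <typename T>
typename std::enable_if<std::is_integral<T>::value, parse_result<T>>::type
parse_number(std::string_view str, int base) {
    using Int64 = typename std::conditional<std::is_unsigned<T>::value,
                                            std::uint64_t, std::int64_t>::type;
    const Int64 min = std::numeric_limits<T>::min();
    const Int64 max = std::numeric_limits<T>::max();
    const auto r = parse_internal(str.data(), str.data() + str.size(), min, max, base);
    return {static_cast<T>(r.value), r.error};
}
```

With this, `parse_number<bool>("1", 10)` succeeds and `parse_number<bool>("2", 10)` reports out_of_range, which matches the "parse numbers from min() to max()" rule above.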
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358
.
Author: Jens Maurer <Jens.Maurer@gmx.net>
Date: Thu, 18 Jun 2015 08:34:23 +0200
Raw View
On 06/18/2015 02:31 AM, Thiago Macieira wrote:
> On Wednesday 17 June 2015 11:38:41 Matthew Woehlke wrote:
>> On 2015-06-17 11:22, Matthew Fioravante wrote:
>>> On Wednesday, June 17, 2015 at 10:55:06 AM UTC-4, Matthew Woehlke wrote:
>>>> On 2015-06-17 10:31, Miro Knejp wrote:
>>>>> I would say that 0/1 should suffice for bool in the basic version.
>>>>> Maybe go as far as add two overloads each accepting two chars or two
>>>>> strings for the true/false case.
>>>>
>>>> Accepting regex's would be better; that way you can say that true is
>>>> e.g. "1|y(es)?|t(rue)?" (case insensitive), rather than being limited to
>>>> exactly one accepted string. (Having faster overloads that take char or
>>>> string literals is fine too, but if we have those, I would also have a
>>>> regex overload.)
>>>
>>> I think thats an entirely different proposal. Adding regex support just
>>> for
>>> the small corner case of bool bloats the scope of this too much.
>>
>> Fair enough. On that note, however, I'm inclined to feel the same way
>> about even having a bool parser. Do we *really* need that? It doesn't
>> seem nearly as valuable as a numeric parser.
>
> We should have a bool parser if and only if std::is_integral<bool>::value.
"bool" is an integral type.
> It
> should parse *numbers* from std::numeric_limits<bool>::min() to
> std::numeric_limits<bool>::max().
So, that means 0 and 1.
> I imagine that the front-end template interface would be something like:
>
> // skip bikeshedding about input, output and error
> extern std::expected<uint64_t, code>
> parse_internal(const char *begin, const char *end,
> uint64_t zero, uint64_t max, int base);
> extern std::expected<int64_t, code>
> parse_internal(const char *begin, const char *end,
> int64_t min, int64_t max, int base);
>
> template <typename T>
> typename std::enable_if<std::is_integral<T>::value,
> std::expected<T, code>>::type
> parse_number(std::string_view str, int base)
> {
> typedef typename std::conditional<std::is_unsigned<T>::value,
> uint64_t, int64_t>::type Int64;
> Int64 min = std::numeric_limits<T>::min();
> Int64 max = std::numeric_limits<T>::max();
> return parse_internal(str.begin(), str.end(), min, max, base);
> }
Signed vs. unsigned parsing is probably different enough
(minus sign) that those two shouldn't be handled by the
same function.
Jens
.
Author: Thiago Macieira <thiago@macieira.org>
Date: Wed, 17 Jun 2015 23:42:40 -0700
Raw View
On Thursday 18 June 2015 08:34:23 Jens Maurer wrote:
> So, that means 0 and 1.
Right. So no "yes" or "Yes" or "true" or localised names.
>
> > I imagine that the front-end template interface would be something like:
> >
> >
> > // skip bikeshedding about input, output and error
> > extern std::expected<uint64_t, code>
> > parse_internal(const char *begin, const char *end,
> > uint64_t zero, uint64_t max, int base);
> > extern std::expected<int64_t, code>
> > parse_internal(const char *begin, const char *end,
> > int64_t min, int64_t max, int base);
> >
> >
> >
> > template <typename T>
> > typename std::enable_if<std::is_integral<T>::value,
> > std::expected<T, code>>::type
> > parse_number(std::string_view str, int base)
> > {
> > typedef typename std::conditional<std::is_unsigned<T>::value,
> > uint64_t, int64_t>::type Int64;
> > Int64 min = std::numeric_limits<T>::min();
> > Int64 max = std::numeric_limits<T>::max();
> > return parse_internal(str.begin(), str.end(), min, max, base);
> > }
>
> Signed vs. unsigned parsing is probably different enough
> (minus sign) that those two shouldn't be handled by the
> same function.
They aren't. The typedef causes the selection of a different overload based on
whether it's signed or unsigned. Of course, that's internal detail: what
matters is whether parse_number<T> is documented/specified to handle the minus
sign or not for a given T.
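
The overload selection can be seen in isolation with a tiny sketch. `handles_minus` and `minus_allowed_for` are hypothetical names; the two overloads only report which path was taken.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

inline bool handles_minus(std::uint64_t) { return false; }  // unsigned path
inline bool handles_minus(std::int64_t)  { return true;  }  // signed path

// Maps T to the 64-bit type of the same signedness, then lets ordinary
// overload resolution pick the implementation.
template <typename T>
bool minus_allowed_for() {
    using Int64 = typename std::conditional<std::is_unsigned<T>::value,
                                            std::uint64_t, std::int64_t>::type;
    return handles_minus(Int64{});
}
```

Note that `std::is_unsigned<bool>::value` is true, so bool would take the unsigned path.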
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Thu, 18 Jun 2015 09:27:24 +0200
Raw View
2015-06-18 0:49 GMT+02:00 Miro Knejp <miro.knejp@gmail.com>:
>> Was there a proposal to accept 0b for strtol?
>> When base = 8, the leading 0 in 0777 doesn't have to be a prefix per se.
>
> If an explicit base is specified any prefix parsing should be disabled.
Sounds good but would be incompatible with strtol.
>> Returning out-of-range for -1 when parsing an unsigned type seems like
>> a no-brainer.
>
> I'd argue that in that case the parsing of unsigned types should fail on the
> minus sign the same way it would for any other invalid character without
> consuming any input.
Does it matter where it fails?
--
Olaf
.
Author: Matthew Woehlke <mw_triad@users.sourceforge.net>
Date: Thu, 18 Jun 2015 10:15:41 -0400
Raw View
On 2015-06-18 03:27, Olaf van der Spek wrote:
> 2015-06-18 0:49 GMT+02:00 Miro Knejp <miro.knejp@gmail.com>:
>>> Was there a proposal to accept 0b for strtol?
>>> When base = 8, the leading 0 in 0777 doesn't have to be a prefix per se.
>>
>> If an explicit base is specified any prefix parsing should be disabled.
>
> Sounds good but would be incompatible with strtol.
See my previous comment: I'm inclined to think that we should have at
least one parse function which (at least in the case of an explicit
base, if said flavor even allows base == 0) does the absolute minimum
necessary. This means not accepting any prefix except '-' in the signed
case. (Or possibly the function would not even do that, and only parse
unsigned numbers.) Everything else can be built on top of that without
much overhead, but the reverse (i.e. if you need a more restrictive
parser) is not true.
Something like this (semi-pythonic pseudocode):

  parse_relaxed<T>(input, base):
      if input[0] == '0':
          if len(input) == 1:
              return 0

          if base in {0, 2} and input[1] == 'b':
              result = parse_strict(input[2:], base=2)
          elif base in {0, 16} and input[1] == 'x':
              result = parse_strict(input[2:], base=16)
          elif base == 0:
              result = parse_strict(input[1:], base=8)
          else:
              result = parse_strict(input[1:], base=base)

      elif input[0] == '+':
          result = parse_strict(input[1:], base=10)

      elif input[0] == '-': # skip if T is unsigned
          result = parse_strict(input[1:], base=10)
          result < ((my_max / 2) + 1) or raise RangeError
          return -result

      result < my_max or raise RangeError # omit if T is unsigned
      return result

That may not be the exact correct flow (e.g. it would disallow "+0xf"...
which I would be inclined to do, but could be debatable), but should be
sufficient to illustrate that the relaxed parse can be implemented in
terms of the strict parse with little to no overhead (compared to what
would be required to implement it self contained).
Details of how to handle errors, accept input, and indicate characters
consumed are ignored in the above as they are not important to show the
logic flow. Any choices made in the above regarding the same should be
considered arbitrary; don't read anything into them.
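
As a rough C++ rendering of the same layering, the sketch below uses `std::from_chars` (standardized in C++17, after this thread) as the strict layer, since it accepts only digits plus '-' for signed types. The policy details are assumptions chosen for the illustration, not part of the pseudocode above: the whole input must be consumed, base 0 means "detect from prefix, else decimal", a sign cannot be combined with a prefix, and `parse_strict`, `parse_digits` and `parse_relaxed` are hypothetical names.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>
#include <type_traits>

// Strict layer: digits in the given base, '-' only for signed T,
// no '+', no prefixes, and the whole input must be consumed.
template <typename T>
std::optional<T> parse_strict(std::string_view in, int base) {
    T value{};
    const char* last = in.data() + in.size();
    const auto [ptr, ec] = std::from_chars(in.data(), last, value, base);
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    return value;
}

// Like parse_strict, but also rejects a sign (used after '+' or a prefix).
template <typename T>
std::optional<T> parse_digits(std::string_view in, int base) {
    if (in.empty() || in[0] == '-') return std::nullopt;
    return parse_strict<T>(in, base);
}

// Relaxed layer: adds '+', "0x"/"0b" prefixes and base detection on top.
template <typename T>
std::optional<T> parse_relaxed(std::string_view in, int base = 0) {
    static_assert(std::is_integral<T>::value, "integer types only");
    if (in.empty()) return std::nullopt;
    if (base != 0 && (base < 2 || base > 36)) return std::nullopt;

    if (in[0] == '0' && in.size() > 1) {
        const char p = in[1];
        if ((base == 0 || base == 2) && (p == 'b' || p == 'B'))
            return parse_digits<T>(in.substr(2), 2);
        if ((base == 0 || base == 16) && (p == 'x' || p == 'X'))
            return parse_digits<T>(in.substr(2), 16);
        if (base == 0)
            return parse_digits<T>(in.substr(1), 8);  // leading-0 octal
    }
    const int effective = base == 0 ? 10 : base;
    if (in[0] == '+')
        return parse_digits<T>(in.substr(1), effective);
    return parse_strict<T>(in, effective);
}
```

The relaxed function adds only a few comparisons before delegating, which is the point being made: the strict parser is the primitive, and everything else is a thin wrapper.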
>>> Returning out-of-range for -1 when parsing an unsigned type seems like
>>> a no-brainer.
>>
>> I'd argue that in that case the parsing of unsigned types should fail on the
>> minus sign the same way it would for any other invalid character without
>> consuming any input.
>
> Does it matter where it fails?
Yes; it should fail before the first unparsable character. For an input
like "-5", that means before the '-' which is equivalent to "without
consuming any input". IOW, the same as it would fail for an input like
".5" or "[5".
I expect Miro was not intending to refer to the case of input like
"42-7", which would be expected to consume the "42" and then stop, for
both the signed or unsigned flavors of the function.
--
Matthew
.
Author: Miro Knejp <miro.knejp@gmail.com>
Date: Thu, 18 Jun 2015 16:52:58 +0200
Raw View
Am 18.06.2015 um 09:27 schrieb Olaf van der Spek:
> 2015-06-18 0:49 GMT+02:00 Miro Knejp <miro.knejp@gmail.com>:
>>> Was there a proposal to accept 0b for strtol?
>>> When base = 8, the leading 0 in 0777 doesn't have to be a prefix per se.
>> If an explicit base is specified any prefix parsing should be disabled.
> Sounds good but would be incompatible with strtol.
I don't really care that much. Those functions aren't going away and are
very convenient to use. They have their own domain to exist in. The
semantics we are discussing here for the new functions are already
incompatible with the strtox family for good reasons. Whether base=16
should additionally swallow "0x" (and similarly base=2 and "0b") or not
should be an option because it is only desirable in C-like contexts.
Whether it's on by default I don't really care.
.
Author: Miro Knejp <miro.knejp@gmail.com>
Date: Thu, 18 Jun 2015 16:56:09 +0200
Raw View
Am 18.06.2015 um 16:15 schrieb Matthew Woehlke:
> On 2015-06-18 03:27, Olaf van der Spek wrote:
>> 2015-06-18 0:49 GMT+02:00 Miro Knejp <miro.knejp@gmail.com>:
>>>> Was there a proposal to accept 0b for strtol?
>>>> When base = 8, the leading 0 in 0777 doesn't have to be a prefix per se.
>>> If an explicit base is specified any prefix parsing should be disabled.
>> Sounds good but would be incompatible with strtol.
> See my previous comment: I'm inclined to think that we should have at
> least one parse function which (at least in the case of an explicit
> base, if said flavor even allows base == 0) does the absolute minimum
> necessary. This means not accepting any prefix except '-' in the signed
> case. (Or possibly the function would not even do that, and only parse
> unsigned numbers.) Everything else can be built on top of that without
> much overhead, but the reverse (i.e. if you need a more restrictive
> parser) is not true.
>
> Something like this (semi-pythonic pseudocode):
>
>   parse_relaxed<T>(input, base):
>       if input[0] == '0':
>           if len(input) == 1:
>               return 0
>
>           if base in {0, 2} and input[1] == 'b':
>               result = parse_strict(input[2:], base=2)
>           elif base in {0, 16} and input[1] == 'x':
>               result = parse_strict(input[2:], base=16)
>           elif base == 0:
>               result = parse_strict(input[1:], base=8)
>           else:
>               result = parse_strict(input[1:], base=base)
>
>       elif input[0] == '+':
>           result = parse_strict(input[1:], base=10)
>
>       elif input[0] == '-': # skip if T is unsigned
>           result = parse_strict(input[1:], base=10)
>           result < ((my_max / 2) + 1) or raise RangeError
>           return -result
>
>       result < my_max or raise RangeError # omit if T is unsigned
>       return result
>
> That may not be the exact correct flow (e.g. it would disallow "+0xf"...
> which I would be inclined to do, but could be debatable), but should be
> sufficient to illustrate that the relaxed parse can be implemented in
> terms of the strict parse with little to no overhead (compared to what
> would be required to implement it self contained).
>
> Details of how to handle errors, accept input, and indicate characters
> consumed are ignored in the above as they are not important to show the
> logic flow. Any choices made in the above regarding the same should be
> considered arbitrary; don't read anything into them.
Can you compress this into a regex? ;-)
>
>>>> Returning out-of-range for -1 when parsing an unsigned type seems like
>>>> a no-brainer.
>>> I'd argue that in that case the parsing of unsigned types should fail on the
>>> minus sign the same way it would for any other invalid character without
>>> consuming any input.
>> Does it matter where it fails?
> Yes; it should fail before the first unparsable character. For an input
> like "-5", that means before the '-' which is equivalent to "without
> consuming any input". IOW, the same as it would fail for an input like
> ".5" or "[5".
>
> I expect Miro was not intending to refer to the case of input like
> "42-7", which would be expected to consume the "42" and then stop, for
> both the signed or unsigned flavors of the function.
>
Correct. For unsigned types '-' should just be treated like any other
invalid character. So with radix=10 the input "42-7" should yield
exactly the same result as "42x7". Same goes for "-42" and "x42". Stop
at the first invalid character. If that is the first character the
entire input sequence is invalid.
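
This is, as it happens, how `std::from_chars` (added later, in C++17) behaves for unsigned types, so the rule can be sketched with it. `prefix_parse` and `parse_unsigned_prefix` are hypothetical names used only for the illustration.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

struct prefix_parse {
    unsigned value;        // parsed value, 0 if nothing was parsed
    std::size_t consumed;  // number of characters consumed
    bool ok;               // false if no digits were found or the value overflowed
};

// Parses the longest decimal prefix of `in` as an unsigned value.
inline prefix_parse parse_unsigned_prefix(std::string_view in) {
    unsigned v = 0;
    const auto r = std::from_chars(in.data(), in.data() + in.size(), v, 10);
    return {v, static_cast<std::size_t>(r.ptr - in.data()), r.ec == std::errc{}};
}

inline void minus_is_just_invalid() {
    // '-' in the middle stops the parse exactly like 'x' does.
    const prefix_parse a = parse_unsigned_prefix("42-7");
    const prefix_parse b = parse_unsigned_prefix("42x7");
    assert(a.ok && a.value == 42 && a.consumed == 2);
    assert(b.ok && b.value == 42 && b.consumed == 2);

    // '-' at the start fails without consuming anything, again like 'x'.
    const prefix_parse c = parse_unsigned_prefix("-42");
    const prefix_parse d = parse_unsigned_prefix("x42");
    assert(!c.ok && c.consumed == 0);
    assert(!d.ok && d.consumed == 0);
}
```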
.
Author: Jens Maurer <Jens.Maurer@gmx.net>
Date: Thu, 18 Jun 2015 23:09:45 +0200
Raw View
On 06/18/2015 09:27 AM, Olaf van der Spek wrote:
> 2015-06-18 0:49 GMT+02:00 Miro Knejp <miro.knejp@gmail.com>:
>>> Was there a proposal to accept 0b for strtol?
>>> When base = 8, the leading 0 in 0777 doesn't have to be a prefix per se.
>>
>> If an explicit base is specified any prefix parsing should be disabled.
>
> Sounds good but would be incompatible with strtol.
Great. Compatibility with strtol (which is locale-dependent anyway)
doesn't seem like a goal per se.
>>> Returning out-of-range for -1 when parsing an unsigned type seems like
>>> a no-brainer.
>>
>> I'd argue that in that case the parsing of unsigned types should fail on the
>> minus sign the same way it would for any other invalid character without
>> consuming any input.
>
> Does it matter where it fails?
Yes: It's the difference between "no characters consumed" vs.
"minus sign plus digits fully consumed, and then I discovered
the value is wrong".
Jens
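
The two failure modes can be told apart by how much input was consumed. The sketch below uses `std::from_chars` (C++17, so later than this thread) purely to illustrate the distinction; `status`, `outcome` and `parse_int` are hypothetical names.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

enum class status { ok, invalid, out_of_range };

struct outcome {
    status st;
    std::size_t consumed;  // characters consumed from the front of the input
};

// Parses a decimal int from the front of `in`; `value` is written only on success.
inline outcome parse_int(std::string_view in, int& value) {
    const auto r = std::from_chars(in.data(), in.data() + in.size(), value, 10);
    const status st = r.ec == std::errc{}                      ? status::ok
                    : r.ec == std::errc::result_out_of_range   ? status::out_of_range
                                                               : status::invalid;
    return {st, static_cast<std::size_t>(r.ptr - in.data())};
}

inline void failure_position_examples() {
    int v = 0;

    // Not a number at all: nothing is consumed.
    const outcome a = parse_int("abc", v);
    assert(a.st == status::invalid && a.consumed == 0);

    // Well-formed but too large: all 20 digits are consumed, then the error is reported.
    const outcome b = parse_int("99999999999999999999,", v);
    assert(b.st == status::out_of_range && b.consumed == 20);

    // Success: consumption stops at the first non-digit.
    const outcome c = parse_int("123,", v);
    assert(c.st == status::ok && c.consumed == 3 && v == 123);
}
```

A caller that tokenizes comma-separated input can therefore skip past an overflowing number and keep going, which is not possible if overflow reports zero characters consumed.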
.
Author: Thiago Macieira <thiago@macieira.org>
Date: Thu, 18 Jun 2015 15:12:48 -0700
Raw View
On Thursday 18 June 2015 23:09:45 Jens Maurer wrote:
> On 06/18/2015 09:27 AM, Olaf van der Spek wrote:
> > 2015-06-18 0:49 GMT+02:00 Miro Knejp <miro.knejp@gmail.com>:
> >>> Was there a proposal to accept 0b for strtol?
> >>> When base = 8, the leading 0 in 0777 doesn't have to be a prefix per se.
> >>
> >> If an explicit base is specified any prefix parsing should be disabled.
> >
> > Sounds good but would be incompatible with strtol.
>
> Great. Compatibility with strtol (which is locale-dependent anyway)
> doesn't seem like a goal per se.
Compatibility with strtol_l might be.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Fri, 19 Jun 2015 00:23:35 +0200
2015-06-18 23:09 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>>>> Returning out-of-range for -1 when parsing an unsigned type seems like
>>>> a no-brainer.
>>>
>>> I'd argue that in that case the parsing of unsigned types should fail on the
>>> minus sign the same way it would for any other invalid character without
>>> consuming any input.
>>
>> Does it matter where it fails?
>
> Yes: It's the difference between "no characters consumed" vs.
> "minus sign plus digits fully consumed, and then I discovered
> the value is wrong".
Weren't we going to 'return' first on failure, i.e. basically not
consume anything?
--
Olaf
.
Author: Jens Maurer <Jens.Maurer@gmx.net>
Date: Fri, 19 Jun 2015 08:51:31 +0200
On 06/19/2015 12:23 AM, Olaf van der Spek wrote:
> 2015-06-18 23:09 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>>>>> Returning out-of-range for -1 when parsing an unsigned type seems like
>>>>> a no-brainer.
>>>>
>>>> I'd argue that in that case the parsing of unsigned types should fail on the
>>>> minus sign the same way it would for any other invalid character without
>>>> consuming any input.
>>>
>>> Does it matter where it fails?
>>
>> Yes: It's the difference between "no characters consumed" vs.
>> "minus sign plus digits fully consumed, and then I discovered
>> the value is wrong".
>
> Weren't we going to 'return' first on failure? Basically not consuming
> anything..
I think it doesn't make sense to return "first" for overflow,
i.e. a syntactically well-formed number that just happens not
to fit (the client will want to consume the digits anyway, probably).
This seems fundamentally different from parsing a string that isn't
a syntactic number to start with.
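The two failure modes can be made concrete with std::from_chars, which was standardized later (C++17, <charconv>) and is not the interface proposed in this thread. It is used here only because it happens to behave this way for unsigned types. Outcome and try_parse_u16 are hypothetical names for this sketch.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <system_error>

// Hypothetical helper type for this illustration only.
struct Outcome {
    std::errc ec;          // std::errc{} on success
    std::size_t consumed;  // characters the parser moved past
};

// Parses an unsigned 16-bit value and reports how much input was consumed.
inline Outcome try_parse_u16(std::string_view s) {
    std::uint16_t value = 0;
    const char* first = s.data();
    const char* last = first + s.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    return {ec, static_cast<std::size_t>(ptr - first)};
}
```

For "-1" this reports std::errc::invalid_argument with nothing consumed, because the minus sign is not part of the accepted pattern for an unsigned type. For "70000abc" it reports std::errc::result_out_of_range with the five digits consumed, so the caller can continue after the number.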
Jens
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Fri, 19 Jun 2015 13:44:23 +0200
2015-06-19 8:51 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
> On 06/19/2015 12:23 AM, Olaf van der Spek wrote:
>> 2015-06-18 23:09 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
>>>>>> Returning out-of-range for -1 when parsing an unsigned type seems like
>>>>>> a no-brainer.
>>>>>
>>>>> I'd argue that in that case the parsing of unsigned types should fail on the
>>>>> minus sign the same way it would for any other invalid character without
>>>>> consuming any input.
>>>>
>>>> Does it matter where it fails?
>>>
>>> Yes: It's the difference between "no characters consumed" vs.
>>> "minus sign plus digits fully consumed, and then I discovered
>>> the value is wrong".
>>
>> Weren't we going to 'return' first on failure? Basically not consuming
>> anything..
>
> I think it doesn't make sense to return "first" for overflow,
> i.e. a syntactically well-formed number that just happens not
> to fit (the client will want to consume the digits anyway, probably).
>
> This seems fundamentally different from parsing a string that isn't
> a syntactic number to start with.
Then input is sometimes consumed and sometimes not, depending on
which error occurred. Couldn't that be problematic?
--
Olaf
.
Author: John Bytheway <jbytheway@gmail.com>
Date: Mon, 22 Jun 2015 13:18:24 -0400
On 2015-06-17 12:06, Olaf van der Spek wrote:
> 2015-06-17 17:20 GMT+02:00 Matthew Fioravante <fmatthew5876@gmail.com>:
>> that default =10 is best. The 0 prefix for octal is unfortunate but it's been
>> around forever and known by the whole world so changing it is a non-starter.
>
> IMO it should be deprecated (in C++) and excluded when base = 0.
FWIW, Python 3 decided to abandon the '0' prefix for octal in favour of
'0o', which I much prefer.
John Bytheway
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Tue, 23 Jun 2015 11:11:10 +0200
2015-06-22 19:18 GMT+02:00 John Bytheway <jbytheway@gmail.com>:
> On 2015-06-17 12:06, Olaf van der Spek wrote:
>> 2015-06-17 17:20 GMT+02:00 Matthew Fioravante <fmatthew5876@gmail.com>:
>>> that default =10 is best. The 0 prefix for octal is unfortunate but it's been
>>> around forever and known by the whole world so changing it is a non-starter.
>>
>> IMO it should be deprecated (in C++) and excluded when base = 0.
>
> FWIW, Python 3 decided to abandon the '0' prefix for octal in favour of
> '0o', which I much prefer.
In source code, in parsing functions, or both?
Got a link?
--
Olaf
.
Author: John Bytheway <jbytheway@gmail.com>
Date: Tue, 23 Jun 2015 09:05:48 -0400
On 2015-06-23 05:11, Olaf van der Spek wrote:
> 2015-06-22 19:18 GMT+02:00 John Bytheway <jbytheway@gmail.com>:
>> On 2015-06-17 12:06, Olaf van der Spek wrote:
>>> 2015-06-17 17:20 GMT+02:00 Matthew Fioravante <fmatthew5876@gmail.com>:
>>>> that default =10 is best. The 0 prefix for octal is unfortunate but it's been
>>>> around forever and known by the whole world so changing it is a non-starter.
>>>
>>> IMO it should be deprecated (in C++) and excluded when base = 0.
>>
>> FWIW, Python 3 decided to abandon the '0' prefix for octal in favour of
>> '0o', which I much prefer.
>
> In source code, parsing functions or both?
Both.
> Got a link?
https://www.python.org/dev/peps/pep-3127/ describes the changes and
their rationale.
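As a rough sketch of what such a rule could look like for base = 0 detection in C++: Prefix and detect_prefix are hypothetical names, not part of any proposal in this thread. One assumption differs from Python: the sketch treats a bare leading 0 as plain decimal, whereas Python 3 rejects such a literal outright.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper type for this illustration only.
struct Prefix {
    int base;            // detected base
    std::size_t length;  // number of prefix characters to skip
};

// Assumed rule set: "0x" -> 16, "0o" -> 8, "0b" -> 2, anything else -> 10.
// A bare leading '0' is NOT treated as an octal marker.
inline Prefix detect_prefix(std::string_view s) {
    if (s.size() >= 2 && s[0] == '0') {
        switch (s[1]) {
            case 'x': case 'X': return {16, 2};
            case 'o': case 'O': return {8, 2};
            case 'b': case 'B': return {2, 2};
            default: break;
        }
    }
    return {10, 0};
}
```

Under this rule "0o777" is detected as octal with a two-character prefix, while "0777" stays decimal with no prefix.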
John
.