Topic: Use cases for extended string_view types
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Thu, 14 May 2015 18:32:38 -0700 (PDT)
I've found some cases where having a library of string_view types allows
efficient text processing, parsing, and storage in a strongly type-safe
manner. The potential downside is a large number of string types.
mstring_view: a mutable string_view
array<mstring_view,kMaxSplits+1> split_store;
auto p = parser(filename);
array_view<mstring_view> s = p.splitNextLine(split_store,',');
In the above example, p reads the next line of the file and splits it on
',' into split_store, returning a view of the first
std::min(split_store.size(), number of splits + 1) elements. parser is
implemented efficiently, so that the resulting views point directly into the
underlying internal I/O buffer for the file. This approach limits the split
algorithm to a single memory allocation (for the internal file buffer), zero
data copies, and one parsing pass, which can be optimized with SIMD
instructions. This level of efficiency is not possible with the API of
fgets() and/or std::getline().
Working with regular string_view is fine, but one can achieve additional
optimization if one can write into the resulting buffer to transform the
text further without making copies of the data, which would likely require
memory allocations. The data is just sitting in the internal file buffer
unused during this time, so there is no reason not to allow writes to it.
zstring_view: a string_view (O(1) length()) which is guaranteed to be null
terminated
zmstring_view: an mstring_view (O(1) length()) which is guaranteed to be
null terminated
Unfortunately, I think legacy C APIs are here to stay for a long time,
especially operating system APIs like POSIX. There are some cases when we
know the underlying string data is null terminated, and we can use this fact
to interact easily and efficiently with C APIs. zstring_view is more
limited in that there is no substr() operation.
// We throw away the null termination invariant, even though it is
// guaranteed to exist.
constexpr string_view foo = "foo"sv;
// We retain the null termination invariant and can take advantage of it.
constexpr zstring_view bar = "bar"zsv;
constexpr zstring_view kLibraryPath = "./path/libfoo.so"zsv;
// dlopen() also requires a flags argument, e.g. RTLD_NOW.
void *dl = dlopen(kLibraryPath.c_str(), RTLD_NOW);
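
One way such a type could preserve the invariant is to accept only sources that are known to be null terminated. The class below is a hypothetical sketch under that assumption, not a proposed interface; in particular, the array constructor would also accept a partially filled char buffer, which a real design would need to address.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical sketch of zstring_view: a pointer plus a cached length, where
// data_[size_] is always '\0'.
class zstring_view {
public:
    // A string literal of N chars includes its terminator, so the length is N - 1.
    template <std::size_t N>
    constexpr zstring_view(const char (&literal)[N]) noexcept
        : data_(literal), size_(N - 1) {}

    // std::string::c_str() is always null terminated.
    zstring_view(const std::string& s) noexcept
        : data_(s.c_str()), size_(s.size()) {}

    // Safe to hand to C APIs that expect a null-terminated string.
    constexpr const char* c_str() const noexcept { return data_; }

    // O(1), unlike strlen().
    constexpr std::size_t size() const noexcept { return size_; }

    // Dropping the invariant is always safe, so this conversion is implicit.
    constexpr operator std::string_view() const noexcept {
        return std::string_view(data_, size_);
    }

private:
    const char* data_;
    std::size_t size_;
};
```

There is deliberately no constructor taking a plain string_view, because an arbitrary view carries no guarantee about the character following its last element.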
This can also be used with my earlier example.
array<zmstring_view,kMaxSplits+1> split_store;
array_view<zmstring_view> s = p.splitNextLine(split_store,',');
The parsing algorithm here can replace all instances of ',' with '\0' and
thus give us null terminated strings we can directly pass to C APIs.
zstring_ptr: a thin type wrapper around const char* (zstring_ptr::strlen()
for length)
zmstring_ptr: a thin type wrapper around char* (zmstring_ptr::strlen() for
length)
This one has more uses than just C APIs.
Storing a string as a single null-terminated pointer is more compact than
storing a pointer and a length. If size is important, this can cut the
memory usage of a data structure storing string_views in half. We can
currently achieve this with const char* and char*, but those types are
conflated with ordinary pointers, which causes problems and ambiguities.
HashMap<const char*,IntId>; // Hashes the address
HashMap<zstring_ptr,IntId>; // Hashes the string value, using only sizeof(char*) bytes for each key.
bool operator==(const char*, const char*); // Compares addresses, resulting in many surprises for novices
bool operator==(zstring_ptr l, zstring_ptr r); // Compares values using strcmp()
bool operator==(zstring_ptr l, string_view r); // Compares values
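
As a rough sketch of how this could work with the standard containers, the wrapper below gives a const char* value semantics for hashing and equality. The class layout and the choice to hash through std::hash<std::string_view> are assumptions made for illustration, with std::unordered_map standing in for HashMap.

```cpp
#include <cstddef>
#include <cstring>
#include <functional>
#include <string_view>
#include <unordered_map>

// Hypothetical sketch of zstring_ptr: exactly one pointer wide, but compared
// and hashed by the characters it points to rather than by its address.
class zstring_ptr {
public:
    constexpr explicit zstring_ptr(const char* p) noexcept : ptr_(p) {}

    constexpr const char* c_str() const noexcept { return ptr_; }

    // O(n): the length is not stored, which is the price of the smaller size.
    std::size_t strlen() const noexcept { return std::strlen(ptr_); }

    friend bool operator==(zstring_ptr l, zstring_ptr r) noexcept {
        return std::strcmp(l.ptr_, r.ptr_) == 0;
    }
    friend bool operator==(zstring_ptr l, std::string_view r) noexcept {
        return std::string_view(l.ptr_) == r;
    }

private:
    const char* ptr_;
};

static_assert(sizeof(zstring_ptr) == sizeof(const char*),
              "the wrapper adds no storage overhead");

namespace std {
// Hash the string value, not the address.
template <>
struct hash<zstring_ptr> {
    size_t operator()(zstring_ptr s) const noexcept {
        return hash<string_view>{}(string_view(s.c_str()));
    }
};
}  // namespace std
```

With these definitions, a std::unordered_map<zstring_ptr, int> finds a key through any pointer to an equal string, even when the two pointers refer to different buffers.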
Would you use these types in your projects?
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.
Author: "'Jeffrey Yasskin' via ISO C++ Standard - Future Proposals" <std-proposals@isocpp.org>
Date: Thu, 14 May 2015 18:38:41 -0700
On Thu, May 14, 2015 at 6:32 PM, Matthew Fioravante <fmatthew5876@gmail.com>
wrote:
> Working with regular string_view is fine, but one can achieve additional
> optimization if one can write into the resulting buffer to transform the
> text further without making copies of the data which will likely require
> memory allocations. The data is just sitting in the internal file buffer
> unused during this time, so there is no reason not to allow writes to it.
>
Can you give some examples of these transformations? I see the example of
replacing delimiters with '\0' to allow use of functions that expect
null-terminated strings. What else do you use it for? Do you have a link to
any code that does this?
Thanks,
Jeffrey
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Thu, 14 May 2015 19:08:51 -0700 (PDT)
On Thursday, May 14, 2015 at 9:39:23 PM UTC-4, Jeffrey Yasskin wrote:
>
> On Thu, May 14, 2015 at 6:32 PM, Matthew Fioravante <fmatth...@gmail.com>
> wrote:
>
>> Working with regular string_view is fine, but one can achieve additional
>> optimization if one can write into the resulting buffer to transform the
>> text further without making copies of the data which will likely require
>> memory allocations. The data is just sitting in the internal file buffer
>> unused during this time, so there is no reason not to allow writes to it.
>>
>
> Can you give some examples of these transformations?
>
One of the most common transformations is to_upper() or to_lower() if you
want to do case-insensitive comparison. If you need to compare the data
many times, you can transform it in place once, and the compiler can then use
optimal bulk SIMD / streaming instructions to compare the strings.
Another is that right now we still don't have string_view-compatible
versions of strtod() and friends. You can use zstring_view and strtod() to
quickly parse comma-separated numbers.
> Do you have a link to any code that does this?
>
I don't have links to any public code which does this kind of thing, but
I've used it before with considerable speedups over the conventional
getline() solution. I'm not sure if any major open source projects are
using these techniques for their parsing code.
the resulting view points directly into the underlying internal I/O buffer =
to the file. This approach limits the split algorithm to only 1 memory allo=
cation for the internal file buffer, zero data copies, 1 parsing pass which=
can be optimized with simd instructions. This level of efficiency is not p=
ossible with the API of fgets() and/or std::getline().</div><div><br></div>=
<div>Working with regular string_view is fine, but one can achieve addition=
al optimization if one can write into the resulting buffer to transform the=
text further without making copies of the data which will likely require m=
emory allocations. The data is just sitting in the internal file buffer unu=
sed during this time, so there is no reason not to allow writes to it.</div=
></div></blockquote><div><br></div><div>Can you give some examples of these=
transformations?</div></div></div></div></blockquote><div><br></div><div>O=
ne of the most common transformations is to_upper() or to_lower() if you wa=
nt to do case insensitive comparison. If you need to compare the data many =
times, you can transform it in place once and the compiler can use the opti=
mal bulk simd / streaming instructions to compare the strings.</div><div><b=
r></div><div>Another is that right now we still don't have a string_view co=
mpatible strtod() and friends. You can use zstring_view and strtod() to qui=
ckly parse comma separated numbers.</div><div><br></div><blockquote class=
=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8ex;border-left: 1px #cc=
c solid;padding-left: 1ex;"><div dir=3D"ltr"><div><div class=3D"gmail_quote=
"><div>Do you have a link to any code that does this?</div></div></div></di=
v></blockquote><div><br></div><div>I don't have links to any public code wh=
ich does this kind of thing but I've used it before with considerable speed=
ups over the conventional getline() solution. I'm not sure if any major op=
en source projects are using these techniques for their parsing code.</div>=
</div>
<p></p>
-- <br />
<br />
--- <br />
You received this message because you are subscribed to the Google Groups &=
quot;ISO C++ Standard - Future Proposals" group.<br />
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to <a href=3D"mailto:std-proposals+unsubscribe@isocpp.org">std-proposa=
ls+unsubscribe@isocpp.org</a>.<br />
To post to this group, send email to <a href=3D"mailto:std-proposals@isocpp=
..org">std-proposals@isocpp.org</a>.<br />
Visit this group at <a href=3D"http://groups.google.com/a/isocpp.org/group/=
std-proposals/">http://groups.google.com/a/isocpp.org/group/std-proposals/<=
/a>.<br />
------=_Part_246_404606335.1431655731053--
------=_Part_245_944523337.1431655731053--
.
Author: Nicol Bolas <jmckesson@gmail.com>
Date: Thu, 14 May 2015 19:45:45 -0700 (PDT)
Raw View
------=_Part_175_84258680.1431657945903
Content-Type: multipart/alternative;
boundary="----=_Part_176_943911387.1431657945904"
------=_Part_176_943911387.1431657945904
Content-Type: text/plain; charset=UTF-8
On Thursday, May 14, 2015 at 9:32:38 PM UTC-4, Matthew Fioravante wrote:
>
> I've found some cases where having a library of string_view types allow
> efficient text processing, parsing, and storage in a strong type-safe
> manner. The potential downside is a large number of string types.
>
> mstring_view: a mutable string_view
>
>
> array<mstring_view,kMaxSplits+1> split_store;
>
> auto p = parser(filename);
> array_view<mstring_view> s = p.splitNextLine(split_store,',');
>
> In the above example, p reads the next line of the file, and splits it
> using ',' into split_store, returning a view of the first
> std::min(split_store.size(), number of splits + 1) elements. parser is implemented
> efficiently, so that the resulting view points directly into the underlying
> internal I/O buffer to the file. This approach limits the split algorithm
> to only 1 memory allocation for the internal file buffer, zero data copies,
> 1 parsing pass which can be optimized with simd instructions. This level of
> efficiency is not possible with the API of fgets() and/or std::getline().
>
> Working with regular string_view is fine, but one can achieve additional
> optimization if one can write into the resulting buffer to transform the
> text further without making copies of the data which will likely require
> memory allocations. The data is just sitting in the internal file buffer
> unused during this time, so there is no reason not to allow writes to it.
>
As a bikeshed point, if it's a "view", then it's not mutable. That's why
the word "view" was chosen; you can look, but not touch.
Furthermore, I see no need for a class specifically for this. It seems to
me that the range proposal will cover this circumstance well enough. What
you have is a range of iterators, where the iterators are `const char*`. The
reason that `string_view` exists separately from that proposal is that
it's an exceptionally common case, one that has certain special needs.
If you're dealing with parsing a mutable character buffer, then you're
probably dealing with iterators already (or something close enough). So
ranges would only be par for the course.
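As a rough illustration of the iterator-pair approach (a sketch only;
ascii_lower_in_place is a hypothetical name), a mutable buffer can be
transformed through plain `char*` iterators without a dedicated view class.
The mapping is deliberately ASCII-only:

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>

// Hypothetical helper: lowercases ASCII letters in [first, last) in place.
// Works with any mutable forward iterator over char, including plain char*.
template <class It>
void ascii_lower_in_place(It first, It last) {
    std::transform(first, last, first, [](char c) {
        return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
    });
}
```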
> zstring_view: a string_view (O(1) length()) which is guaranteed to be null
> terminated
> zmstring-view: an mstring_view (O(1) length()) which is guaranteed to be
> null terminated
>
Well, there are all the issues I pointed out on the previous `zstring_view`
thread, which are not addressed here.
> Unfortunately, I think legacy C API's are here to stay for a long time.
> Especially operating system API's like posix. There are some cases when we
> know the underlying string data is null terminated and we can use this fact
> to interact easily and efficiently with C API's. zstring_view is more
> limited in that there is no substr() operation.
>
> //We throw away the null termination invariant, even though it is
> guaranteed to exist.
> constexpr string_view foo = "foo"sv;
> //We retain the null termination invariant and can take advantage of it.
> constexpr zstring_view bar = "bar"zsv;
>
>
> constexpr zstring_view kLibraryPath = "./path/libfoo.so"zsv;
> void *dl = dlopen(kLibraryPath.c_str());
>
>
While this does show a more useful use case than the previous thread, it's
a use case that really only matters when dealing with string literals. And
while there are quite a few string literals in code, there are many more
strings (particularly in internationalized codebases) that aren't
literals. Those are going to be stored in a std::string (already
null-terminated) or some other similar object that can also already be
null-terminated.
> zstring_ptr: a thin type wrapper around const char* (zstring_ptr::strlen()
> for length)
> zmstring_ptr: a thin type wrapper around char* (zstring_ptr::strlen() for
> length)
>
> This one has more uses than just a C API
> Storing a string as a single null terminated pointer is more compact than
> storing a pointer and a length. If size is important, this can cut the
> memory usage of a data structure storing string_views in half.
>
The only way that would be true is if the *only* thing in that data
structure was a `string_view` (or array thereof). And in that case, I'd
much rather we were clear that it's just storing `const char*`s by making
the data structure just store `const char*`s. At least then we'd know what
we're dealing with.
Remember: the standard should not require a specific implementation. It
shouldn't say things like the size of a type must be no greater than the
size of a `char*`. And without that guarantee in the standard, we can't
assume implementations will provide it.
Lastly, this really isn't what the standard library is for. If a particular
user needs such a thing, then let them write it. But the standard
library doesn't have to cover everyone's usage scenarios.
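For reference, a user-written version of such a wrapper is short. The
following is only a sketch: the class borrows the zstring_ptr name from the
quoted proposal, but its members are assumptions. The static_assert checks
the size claim for one particular implementation, which is exactly the kind
of guarantee the standard itself would not make:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical user-written type; not part of any standard library.
class zstring_ptr {
public:
    explicit zstring_ptr(const char* s) : ptr_(s) { assert(s != nullptr); }
    const char* c_str() const { return ptr_; }
    // O(n): the length is recomputed on every call, unlike string_view.
    std::size_t length() const { return std::strlen(ptr_); }
private:
    const char* ptr_;
};

// Holds on common implementations, but is not guaranteed by the standard.
static_assert(sizeof(zstring_ptr) == sizeof(const char*),
              "zstring_ptr is expected to be pointer-sized here");
```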
------=_Part_176_943911387.1431657945904--
------=_Part_175_84258680.1431657945903--
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Thu, 14 May 2015 19:51:01 -0700 (PDT)
Raw View
------=_Part_40_1876751404.1431658261195
Content-Type: multipart/alternative;
boundary="----=_Part_41_321039892.1431658261195"
------=_Part_41_321039892.1431658261195
Content-Type: text/plain; charset=UTF-8
On Thursday, May 14, 2015 at 10:45:45 PM UTC-4, Nicol Bolas wrote:
>
> On Thursday, May 14, 2015 at 9:32:38 PM UTC-4, Matthew Fioravante wrote:
>>
>> I've found some cases where having a library of string_view types allow
>> efficient text processing, parsing, and storage in a strong type-safe
>> manner. The potential downside is a large number of string types.
>>
>> mstring_view: a mutable string_view
>>
>>
>> array<mstring_view,kMaxSplits+1> split_store;
>>
>> auto p = parser(filename);
>> array_view<mstring_view> s = p.splitNextLine(split_store,',');
>>
>> In the above example, p reads the next line of the file, and splits it
>> using ',' into split_store, returning a view of the first
>> std::min(split_store.size(), number of splits + 1) elements. parser is implemented
>> efficiently, so that the resulting view points directly into the underlying
>> internal I/O buffer to the file. This approach limits the split algorithm
>> to only 1 memory allocation for the internal file buffer, zero data copies,
>> 1 parsing pass which can be optimized with simd instructions. This level of
>> efficiency is not possible with the API of fgets() and/or std::getline().
>>
>> Working with regular string_view is fine, but one can achieve additional
>> optimization if one can write into the resulting buffer to transform the
>> text further without making copies of the data which will likely require
>> memory allocations. The data is just sitting in the internal file buffer
>> unused during this time, so there is no reason not to allow writes to it.
>>
>
> As a bikeshed point, if it's a "view", then it's not mutable. That's why
> the word "view" was chosen; you can look, but not touch.
>
I disagree; an array_view, for example, is mutable by default. string_view
is immutable by default because that's the most useful default use case. A
view is a view over the data, whether or not it allows writes.
This is similar to, but reversed from, array_view, where we have
array_view<T> for a read-or-write view over the data and carray_view<T>
(array_view<const T>) for a read-only view of the data.
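A small sketch of that distinction, using a hypothetical simple_view
template rather than the proposed array_view (whose real interface is much
richer). Mutability follows from the element type, not from the word "view":

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal view type, for illustration only.
template <class T>
class simple_view {
public:
    simple_view(T* data, std::size_t size) : data_(data), size_(size) {}
    // Returns T&, so writes compile only when T is non-const.
    T& operator[](std::size_t i) const { assert(i < size_); return data_[i]; }
    std::size_t size() const { return size_; }
private:
    T* data_;
    std::size_t size_;
};

using mutable_chars = simple_view<char>;        // read/write, like array_view<T>
using const_chars   = simple_view<const char>;  // read-only, like array_view<const T>
```

With these aliases, `m[0] = 'X'` compiles for a mutable_chars `m`, while the
same assignment through a const_chars is rejected at compile time.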
------=_Part_41_321039892.1431658261195--
------=_Part_40_1876751404.1431658261195--
.
Author: Nicol Bolas <jmckesson@gmail.com>
Date: Thu, 14 May 2015 19:54:01 -0700 (PDT)
Raw View
------=_Part_180_1648309180.1431658441320
Content-Type: multipart/alternative;
boundary="----=_Part_181_593341960.1431658441320"
------=_Part_181_593341960.1431658441320
Content-Type: text/plain; charset=UTF-8
On Thursday, May 14, 2015 at 10:08:51 PM UTC-4, Matthew Fioravante wrote:
> Another is that right now we still don't have a string_view compatible
> strtod() and friends. You can use zstring_view and strtod() to quickly
> parse comma separated numbers.
>
Or you could use std::stod
<http://en.cppreference.com/w/cpp/string/basic_string/stof>, once those
functions are updated to accept std::string_view. Maybe that would be a
good proposal to make: get other functions to take std::string_view as
well as std::string.
It makes more sense to make C++ proposals that help C++ be more compatible
with itself than ones that help C++ be more compatible with C.
------=_Part_181_593341960.1431658441320--
------=_Part_180_1648309180.1431658441320--
.
Author: "'Jeffrey Yasskin' via ISO C++ Standard - Future Proposals" <std-proposals@isocpp.org>
Date: Fri, 15 May 2015 00:10:35 -0700
Raw View
--001a11347f20a381420516198b7a
Content-Type: text/plain; charset=UTF-8
On Thu, May 14, 2015 at 7:08 PM, Matthew Fioravante <fmatthew5876@gmail.com>
wrote:
>
>
> On Thursday, May 14, 2015 at 9:39:23 PM UTC-4, Jeffrey Yasskin wrote:
>>
>> On Thu, May 14, 2015 at 6:32 PM, Matthew Fioravante <fmatth...@gmail.com>
>> wrote:
>>
>>> I've found some cases where having a library of string_view types allow
>>> efficient text processing, parsing, and storage in a strong type-safe
>>> manner. The potential downside is a large number of string types.
>>>
>>> mstring_view: a mutable string_view
>>>
>>>
>>> array<mstring_view,kMaxSplits+1> split_store;
>>>
>>> auto p = parser(filename);
>>> array_view<mstring_view> s = p.splitNextLine(split_store,',');
>>>
>>> In the above example, p reads the next line of the file, and splits it
>>> using ',' into split_store, returning a view of the first
>>> std::min(split_store.size(), number of splits + 1) elements. parser is implemented
>>> efficiently, so that the resulting view points directly into the underlying
>>> internal I/O buffer to the file. This approach limits the split algorithm
>>> to only 1 memory allocation for the internal file buffer, zero data copies,
>>> 1 parsing pass which can be optimized with simd instructions. This level of
>>> efficiency is not possible with the API of fgets() and/or std::getline().
>>>
>>> Working with regular string_view is fine, but one can achieve additional
>>> optimization if one can write into the resulting buffer to transform the
>>> text further without making copies of the data which will likely require
>>> memory allocations. The data is just sitting in the internal file buffer
>>> unused during this time, so there is no reason not to allow writes to it.
>>>
>>
>> Can you give some examples of these transformations?
>>
>
> One of the most common transformations is to_upper() or to_lower() if you
> want to do case insensitive comparison. If you need to compare the data
> many times, you can transform it in place once and the compiler can use the
> optimal bulk simd / streaming instructions to compare the strings.
>
Code-unit-wise to_upper() and to_lower() give the wrong answer, except for
English-only text, which we shouldn't be designing for anymore. Collation
keys are also generally a different length than their source string.
See http://userguide.icu-project.org/transforms/casemappings and
ftp://ftp.unicode.org/Public/UCD/latest/ucd/SpecialCasing.txt.
> Another is that right now we still don't have a string_view compatible
> strtod() and friends. You can use zstring_view and strtod() to quickly
> parse comma separated numbers.
>
We'll get these; we just need someone to write up the proposal. I suspect
the right signatures are actually something like
optional<double> consume_double(string_view& str);
And you'd check str.empty() if you want the double to occupy the whole
string. But that's something the paper author should double-check.
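A rough sketch of how the signature suggested above could behave, under stated assumptions: consume_double is hypothetical (not a standard function), and this version copies the view so it can lean on strtod(), which a real implementation would avoid by parsing in place.

```cpp
#include <cstddef>
#include <cstdlib>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical sketch, not a standard function.
// Parses a double from the front of `str`; on success the consumed
// characters are removed from `str`, on failure `str` is left unchanged.
std::optional<double> consume_double(std::string_view& str) {
    // strtod() needs a null-terminated buffer, so this sketch copies the view.
    const std::string buffer(str);
    char* end = nullptr;
    const double value = std::strtod(buffer.c_str(), &end);
    if (end == buffer.c_str()) {
        return std::nullopt;  // no characters were converted
    }
    str.remove_prefix(static_cast<std::size_t>(end - buffer.c_str()));
    return value;
}
```

With this shape, a caller that requires the number to occupy the whole string checks str.empty() after the call, as described above.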
>> Do you have a link to any code that does this?
>>
>
> I don't have links to any public code which does this kind of thing but
> I've used it before with considerable speed ups over the conventional
> getline() solution. I'm not sure if any major open source projects are
> using these techniques for their parsing code.
>
I definitely believe that splitting strings into string_views is faster;
it's just the mutation I'm skeptical of. (Not because it's called a "view".
You're right that array_views are/should be mutable by default.)
Jeffrey
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Fri, 15 May 2015 04:53:37 -0700 (PDT)
On Friday, 15 May 2015 at 04:08:51 UTC+2, Matthew Fioravante wrote:
>
> One of the most common transformations is to_upper() or to_lower() if you
> want to do case insensitive comparison. If you need to compare the data
> many times, you can transform it in place once and the compiler can use the
> optimal bulk simd / streaming instructions to compare the strings.
>
Don't sizes change with UTF-8 to-upper/to-lower?
> Another is that right now we still don't have a string_view compatible
> strtod() and friends. You can use zstring_view and strtod() to quickly
> parse comma separated numbers.
>
string_view variants of strtod-like functions should be introduced; there is
no need for zstring_view here.
.
Author: Douglas Boffey <douglas.boffey@gmail.com>
Date: Fri, 15 May 2015 21:59:53 +0100
> Don't sizes change with UTF-8 to-upper/to-lower?
Can you give an example where a letter has a different length between upper
and lower case when UTF-8 encoded?
On Fri, May 15, 2015 at 12:53 PM, Olaf van der Spek <olafvdspek@gmail.com>
wrote:
>
>
> On Friday, 15 May 2015 at 04:08:51 UTC+2, Matthew Fioravante wrote:
>>
>> One of the most common transformations is to_upper() or to_lower() if you
>> want to do case insensitive comparison. If you need to compare the data
>> many times, you can transform it in place once and the compiler can use the
>> optimal bulk simd / streaming instructions to compare the strings.
>>
>
> Don't sizes change with UTF-8 to-upper/to-lower?
>
>
>> Another is that right now we still don't have a string_view compatible
>> strtod() and friends. You can use zstring_view and strtod() to quickly
>> parse comma separated numbers.
>>
>
> string_view variants of strtod-like functions should be introduced, no
> need for zstring_view here..
>
.
Author: "'Jeffrey Yasskin' via ISO C++ Standard - Future Proposals" <std-proposals@isocpp.org>
Date: Fri, 15 May 2015 14:08:27 -0700
On Fri, May 15, 2015 at 1:59 PM, Douglas Boffey
<douglas.boffey@gmail.com> wrote:
>> Don't sizes change with UTF-8 to-upper/to-lower?
>
> Can you give an example where a letter has a different length between upper
> and lower case, when UTF8 encoded?
ftp://ftp.unicode.org/Public/UCD/latest/ucd/SpecialCasing.txt has a
bunch of examples of the *number of characters* changing when they
switch between upper, lower, and title case.
ftp://ftp.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt also
includes at least
0130;LATIN CAPITAL LETTER I WITH DOT ABOVE;Lu;0;L;0049 0307;;;;N;LATIN CAPITAL LETTER I DOT;;;0069;
0131;LATIN SMALL LETTER DOTLESS I;Ll;0;L;;;;;N;;;0049;;0049
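To make the byte-length point concrete, here is a small sketch that spells out the UTF-8 encodings of the two entries above as explicit bytes. The variable names are only illustrative; the mappings are the simple case mappings from the UnicodeData.txt lines quoted above.

```cpp
#include <string_view>

// UTF-8 encodings written as explicit bytes so the source encoding is irrelevant.
constexpr std::string_view capital_i_dot = "\xC4\xB0";  // U+0130, two bytes
constexpr std::string_view small_i       = "i";         // U+0069, its lowercase mapping
constexpr std::string_view dotless_i     = "\xC4\xB1";  // U+0131, two bytes
constexpr std::string_view capital_i     = "I";         // U+0049, its uppercase mapping

// Both mappings shrink the text from two bytes to one.
static_assert(capital_i_dot.size() == 2 && small_i.size() == 1);
static_assert(dotless_i.size() == 2 && capital_i.size() == 1);
```

So a case conversion over UTF-8 text cannot in general be done in place without moving the rest of the buffer.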
> On Fri, May 15, 2015 at 12:53 PM, Olaf van der Spek <olafvdspek@gmail.com>
> wrote:
>>
>>
>>
>> On Friday, 15 May 2015 at 04:08:51 UTC+2, Matthew Fioravante wrote:
>>>
>>> One of the most common transformations is to_upper() or to_lower() if you
>>> want to do case insensitive comparison. If you need to compare the data many
>>> times, you can transform it in place once and the compiler can use the
>>> optimal bulk simd / streaming instructions to compare the strings.
>>
>>
>> Don't sizes change with UTF-8 to-upper/to-lower?
>>
>>>
>>> Another is that right now we still don't have a string_view compatible
>>> strtod() and friends. You can use zstring_view and strtod() to quickly parse
>>> comma separated numbers.
>>
>>
>> string_view variants of strtod-like functions should be introduced, no
>> need for zstring_view here..
>>
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Sat, 16 May 2015 09:20:34 -0700 (PDT)
On Friday, May 15, 2015 at 3:10:58 AM UTC-4, Jeffrey Yasskin wrote:
>
> except for English-only text, which we shouldn't be designing for anymore.
>
ASCII still has an important role today, particularly in big data domains
like finance, where huge ASCII text data files need to be processed. Knowing
that your data is single-byte English text gives you a huge advantage in
implementing efficient processing.
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Sat, 16 May 2015 18:21:41 +0200
2015-05-16 18:20 GMT+02:00 Matthew Fioravante <fmatthew5876@gmail.com>:
> ASCII still has an important role today, particularly in big data domains
> like finance where you have huge ascii text data files that need to be
> processed. Knowing that your data is single byte english text gives you a
> huge advantage in implementing efficient processing.
Sure, but the standard functions shouldn't be limited to ASCII.
--
Olaf
.
Author: "'Jeffrey Yasskin' via ISO C++ Standard - Future Proposals" <std-proposals@isocpp.org>
Date: Sat, 16 May 2015 09:49:07 -0700
On Sat, May 16, 2015 at 9:20 AM, Matthew Fioravante
<fmatthew5876@gmail.com> wrote:
>
>
> On Friday, May 15, 2015 at 3:10:58 AM UTC-4, Jeffrey Yasskin wrote:
>>
>> except for English-only text, which we shouldn't be designing for anymore.
>
>
> ASCII still has an important role today, particularly in big data domains
> like finance where you have huge ascii text data files that need to be
> processed. Knowing that your data is single byte english text gives you a
> huge advantage in implementing efficient processing.
'k, that's what you should put in your proposal. I'll still argue that
we shouldn't put an English-only library into an international
standard, but you might convince enough other people.
.
Author: Jens Maurer <Jens.Maurer@gmx.net>
Date: Sat, 16 May 2015 19:29:26 +0200
On 05/15/2015 09:10 AM, 'Jeffrey Yasskin' via ISO C++ Standard - Future Proposals wrote:
> We'll get these; we just need someone to write up the proposal. I suspect the right signatures are actually something like
>
> optional<double> consume_double(string_view& str);
In my experience, functions like that are hard to make
efficient, because the compiler will (pessimistically) assume
the "char *" used inside might point to the "string_view"
object itself. (For this particular case, this issue might
be minuscule compared to the actual "double" parsing overhead.)
An iterator range approach might work better in that regard:
char * parse_double(char * first, char * last, double& out);
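One way to sketch the iterator-range signature above is on top of std::from_chars from <charconv> (C++17), which takes the same kind of character range. This is only an illustration: it uses const char* rather than char*, returns `first` on failure as an assumed convention, and the floating-point overloads of std::from_chars need a reasonably recent standard library.

```cpp
#include <charconv>
#include <system_error>

// Hypothetical sketch of the iterator-range interface.
// Returns the position just past the parsed number, or `first` if nothing
// could be parsed (in which case `out` is left unchanged).
const char* parse_double(const char* first, const char* last, double& out) {
    double value = 0.0;
    const std::from_chars_result result = std::from_chars(first, last, value);
    if (result.ec != std::errc{}) {
        return first;
    }
    out = value;
    return result.ptr;
}
```

Because the function only sees two pointers and a separate double, there is no string_view object that the character pointer could alias.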
I'm interested in seeing the most versatile and basic interface
standardized, possibly in addition to a "nicer" high-level
interface. Regrettably, C++ I/O doesn't give you the former
for quite a few fundamental operations, such as T <-> string.
See N4412.
Jens
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Sat, 16 May 2015 19:33:44 +0200
2015-05-16 19:29 GMT+02:00 Jens Maurer <Jens.Maurer@gmx.net>:
> On 05/15/2015 09:10 AM, 'Jeffrey Yasskin' via ISO C++ Standard - Future Proposals wrote:
>> We'll get these; we just need someone to write up the proposal. I suspect the right signatures are actually something like
>>
>> optional<double> consume_double(string_view& str);
>
> In my experience, functions like that are hard to make
> efficient, because the compiler will (pessimistically) assume
> the "char *" used inside might point to the "string_view"
> object itself. (For this particular case, this issue might
> be minuscule compared to the actual "double" parsing overhead.)
How's that an issue as you're not writing to it?
IMO it should be a non-reference string_view anyway.
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Sat, 16 May 2015 11:42:39 -0700 (PDT)
------=_Part_235_1049040792.1431801759794
Content-Type: multipart/alternative;
boundary="----=_Part_236_387849274.1431801759794"
------=_Part_236_387849274.1431801759794
Content-Type: text/plain; charset=UTF-8
I've made a few threads in the past talking about possible strtod()
implementations for string_view. Every time we get into a big discussion
about the procedural interface (optional? expected? out params? etc...).
Since most of these things like optional and expected are still too new,
nobody can really come to a conclusion.
The problem is that we have multiple pieces of information we would like to
return:
- The converted value.
- Whether or not the conversion succeeded.
- The specific error if one occurred (overflow, underflow, failure to
parse, etc...).
- The tail of the string after the value consumed.
Nobody has quite figured out what a "modern" procedural interface is
supposed to look like. We all want something that's easy to use, enforces
or at least encourages correctness, and is composable. I think most people
agree that a class-based interface where you create a Parser object, call
member functions, etc. may be too heavyweight.
I currently use an interface like this which needs to be implemented for
every type supported.
template <typename T> error_code parse(T& value, string_view& tail,
string_view s);
Yes, it uses those evil out parameters that we aren't supposed to use
anymore, but we can also write generic wrappers for a more modern
composable interface on top of this. The advantage of this base
implementation is that it's very simple and has no dependencies on other
libraries.
template <typename T> error_code parse(T& value, string_view s) {
string_view t; return parse(value,t,s); }
template <typename T> T parseOr(string_view s, T value_if_error = T());
//I want to be careful
double value;
if(parse(value, str)) {
//Handle the error (an error_code that holds an error converts to true)
}
//Just give me a default value of 1.0 if the parsing failed
auto value = parseOr(str, 1.0);
One could also implement wrappers that use optional<T>, throw exceptions
on errors, use expected<T>, etc.
The main problem I see with the out param approach as used here is that you
always have to default construct an object and then set its value. That
means this API would not be usable with objects that cannot be default
constructed and then modified later. I'm not sure such an interface is the
best one possible especially for standardization, but it works well enough
for me for now.
.
Author: Nicol Bolas <jmckesson@gmail.com>
Date: Sat, 16 May 2015 14:10:01 -0700 (PDT)
On Saturday, May 16, 2015 at 2:42:39 PM UTC-4, Matthew Fioravante wrote:
>
> I've made a few threads in the past talking about possible strtod()
> implementations for string_view. Every time we get into a big discussion
> about the procedural interface (optional? expected? out params? etc...).
> Since most of these things like optional and expected are still too new,
> nobody can really come to a conclusion.
> <snip>
>
It would be highly unlikely that you could get the standards committee to
abandon exceptions as the standard means for error reporting. That's how
C++ has been since C++98, and unless something major happens, that's how
it's going to be in the future.
It would be better to use existing specifications as your guide. For
example, the FileSystem TS
<http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2014/n3940.html> makes
it clear how a C++ standard library API should look if you're going to use
error codes:
* The output value is always the function's conceptual output, not an error
code.
* If a function needs to have an error code, then it has two overloads: one
that throws and one that uses an error code, which is an output parameter.
The only difference is the error code parameter.
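To make the two bullet points concrete, here is a minimal sketch of that overload pair applied to a string-to-int conversion. The parse_int name is hypothetical, and std::from_chars (C++17) is only an assumed backend; the shape of the two signatures is the point.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>

// Error-code overload: returns the conceptual output and reports failure
// through the trailing output parameter, never by throwing.
int parse_int(std::string_view s, std::error_code& ec) noexcept {
    int value = 0;
    const auto result = std::from_chars(s.data(), s.data() + s.size(), value);
    ec = (result.ec == std::errc()) ? std::error_code()
                                    : std::make_error_code(result.ec);
    return value;  // 0 if parsing failed
}

// Throwing overload: same signature minus the error code parameter.
int parse_int(std::string_view s) {
    std::error_code ec;
    const int value = parse_int(s, ec);
    if (ec) throw std::system_error(ec, "parse_int");
    return value;
}
```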
There's no point in returning an optional for the throwing version; you'll
either get a valid value or you won't be in that scope anymore. And since
you want the two functions to be as identical as possible, it makes sense
to simply return by value in the error-code case. By using error-codes
rather than exceptions, the user has made it clear that a priori code
safety isn't their biggest concern. Or that they can't use exceptions at
all, in which case, std::optional can't really do anything to save them.
Also, it should be noted that Boost (and its users) has tons of
experience with optional. Many Boost libraries rely on it. Generally
speaking, Boost uses optional when returning a non-value does not
represent an erroneous condition.
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Sat, 16 May 2015 23:14:12 +0200
2015-05-16 23:10 GMT+02:00 Nicol Bolas <jmckesson@gmail.com>:
> * The output value is always the function's conceptual output, not an error
> code.
> * If a function needs to have an error code, then it has two overloads: one
> that throws and one that uses an error code, which is an output parameter.
> The only difference is the error code parameter.
I thought the FS TS was a great example of how NOT to do it..
.
Author: Nicol Bolas <jmckesson@gmail.com>
Date: Sat, 16 May 2015 17:46:12 -0700 (PDT)
On Saturday, May 16, 2015 at 5:14:14 PM UTC-4, Olaf van der Spek wrote:
>
> 2015-05-16 23:10 GMT+02:00 Nicol Bolas <jmck...@gmail.com>:
> > * The output value is always the function's conceptual output, not an error
> > code.
> > * If a function needs to have an error code, then it has two overloads: one
> > that throws and one that uses an error code, which is an output parameter.
> > The only difference is the error code parameter.
>
> I thought the FS TS was a great example of how NOT to do it..
>
Whatever anyone's individual thoughts on the matter may be, the FileSystem
TS exists, has passed the PDTS phase, and, while not quite a TS yet, is being
implemented as we speak in multiple standard library implementations. That
makes it de facto precedent for C++ standardization, which puts the burden of
proof on the person arguing against it, not the person following it.
.
Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Sun, 17 May 2015 06:17:16 -0700 (PDT)
On Saturday, May 16, 2015 at 5:14:14 PM UTC-4, Olaf van der Spek wrote:
>
>
> I thought the FS TS was a great example of how NOT to do it..
>
What are the major complaints about the FS TS procedural interface? The most
important thing to me would be making sure exceptions are optional, and it
looks like they are here.
.
Author: "Vicente J. Botet Escriba" <vicente.botet@wanadoo.fr>
Date: Mon, 18 May 2015 21:56:25 +0200
On 17/05/15 15:17, Matthew Fioravante wrote:
>
>
> On Saturday, May 16, 2015 at 5:14:14 PM UTC-4, Olaf van der Spek wrote:
>
>
> > I thought the FS TS was a great example of how NOT to do it..
>
>
> What are the major complaints about the FS TS procedural interface? The
> most important thing to me would be making sure exceptions are
> optional, and it looks like they are here.
>
Returning the error code as an out parameter means the function doesn't
compose well. This is why most functional libraries have adopted
monadic interfaces. Of course, C++ is a multi-paradigm language, and since
people know the imperative paradigm better, it is natural that they
find the design chosen by the FS TS a good one.
I sincerely believe that a monadic interface will end up being
introduced in the standard. I know that there is a lot of reluctance,
because one particular use of monads could change the way errors are
reported. But haven't we already introduced one more way with the FS TS,
without anybody raising comparable complaints?
Note that I'm not proposing to abandon exceptions, but I agree with Olaf
that the design taken by the FS TS is worse than a monadic one.
If we have a function that can throw, such as
    R f(T1, T2);
the following seems to me a preferable way of introducing another
function that does the same but cannot throw:
    expected<R, error_code> f(no_throw_t, T1, T2);
This introduces some constraints on R, but I can live with these
constraints (move shouldn't throw).
Nevertheless, expected<T> is not enough; we need more, we need a complete
monadic interface. Once this is ready,
the monadic interface would compose clearly better than
    R f(error_code&, T1, T2);
Monads are not an alternative to exceptions, but they are IMHO a valid
alternative to the way the FS TS reports errors.
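To illustrate what "composes better" could mean in practice, here is a small sketch under stated assumptions. Expected is a hypothetical, deliberately minimal stand-in for the proposed expected<R, error_code>, built on std::variant (C++17); and_then, parse_int and percent_of are likewise illustrative names, not an existing API.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>
#include <utility>
#include <variant>

// Hypothetical stand-in for expected<T, error_code>: holds a value or an error.
template <typename T>
class Expected {
public:
    Expected(T value) : state_(std::move(value)) {}
    Expected(std::error_code ec) : state_(ec) {}

    bool has_value() const { return std::holds_alternative<T>(state_); }
    const T& value() const { return std::get<T>(state_); }
    std::error_code error() const {
        return has_value() ? std::error_code()
                           : std::get<std::error_code>(state_);
    }

    // Monadic bind: apply `f` to the value, or forward the error untouched.
    template <typename F>
    auto and_then(F&& f) const -> decltype(f(std::declval<const T&>())) {
        if (has_value()) return f(value());
        return error();
    }

private:
    std::variant<T, std::error_code> state_;
};

// First fallible step: text to int.
Expected<int> parse_int(std::string_view s) {
    int value = 0;
    const auto r = std::from_chars(s.data(), s.data() + s.size(), value);
    if (r.ec != std::errc()) return std::make_error_code(r.ec);
    return value;
}

// Second fallible step: 100 / n, which fails for n == 0.
Expected<int> percent_of(int n) {
    if (n == 0) return std::make_error_code(std::errc::argument_out_of_domain);
    return 100 / n;
}
```

With these, parse_int("4").and_then(percent_of) chains both steps in one expression and the first error short-circuits the rest, whereas the error_code& style needs a named error variable and an explicit check between the calls.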
I know, I must write a paper if I want things to change one day. However,
as there is a lot of reluctance, the work needs to be done carefully,
checking whether the C++ community changes its mind about monads, and all
this takes time.
Vicente
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Wed, 20 May 2015 15:42:49 +0200
2015-05-17 15:17 GMT+02:00 Matthew Fioravante <fmatthew5876@gmail.com>:
>
>
> On Saturday, May 16, 2015 at 5:14:14 PM UTC-4, Olaf van der Spek wrote:
>>
>>
>> I thought the FS TS was a great example of how NOT to do it..
>
>
> What are the major complaints about the FS TS procedural interface? The most
> important thing to me would be making sure exceptions are optional, and it
> looks like they are here.
Posix:
if (unlink(path))
    // unlink failed
C++ FS:
{
    error_code ec;
    fs::remove(path, ec);
    if (ec)
        // remove failed
}
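For reference, a compilable version of the two call sites might look like the sketch below. It assumes a POSIX system and C++17, using std::filesystem as the standardized form of the TS API; the two helper function names are hypothetical and exist only to make the comparison testable.

```cpp
#include <filesystem>
#include <system_error>
#include <unistd.h>  // POSIX unlink()

namespace fs = std::filesystem;

// POSIX style: the return value itself signals failure.
bool remove_posix_failed(const char* path) {
    return ::unlink(path) != 0;
}

// Filesystem style: failure is reported through the out parameter,
// so a named error_code has to exist before the call.
bool remove_fs_failed(const fs::path& path) {
    std::error_code ec;
    fs::remove(path, ec);
    return static_cast<bool>(ec);
}
```

One behavioural difference worth keeping in mind when comparing them: unlink() fails with ENOENT for a missing file, while fs::remove() treats a missing file as "nothing removed" and leaves the error code clear.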
--
Olaf
.
Author: Nicola Gigante <nicola.gigante@gmail.com>
Date: Wed, 20 May 2015 20:41:31 +0200
> On 18 May 2015, at 21:56, Vicente J. Botet Escriba <vicente.botet@wanadoo.fr> wrote:
>
> On 17/05/15 15:17, Matthew Fioravante wrote:
>>
>>
>> On Saturday, May 16, 2015 at 5:14:14 PM UTC-4, Olaf van der Spek wrote:
>>
>> I thought the FS TS was a great example of how NOT to do it..
>>
>> What are the major complaints about the FS TS procedural interface? The most
>> important thing to me would be making sure exceptions are optional, and it
>> looks like they are here.
>>
> Returning the error code through an out parameter means the functions don't
> compose well. This is why most functional libraries have adopted monadic
> interfaces. Of course, C++ is a multi-paradigm language, and since people
> know the imperative paradigm better, it is natural that they find the design
> taken by the FS TS a good one.
>
> I sincerely believe that a monadic interface will end up being introduced
> into the standard. I know there is a lot of reluctance, because one
> particular use of monads could change the way errors are reported. But
> haven't we already introduced one more way with the FS TS, without anybody
> raising comparable complaints?
>
> Note that I'm not proposing to abandon exceptions, but I agree with Olaf
> that the design taken by the FS TS is worse than a monadic one.
>
> If we have a function that can throw, such as
>
>     R f(T1, T2);
>
> the following seems to me a preferable way of introducing another function
> that does the same but cannot throw:
>
>     expected<R, error_code> f(no_throw_t, T1, T2);
>
> This introduces some constraints on R, but I can live with these
> constraints (move shouldn't throw).
>
> Nevertheless, expected<T> is not enough; we need more: a complete monadic
> interface. Once this is ready, the monadic interface would compose clearly
> better than
>
>     R f(error_code&, T1, T2);
>
> Monads are not an alternative to exceptions, but they are IMHO a valid
> alternative to the way the FS TS reports errors.
>
> I know, I must write a paper if I want things to change one day. However,
> as there is a lot of reluctance, the work needs to be done carefully, and we
> need to check whether the C++ community changes its mind about monads. All
> this takes time.
>
I totally agree.
I think it's also worth noting that with the await/yield proposal for C++17
we'll have a de facto built-in syntax for dealing with monadic values.
Imagine:
std::expected<std::string,fs::error_code> getcontents(std::string_view path);
// my code
std::expected<T, fs::error_code> myfunc()
{
    T result;
    std::string file_contents = await getcontents(path);
    // my code
    yield result;
}
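To make the composition argument concrete without relying on any proposed language feature, here is a minimal sketch in plain C++17. The class Expected, its and_then member, and the functions get_contents, count_bytes and contents_size are all hypothetical stand-ins written for this illustration. They are not the proposed expected<T, E> interface, and get_contents does not touch the file system. The point is only that a monadic bind lets fallible steps chain without an explicit error check between them.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <system_error>
#include <utility>
#include <variant>

// Hypothetical stand-in for an expected<T, error_code>-like type; only the
// pieces needed for this sketch are implemented.
template <class T>
class Expected {
public:
    Expected(T value) : state_(std::move(value)) {}
    Expected(std::error_code ec) : state_(ec) {}

    bool has_value() const { return std::holds_alternative<T>(state_); }
    const T& value() const { return std::get<T>(state_); }
    std::error_code error() const {
        return has_value() ? std::error_code{} : std::get<std::error_code>(state_);
    }

    // Monadic bind: run `next` on the value, or forward the error untouched.
    template <class F>
    auto and_then(F&& next) const -> decltype(next(std::declval<const T&>())) {
        using Result = decltype(next(std::declval<const T&>()));
        if (!has_value()) return Result(error());
        return next(value());
    }

private:
    std::variant<T, std::error_code> state_;
};

// Hypothetical fallible steps standing in for real file system calls.
inline Expected<std::string> get_contents(const std::string& path) {
    if (path.empty())
        return std::make_error_code(std::errc::no_such_file_or_directory);
    return std::string("contents of ") + path;
}

inline Expected<std::size_t> count_bytes(const std::string& text) {
    if (text.empty())
        return std::make_error_code(std::errc::invalid_argument);
    return text.size();
}

// No explicit error checks: the first failing step short-circuits the chain.
inline Expected<std::size_t> contents_size(const std::string& path) {
    return get_contents(path).and_then(count_bytes);
}
```

With the out-parameter style, contents_size would need its own error_code variable and an `if (ec)` test after each call. Here the error from get_contents simply propagates through and_then, which is the composition property discussed in the quoted message.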
> Vicente
Bye,
Nicola
--Apple-Mail=_B25D9A02-3A0A-41AE-B509-5FD408FB8557--
.