Topic: String Ref Proposal
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Fri, 23 Nov 2012 02:22:01 -0800 (PST)
On Friday, 6 July 2012 15:29:40 UTC+2, Ville Voutilainen wrote:
> http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2012/n3371.html
>
> stoi(const string_ref & str, size_t * idx=0, int base=10);
Wouldn't it be better to pass string_ref by value? It's only two pointers.
Passing by value has fewer aliasing issues and might allow simpler code generation.
Why is stoui missing?
explicit operator bool() const is missing.
--
.
Author: DeadMG <wolfeinstein@gmail.com>
Date: Fri, 23 Nov 2012 03:05:58 -0800 (PST)
No Unicode support? Needs fixing.
--
.
Author: Jeffrey Yasskin <jyasskin@googlers.com>
Date: Fri, 23 Nov 2012 10:05:18 -0800
On Fri, Nov 23, 2012 at 2:22 AM, Olaf van der Spek <olafvdspek@gmail.com> wrote:
> On Friday, 6 July 2012 15:29:40 UTC+2, Ville Voutilainen wrote:
>>
>> http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2012/n3371.html
More recent version:
http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2012/n3442.html
In-progress standard changes:
https://github.com/google/cxx-std-draft/compare/master...string-ref
I'm hoping to have a new version of the paper out to this list by the
end of this weekend.
>> stoi(const string_ref & str, size_t * idx=0, int base=10);
>
> Wouldn't it be better to pass string_ref by value? It's only two pointers.
(or a pointer+a length, at the implementer's discretion. ;)
> Passing by value has fewer aliasing issues and might allow simpler code generation.
The current draft passes string_ref by value.
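To make the by-value point concrete, here is a minimal sketch. The `string_ref` below is a hypothetical stand-in (a pointer plus a length), not the class specified in N3442, and `count_spaces` is an invented helper used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the proposed class: a non-owning pointer + length.
class string_ref {
public:
    string_ref(const char* s, std::size_t n) : data_(s), size_(n) {}
    string_ref(const std::string& s) : data_(s.data()), size_(s.size()) {}

    const char* begin() const { return data_; }
    const char* end() const { return data_ + size_; }
    std::size_t size() const { return size_; }
    bool empty() const { return size_ == 0; }

private:
    const char* data_;
    std::size_t size_;
};

// Taking the view by value copies two words. The callee owns its copy, so the
// optimizer can often keep the pointer and length in registers instead of
// reloading them through a reference that might alias other memory.
inline std::size_t count_spaces(string_ref s) {
    std::size_t n = 0;
    for (char c : s) {
        if (c == ' ') ++n;
    }
    return n;
}
```

A caller can pass a `std::string` directly, for example `count_spaces(some_string)`, and only the two-word view is copied, not the characters.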
> Why is stoui missing?
stoui is "missing" because it's not in the current standard.
string_ref isn't intended to improve anything about numeric conversions,
so I'm not changing anything there.
> explicit operator bool() const is missing.
string_ref mimics (a subset of) std::string's interface. Since no
existing container has an operator bool(), string_ref doesn't. If you
want to change that, write a separate proposal.
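For readers unfamiliar with the request, the sketch below shows what an `explicit operator bool()` might look like next to the `empty()` idiom that the existing containers share. `bool_view` and `has_text` are hypothetical names, and the sketch assumes the conversion would mean "non-empty"; the thread does not say which meaning was intended.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical view type showing what the requested operator might look like.
struct bool_view {
    const char* data;
    std::size_t size;

    bool empty() const { return size == 0; }
    // `explicit` permits `if (v)` but blocks accidental uses such as `int n = v;`.
    explicit operator bool() const { return !empty(); }
};

// The idiom that std::string and the standard containers already share.
inline bool has_text(const std::string& s) { return !s.empty(); }
```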
On Fri, Nov 23, 2012 at 3:05 AM, DeadMG <wolfeinstein@gmail.com> wrote:
> No Unicode support? Needs fixing.
People are working on fixing that in general. See
http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2012/n3398.html,
for example. string_ref doesn't attempt to fix anything in that
direction because not every particular proposal has to fix every
problem at once.
Jeffrey
--
.
Author: Beman Dawes <bdawes@acm.org>
Date: Fri, 23 Nov 2012 18:43:30 -0500
On Fri, Nov 23, 2012 at 1:05 PM, Jeffrey Yasskin <jyasskin@googlers.com> wrote:
>> explicit operator bool() const is missing.
>
> string_ref mimics (a subset of) std::string's interface. Since no
> existing container has an operator bool(), string_ref doesn't. If you
> want to change that, write a separate proposal.
>
> On Fri, Nov 23, 2012 at 3:05 AM, DeadMG <wolfeinstein@gmail.com> wrote:
>> No Unicode support? Needs fixing.
>
> People are working on fixing that in general. See
> http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2012/n3398.html,
> for example. string_ref doesn't attempt to fix anything in that
> direction because not every particular proposal has to fix every
> problem at once.
LWG members often call the portions of proposals that try to fix
unrelated or slightly related problems "drive by fixes", and they are
usually viewed as unfortunate. The problems include perfectly good
drive-by-fixes getting sidetracked because the main proposal gets
rejected, and perfectly good primary proposals getting sidetracked
because they contain flawed drive-by-fixes.
Jeffrey is being careful to limit the scope of his proposal to the
primary problem being attacked, and that's a sign of a
well-thought-out proposal.
--Beman
--
.
Author: Nicol Bolas <jmckesson@gmail.com>
Date: Fri, 23 Nov 2012 16:00:36 -0800 (PST)
On Friday, November 23, 2012 3:43:32 PM UTC-8, Beman Dawes wrote:
>
> On Fri, Nov 23, 2012 at 1:05 PM, Jeffrey Yasskin <jyas...@googlers.com>
> wrote:
>
> >> explicit operator bool() const is missing.
> >
> > string_ref mimics (a subset of) std::string's interface. Since no
> > existing container has an operator bool(), string_ref doesn't. If you
> > want to change that, write a separate proposal.
> >
> > On Fri, Nov 23, 2012 at 3:05 AM, DeadMG <wolfei...@gmail.com> wrote:
> >> No Unicode support? Needs fixing.
> >
> > People are working on fixing that in general. See
> > http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2012/n3398.html,
> > for example. string_ref doesn't attempt to fix anything in that
> > direction because not every particular proposal has to fix every
> > problem at once.
>
> LWG members often call the portions of proposals that try to fix
> unrelated or slightly related problems "drive by fixes", and they are
> usually viewed as unfortunate. The problems include perfectly good
> drive-by-fixes getting sidetracked because the main proposal gets
> rejected, and perfectly good primary proposals getting sidetracked
> because they contain flawed drive-by-fixes.
>
> Jeffrey is being careful to limit the scope of his proposal to the
> primary problem being attacked, and that's a sign of a
> well-thought-out proposal.
>
> --Beman
>
Fair enough, but who's responsible for, for want of a better term, "system
integration?" Making sure that new systems work well with existing ones and
everything else. Say we get array_ref and string_ref. That's great. But if
the Unicode proposal is also being worked on at the same time, who makes
sure that the two proposals mesh well?
There are quite a few places in the standard where decisions were made in
the absence of knowledge of some features that hurt the standard library. I
mean, we added u16string and u32string to the library, but I still can't
create an *ifstream* from one. When we get Unicode strings, are they going
to work together with filesystem paths? Will filesystem paths work together
with string_ref?
I don't know, it seems like there needs to be a shorter path for fixes that
are about letting different libraries work together. Maybe there should be
a library integration study group whose job it is to make sure that the
different libraries actually work well together. They exist to massage the
libraries together so that they become an integrated, cohesive whole.
--
.
Author: Jeffrey Yasskin <jyasskin@googlers.com>
Date: Fri, 23 Nov 2012 16:59:01 -0800
On Fri, Nov 23, 2012 at 4:00 PM, Nicol Bolas <jmckesson@gmail.com> wrote:
>
>
> On Friday, November 23, 2012 3:43:32 PM UTC-8, Beman Dawes wrote:
>>
>> On Fri, Nov 23, 2012 at 1:05 PM, Jeffrey Yasskin <jyas...@googlers.com>
>> wrote:
>>
>> >> explicit operator bool() const is missing.
>> >
>> > string_ref mimics (a subset of) std::string's interface. Since no
>> > existing container has an operator bool(), string_ref doesn't. If you
>> > want to change that, write a separate proposal.
>> >
>> > On Fri, Nov 23, 2012 at 3:05 AM, DeadMG <wolfei...@gmail.com> wrote:
>> >> No Unicode support? Needs fixing.
>> >
>> > People are working on fixing that in general. See
>> > http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2012/n3398.html,
>> > for example. string_ref doesn't attempt to fix anything in that
>> > direction because not every particular proposal has to fix every
>> > problem at once.
>>
>> LWG members often call the portions of proposals that try to fix
>> unrelated or slightly related problems "drive by fixes", and they are
>> usually viewed as unfortunate. The problems include perfectly good
>> drive-by-fixes getting sidetracked because the main proposal gets
>> rejected, and perfectly good primary proposals getting sidetracked
>> because they contain flawed drive-by-fixes.
>>
>> Jeffrey is being careful to limit the scope of his proposal to the
>> primary problem being attacked, and that's a sign of a
>> well-thought-out proposal.
>>
>> --Beman
>
>
> Fair enough, but who's responsible for, for want of a better term, "system
> integration?" Making sure that new systems work well with existing ones and
> everything else. Say we get array_ref and string_ref. That's great. But if
> the Unicode proposal is also being worked on at the same time, who makes
> sure that the two proposals mesh well?
>
> There are quite a few places in the standard where decisions were made in
> the absence of knowledge of some features that hurt the standard library. I
> mean, we added u16string and u32string to the library, but I still can't
> create an ifstream from one. When we get Unicode strings, are they going to
> work together with filesystem paths? Will filesystem paths work together
> with string_ref?
>
> I don't know, it seems like there needs to be a shorter path for fixes that
> are about letting different libraries work together. Maybe there should be a
> library integration study group whose job it is to make sure that the
> different libraries actually work well together. They exist to massage the
> libraries together so that they become an integrated, cohesive whole.
The LWG as a whole ensures that proposals mesh well and incorporate
enough changes to integrate with the whole standard. The only active
Unicode paper I'm aware of is N3398, but I don't see any integration
problems between that and string_ref. If you see some, or if you know
of another active Unicode proposal, please let me know (that is, point
out specific problems; don't just wave your hands). I can't make
string_ref interoperate with a Unicode proposal that doesn't even
exist, and if a Unicode proposal comes in after string_ref is adopted
into the draft standard, then that proposal will use string_ref as it
sees fit.
As to the filesystem and u{16,32}string, that's not a trivial issue
because fstreams need to interact with underlying APIs that use their
own encodings. Beman's N3398 will make it easier to convert a wide
string to the native encoding, and N3399 will likely be the place to
solve this in the long run. If you think you have a better idea,
please write a paper and submit it to the next mailing (probably by
sending it to this list).
Jeffrey
--
.
Author: Nicol Bolas <jmckesson@gmail.com>
Date: Fri, 23 Nov 2012 21:23:14 -0800 (PST)
On Friday, November 23, 2012 4:59:24 PM UTC-8, Jeffrey Yasskin wrote:
>
> On Fri, Nov 23, 2012 at 4:00 PM, Nicol Bolas <jmck...@gmail.com> wrote:
> >
> >
> > On Friday, November 23, 2012 3:43:32 PM UTC-8, Beman Dawes wrote:
> >>
> >> On Fri, Nov 23, 2012 at 1:05 PM, Jeffrey Yasskin <jyas...@googlers.com>
> >> wrote:
> >>
> >> >> explicit operator bool() const is missing.
> >> >
> >> > string_ref mimics (a subset of) std::string's interface. Since no
> >> > existing container has an operator bool(), string_ref doesn't. If you
> >> > want to change that, write a separate proposal.
> >> >
> >> > On Fri, Nov 23, 2012 at 3:05 AM, DeadMG <wolfei...@gmail.com> wrote:
> >> >> No Unicode support? Needs fixing.
> >> >
> >> > People are working on fixing that in general. See
> >> > http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2012/n3398.html,
> >> > for example. string_ref doesn't attempt to fix anything in that
> >> > direction because not every particular proposal has to fix every
> >> > problem at once.
> >>
> >> LWG members often call the portions of proposals that try to fix
> >> unrelated or slightly related problems "drive by fixes", and they are
> >> usually viewed as unfortunate. The problems include perfectly good
> >> drive-by-fixes getting sidetracked because the main proposal gets
> >> rejected, and perfectly good primary proposals getting sidetracked
> >> because they contain flawed drive-by-fixes.
> >>
> >> Jeffrey is being careful to limit the scope of his proposal to the
> >> primary problem being attacked, and that's a sign of a
> >> well-thought-out proposal.
> >>
> >> --Beman
> >
> >
> > Fair enough, but who's responsible for, for want of a better term, "system
> > integration?" Making sure that new systems work well with existing ones and
> > everything else. Say we get array_ref and string_ref. That's great. But if
> > the Unicode proposal is also being worked on at the same time, who makes
> > sure that the two proposals mesh well?
> >
> > There are quite a few places in the standard where decisions were made in
> > the absence of knowledge of some features that hurt the standard library. I
> > mean, we added u16string and u32string to the library, but I still can't
> > create an ifstream from one. When we get Unicode strings, are they going to
> > work together with filesystem paths? Will filesystem paths work together
> > with string_ref?
> >
> > I don't know, it seems like there needs to be a shorter path for fixes that
> > are about letting different libraries work together. Maybe there should be a
> > library integration study group whose job it is to make sure that the
> > different libraries actually work well together. They exist to massage the
> > libraries together so that they become an integrated, cohesive whole.
>
> The LWG as a whole ensures that proposals mesh well and incorporate
> enough changes to integrate with the whole standard.
That's my point: the LWG has this loop of:
1: Accept proposals 1 and 2.
2: Discover that proposals 1 and 2 should interact in some way. Someone
writes a paper detailing those interactions.
3: Accept proposal 3. Discuss and potentially accept integration of 1 and 2.
4: Discover that proposal 3 now needs interactions with 1 & 2. Get someone
to write a paper on that.
5: Accept proposal 4. Discuss and potentially accept integration of 1, 2,
and 3.
6: Discover that proposal 4 has interactions etc...
Steps 1, 3, and 5 can only happen at actual meetings of the committee,
which only happen twice a year. And this doesn't include times when a
proposal is sent back for improvements. So you're looking at a year and a
half minimum to get this done.
The standards committee is trying to ramp up the speed at which proposals
become actual C++, particularly for libraries. The problem with this is
that everyone's working in their own little groups. So they spend a year or
so ironing out their proposals, then it takes a year or so to get them
standardized. And on top of that, it takes a year to get them integrated.
It would make more sense if there were someone specifically looking at all
of these proposals with an eye towards integration and making certain that
such integration were a priority during the *development* phase of the
proposals. That's the fast phase, where turn-around time is relatively
quick. Once it gets to the point where it goes to LWG, you're in the slow
part: 6 months pass between bits of useful feedback (i.e., accept/reject
votes).
Once it becomes standard, major changes are not going to be accepted. Once
Filesystem goes live, you're not going to be able to, for example, state
that the Filesystem's path objects should use a Unicode string with some
Unicode encoding (and thus translate to whatever each platform prefers).
If it was defined to use a basic_string (or "arbitrary array of some
type"), then that's what it will use in perpetuity.
We only get one chance to get these libraries right. It would be terrible
to make a bad choice now just to get Filesystem out there faster, when a
proper Unicode proposal could improve the quality of the Filesystem library
so much more.
> The only active
> unicode paper I'm aware of is N3398, but I don't see any integration
> problems between that and string_ref. If you see some, or if you know
> of another active Unicode proposal, please let me know (that is, point
> out specific problems; don't just wave your hands).
I guess it depends on how you define the term "active proposal". There's this
proposal that DeadMG is working to pull together<https://groups.google.com/a/isocpp.org/forum/#!topic/std-proposals/Yszuced3FOA>,
and he's expressed an intent to attend the next standards meeting to
present and defend it. That may or may not be active, depending on how
developed it gets and how far it goes.
While N3398 is nice, it isn't really a "Unicode proposal". It's an encoding
conversion proposal. A Unicode proposal would have some real guarantees
about what is and isn't a properly Unicode-encoded string, some notion of
Unicode normalization and a way to normalize Unicode strings, some way to
get categories for different codepoints, and so forth.
The problem with string_ref and Unicode is that there's no *guarantee* of
anything. A proper Unicode string ensures that it is, well, a *proper*
Unicode string, according to the Unicode standard and some particular
Unicode encoding scheme. basic_string_ref is nothing more than an array of
some base character type; there's no guarantee that it is a *valid* encoded
Unicode string. u16string and u32string, useful though they are, provide
*zero* protection from accidental changes that break the rules of UTF-16 or
UTF-32. N3398, like u16string and u32string, is basically just a big dance
around proper Unicode support, continuing to treat strings like arrays of
code units instead of Unicode sequences of codepoints.
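The lack of protection described above can be shown in a few lines. This is only a sketch: `is_well_formed_utf16` is a hypothetical helper, not part of any proposal in this thread, and it checks nothing beyond surrogate pairing.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true when every surrogate code unit in `s` is part of a well-formed
// high/low pair, which is what UTF-16 requires.
inline bool is_well_formed_utf16(const std::u16string& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t u = s[i];
        if (u >= 0xD800 && u <= 0xDBFF) {            // high (lead) surrogate
            if (i + 1 == s.size()) return false;     // nothing follows it
            const char16_t next = s[i + 1];
            if (next < 0xDC00 || next > 0xDFFF) return false;
            ++i;                                     // skip the low surrogate
        } else if (u >= 0xDC00 && u <= 0xDFFF) {     // low surrogate with no lead
            return false;
        }
    }
    return true;
}
```

A `std::u16string` holding U+1F600 contains two code units; calling `pop_back()` on it compiles and runs without complaint, yet leaves a lone high surrogate that this check rejects.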
A unicode_string_ref would be a range of codepoints encoded according to
some Unicode encoding scheme. You wouldn't be able to break the encoding by
trimming off a byte; such an object would take care to ensure that the
encoding is maintained. You can't array index it, because Unicode encodings
only work bidirectionally. And so forth.
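A rough sketch of the idea follows, using UTF-8 as the encoding. `utf8_ref` is a hypothetical name, not taken from any paper mentioned here. It assumes its input is already well-formed UTF-8 and is forward-only for brevity; stepping backwards would mean scanning for the previous non-continuation byte.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sketch: a non-owning view over UTF-8 that only moves in whole
// code points. It assumes the input is already well-formed UTF-8.
class utf8_ref {
public:
    explicit utf8_ref(const std::string& s)
        : first_(s.data()), last_(s.data() + s.size()) {}

    bool empty() const { return first_ == last_; }

    // Decodes the first code point without consuming it.
    char32_t front() const {
        assert(!empty());
        const unsigned char b0 = static_cast<unsigned char>(*first_);
        const std::size_t len = length_of(b0);
        // Keep the payload bits of the lead byte, then 6 bits per trailing byte.
        char32_t cp = (len == 1) ? b0 : (b0 & (0x7F >> len));
        for (std::size_t i = 1; i < len; ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(first_[i]) & 0x3F);
        }
        return cp;
    }

    // Drops one whole code point, so the view never starts mid-sequence.
    void pop_front() {
        assert(!empty());
        first_ += length_of(static_cast<unsigned char>(*first_));
    }

private:
    // Sequence length implied by the lead byte.
    static std::size_t length_of(unsigned char lead) {
        if (lead < 0x80) return 1;
        if ((lead >> 5) == 0x06) return 2;
        if ((lead >> 4) == 0x0E) return 3;
        return 4;
    }

    const char* first_;
    const char* last_;
};
```

There is deliberately no `operator[]` and no way to trim a single byte, which is the property the paragraph above asks for.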
> I can't make
> string_ref interoperate with a Unicode proposal that doesn't even
> exist, and if a Unicode proposal comes in after string_ref is adopted
> into the draft standard, then that proposal will use string_ref as it
> sees fit.
>
> As to the filesystem and u{16,32}string, that's not a trivial issue
> because fstreams need to interact with underlying APIs that use their
> own encodings.
That's an implementation detail. Being able to shove a properly formatted
UTF-16 string at an API isn't an unreasonable thing to want to do. If the
underlying system uses some other encoding, then the implementation will
have to transcode it. Just like if the underlying system uses a different
encoding from the compiler's base execution character set of C++, it has to
transcode `const char*` to match what the OS does. Unicode is pretty
universal, so it would not be at all difficult for them to at least
understand the input. If it has illegal characters, then it would do
whatever the API would do for illegal characters.
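As an illustration of the transcoding step described above, here is a sketch that converts UTF-16 to UTF-8 before the bytes are handed to a narrow-character API. `append_utf8` and `utf16_to_utf8` are hypothetical helpers. Two assumptions are mine rather than the thread's: that the underlying API expects UTF-8, and that an unpaired surrogate is replaced with U+FFFD.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends the UTF-8 encoding of one code point to `out`.
inline void append_utf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Transcodes UTF-16 to UTF-8; an unpaired surrogate becomes U+FFFD.
inline std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        const bool high = cp >= 0xD800 && cp <= 0xDBFF;
        const bool low = cp >= 0xDC00 && cp <= 0xDFFF;
        if (high && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Combine the surrogate pair into one supplementary code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (high || low) {
            cp = 0xFFFD;
        }
        append_utf8(out, cp);
    }
    return out;
}
```

A platform whose native encoding is not UTF-8 would need a different target encoding, which is part of why the reply calls the issue non-trivial.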
> Beman's N3398 will make it easier to convert a wide
> string to the native encoding, and n3399 will likely be the place to
> solve this in the long run. If you think you have a better idea,
> please write a paper and submit it to the next mailing (probably by
> sending it to this list).
>
> Jeffrey
>
--
.
Author: Nevin Liber <nevin@eviloverlord.com>
Date: Sat, 24 Nov 2012 01:22:40 -0600
Raw View
--0015174c12cebd803e04cf38a940
Content-Type: text/plain; charset=ISO-8859-1
On 23 November 2012 23:23, Nicol Bolas <jmckesson@gmail.com> wrote:
>
> That's my point: the LWG has this loop of:
>
> 1: Accept proposals 1 and 2.
>
That is one possibility. There are, of course, three others.
> 2: Discover that proposals 1 and 2 should interact in some way. Someone
> writes a paper detailing those interactions.
>
And if only one of the proposals passes? How is it any less time to
untangle the two? And if you decide to combine what would have been one
accepted and one rejected proposal into one big one, it is more likely than
not for the big proposal to get rejected.
> 3: Accept proposal 3. Discuss and potentially accept integration of 1 and
> 2.
> 4: Discover that proposal 3 now needs interactions with 1 & 2. Get someone
> to write a paper on that.
> 5: Accept proposal 4. Discuss and potentially accept integration of 1, 2,
> and 3.
> 6: Discover that proposal 4 has interactions etc...
>
There are now 16 possibilities.
> Steps 1, 3, and 5 can only happen at actual meetings of the committee,
> which only happen twice a year. And this doesn't include times when a
> proposal is sent back for improvements. So you're looking at a year and a
> half minimum to get this done.
>
In all but the "everything is accepted" case, how does your proposed method
of working reduce the time? Volunteers are not going to do 16x the amount
of work between meetings.
> It would make more sense if there were someone specifically looking at all
> of these proposals with an eye towards integration and making certain that
> such integration were a priority during the *development* phase of the
> proposals.
>
> Once it becomes standard, major changes are not going to be accepted. Once
> Filesystem goes live, you're not going to be able to, for example, state
> that the Filesystem's path objects should use a Unicode string with some
> Unicode encoding (and thus translates to whatever each platform prefers).
> If it was defined to use a basic_string (or "arbitrary array of some
> type"), then that's what it will use in perpetuity.
>
Keeping with the example, how long should Filesystem wait for a Unicode
string library to be accepted? A year? Two years? C++17? What if we
don't get one by C++17? Is that really fair to the folks who proposed
Filesystem? Is it fair to the folks who want to have a standard Filesystem
library that they can use? Shouldn't Unicode also wait to make sure it
meets all the needs of Filesystem? How long should Unicode wait for
Filesystem? Etc., etc.
Integration is painful. Unfortunately, the solutions to minimize the
integration pain are more painful.
> We only get one chance to get these libraries right. It would be terrible
> to make a bad choice now just to get Filesystem out there faster, when a
> proper Unicode proposal could improve the quality of the Filesystem library
> so much more.
>
Perfect is the enemy of good. While the bar is high, we can replace things
later, especially in libraries. stringstream replaced strstream,
unique_ptr replaced auto_ptr, etc.
> I guess it depends on how you define the term "active proposal". There's this
> proposal that DeadMG is working to pull together<https://groups.google.com/a/isocpp.org/forum/#!topic/std-proposals/Yszuced3FOA>,
> and he's expressed an intent to attend the next standards meeting to
> present and defend it.
>
And I hope that happens. But why should string_ref have a new dependency
on a currently non-existent, non-accepted, fairly risky (no reference
implementation yet, for instance) proposal? That hampers just about any
progress.
--
Nevin ":-)" Liber <mailto:nevin@eviloverlord.com> (847) 691-1404
--
--0015174c12cebd803e04cf38a940--
.
Author: Nicol Bolas <jmckesson@gmail.com>
Date: Sat, 24 Nov 2012 02:29:28 -0800 (PST)
Raw View
------=_Part_1039_33296982.1353752968946
Content-Type: text/plain; charset=ISO-8859-1
On Friday, November 23, 2012 11:29:41 PM UTC-8, Nevin ":-)" Liber wrote:
>
> On 23 November 2012 23:23, Nicol Bolas <jmck...@gmail.com> wrote:
>
>>
>> That's my point: the LWG has this loop of:
>>
>> 1: Accept proposals 1 and 2.
>>
>
> That is one possibility. There are, of course, three others.
>
>
>> 2: Discover that proposals 1 and 2 should interact in some way. Someone
>> writes a paper detailing those interactions.
>>
>
> And if only one of the proposals passes? How is it any less time to
> untangle the two? And if you decide to combine what would have been one
> accepted and one rejected proposal into one big one, it is more likely than
> not for the big proposal to get rejected.
>
>
>> 3: Accept proposal 3. Discuss and potentially accept integration of 1 and
>> 2.
>> 4: Discover that proposal 3 now needs interactions with 1 & 2. Get
>> someone to write a paper on that.
>> 5: Accept proposal 4. Discuss and potentially accept integration of 1, 2,
>> and 3.
>> 6: Discover that proposal 4 has interactions etc...
>>
>
> There are now 16 possibilities.
>
>
>> Steps 1, 3, and 5 can only happen at actual meetings of the committee,
>> which only happen twice a year. And this doesn't include times when a
>> proposal is sent back for improvements. So you're looking at a year and a
>> half minimum to get this done.
>>
>
> In all but the "everything is accepted" case, how does your proposed
> method of working reduce the time? Volunteers are not going to do 16x the
> amount of work between meetings.
>
I think you've misunderstood what I was getting at. Those were *steps* along
the standardization process (each representing about 3 months of
real-time). They're not individual scenarios.
First, the standards committee accepts a pair of library proposals. Then,
somebody realizes that these proposals should interact. So someone writes
up a proposal to integrate them together. 6 months later, the LWG meets
again and approves the integration proposal. But they also approve proposal
3 independently of that. Again, someone realizes that proposal 3 needs to
be integrated with the rest of the stuff already there, so someone now has
to write another proposal. During that time, proposal 4 comes along to
start the process over. And over. And over again.
In the end, you eventually have to ship a standard. So you ship without
integration, or worse, with poorly-done integration. And in C++, there are
no take-backs; once you ship, it's in there forever. You might later ship a
different proposal, but the mistake is still there.
>> It would make more sense if there were someone specifically looking at all
>> of these proposals with an eye towards integration and making certain that
>> such integration were a priority during the *development* phase of the
>> proposals.
>>
>
>> Once it becomes standard, major changes are not going to be accepted.
>> Once Filesystem goes live, you're not going to be able to, for example,
>> state that the Filesystem's path objects should use a Unicode string with
>> some Unicode encoding (and thus translates to whatever each platform
>> prefers). If it was defined to use a basic_string (or "arbitrary array of
>> some type"), then that's what it will use in perpetuity.
>>
>
> Keeping with the example, how long should Filesystem wait for a Unicode
> string library to be accepted? A year? Two years? C++17? What if we
> don't get one by C++17? Is that really fair to the folks who proposed
> Filesystem? Is it fair to the folks who want to have a standard Filesystem
> library that they can use? Shouldn't Unicode also wait to make sure it
> meets all the needs of Filesystem? How long should Unicode wait for
> Filesystem? Etc., etc.
>
> Integration is painful. Unfortunately, the solutions to minimize the
> integration pain are more painful.
>
So, having a study group that exists to watch proposals in their formative
stages and figure out ways for proposed library additions to
intercommunicate is "painful"? The point is to have formal proposals that
already *have* the integration stuff in them, before they get formally
proposed to LWG. That way, we don't get the current situation where we
accept all of these proposals in a vacuum, and hope that *somebody* comes
along and writes a proposal to actually make them work well together.
As the standards committee does more frequent releases, that's going to
fail more and more often. It would be nice if there were some mechanism in
place that would at least attempt to resolve these kinds of issues, so that
we don't get nonsense like being unable to create Filesystem paths with
Unicode strings or vice-versa or somesuch. Assuming that someone, somewhere
will write an oversight proposal... well, again, I point to the inability
to create a simple ifstream with a u16string.
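The ifstream complaint can be made concrete. In C++11, `std::ifstream` can be opened from a `const char*` or a `std::string`, so a file name held in a `std::u16string` has to be transcoded by hand first. The following is only a sketch of that workaround: `utf16_to_utf8` and `open_input` are hypothetical helpers, not part of any proposal in this thread, and the sketch assumes the platform's narrow-character file API accepts UTF-8 (common on POSIX systems, not guaranteed elsewhere).

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: transcode UTF-16 to UTF-8, rejecting unpaired surrogates.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: must be followed by a low surrogate.
            if (i + 1 >= in.size() || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                throw std::invalid_argument("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;  // consume the low surrogate as well
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::invalid_argument("unpaired low surrogate");
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Hypothetical helper: the extra step every caller currently has to write.
// Assumes the narrow file API of the platform interprets the name as UTF-8.
std::ifstream open_input(const std::u16string& name) {
    return std::ifstream(utf16_to_utf8(name));
}
```

Whether the transcoding belongs in user code like this or inside the library is exactly the disagreement in this part of the thread.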
>> We only get one chance to get these libraries right. It would be terrible
>> to make a bad choice now just to get Filesystem out there faster, when a
>> proper Unicode proposal could improve the quality of the Filesystem library
>> so much more.
>>
>
> Perfect is the enemy of good. While the bar is high, we can replace
> things later, especially in libraries. stringstream replaced strstream,
> unique_ptr replaced auto_ptr, etc.
>
And yet, auto_ptr still exists. strstream still exists (and also is
superior to stringstream in some respects, so it's not really a
replacement). Neither has been *replaced*; there is simply an alternate
version that can be used. And there's no indication that either of these
classes will be removed anytime soon.
When you're working in a system where nothing can be removed once it is
added, good is the enemy of perfect. It *has* to be perfect. Because if you
get them wrong, you get, well, iostreams. Which will exist in C++ in
perpetuity. Even if we add a new library to do the same job, it will still
be there.
>> I guess it depends on how you define the term "active proposal". There's this
>> proposal that DeadMG is working to pull together<https://groups.google.com/a/isocpp.org/forum/#!topic/std-proposals/Yszuced3FOA>,
>> and he's expressed an intent to attend the next standards meeting to
>> present and defend it.
>>
>
> And I hope that happens. But why should string_ref have a new dependency
> on a currently non-existent, non-accepted, fairly risky (no reference
> implementation yet, for instance) proposal? That hampers just about any
> progress.
>
I'm not saying it should, specifically. But there should be *some* thought
put forth towards integration with things coming down the pipe. If for no
other reason than to make certain that decisions aren't being made now that
make such integration impossible or overly difficult.
> --
> Nevin ":-)" Liber <mailto:ne...@eviloverlord.com> (847) 691-1404
>
--
------=_Part_1039_33296982.1353752968946--
.
Author: DeadMG <wolfeinstein@gmail.com>
Date: Sat, 24 Nov 2012 03:26:55 -0800 (PST)
Raw View
------=_Part_253_25536739.1353756415567
Content-Type: text/plain; charset=ISO-8859-1
>
> And I hope that happens. But why should string_ref have a new dependency
> on a currently non-existent, non-accepted, fairly risky (no reference
> implementation yet, for instance) proposal? That hampers just about any
> progress.
>
There are many implementations in many languages of Unicode functionality.
It's true that there is no reference implementation of my exact proposal,
but there are lots of implementations of normalization, collation, and
such. The Unicode Standard already did the work to ensure that these
algorithms can be implemented.
Secondly, it is worth doing because, when the Standard finally gets proper
Unicode support, perhaps through my proposal, perhaps through some other,
string_ref will either become totally obsolete or be updated to match.
Updating it now would simply be getting ahead of the curve.
--
------=_Part_253_25536739.1353756415567--
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Sat, 24 Nov 2012 05:40:17 -0800 (PST)
Raw View
------=_Part_298_17659826.1353764417302
Content-Type: text/plain; charset=ISO-8859-1
On Saturday, 24 November 2012 12:26:55 UTC+1, DeadMG wrote:
>> And I hope that happens. But why should string_ref have a new dependency
>> on a currently non-existent, non-accepted, fairly risky (no reference
>> implementation yet, for instance) proposal? That hampers just about any
>> progress.
>>
>
> There are many implementations in many languages of Unicode functionality.
> It's true that there is no reference implementation of my exact proposal,
> but there are lots of implementations of normalization, collation, and
> such. The Unicode Standard already did the work to ensure that these
> algorithms can be implemented.
>
> But secondly, because when the Standard finally gets proper Unicode
> support, perhaps through my proposal, perhaps through some other, then
> string_ref will either become totally obsolete or be updated to match.
> Updating it now would simply be getting ahead of the curve.
>
Do you have a concrete proposal for such an update to string_ref?
If not, you're basically asking to put the string_ref proposal on hold,
aren't you?
--
------=_Part_298_17659826.1353764417302--
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Sat, 24 Nov 2012 06:10:59 -0800 (PST)
Raw View
------=_Part_247_19959517.1353766259062
Content-Type: text/plain; charset=ISO-8859-1
On Friday, 23 November 2012 at 19:05:40 UTC+1, Jeffrey Yasskin wrote:
> >> stoi(const string_ref & str, size_t * idx=0, int base=10);
> >
> > Wouldn't it be better to pass string_ref by value? It's only two
> > pointers.
>
> (or a pointer+a length, at the implementer's discretion. ;)
>
I was thinking about ABI and interoperability issues.
> > Why is stoui missing?
>
> stoui is "missing" because it's not in the current standard.
> string_ref isn't intended to improve anything about numeric conversions,
> so I'm not changing anything there.
>
Understandable.
I did notice this:
> Returns:
> stox(string(str), idx, base) where x is the type suffix of the function
> called.
I hope that only specifies the semantics. An actual std::string construction
should be avoided.
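To illustrate the concern, here is a minimal sketch of an integer conversion that parses the referenced characters in place and takes the view by value. It assumes C++17 and uses std::string_view (the type that later grew out of this proposal) as a stand-in for string_ref; parse_int and its signature are hypothetical and are not taken from N3371.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical stand-in for the proposed stoi(string_ref, size_t*, int).
// The view is passed by value (it is only a pointer and a length) and the
// characters are parsed in place, so no std::string is constructed.
// Returns true on success; on failure `out` is left untouched.
inline bool parse_int(std::string_view str, int& out,
                      std::size_t* idx = nullptr, int base = 10) {
    assert(base >= 2 && base <= 36);  // precondition of std::from_chars
    int value = 0;
    const char* first = str.data();
    const char* last = first + str.size();
    const std::from_chars_result r = std::from_chars(first, last, value, base);
    if (r.ec != std::errc()) return false;  // no digits, or out of range
    if (idx) *idx = static_cast<std::size_t>(r.ptr - first);  // chars consumed
    out = value;
    return true;
}
```

Unlike std::stoi, std::from_chars does not skip leading whitespace or accept a leading '+', so this is not a drop-in replacement. It only shows that the conversion itself does not need an owning string.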
>
> > explicit operator bool() const is missing.
>
> string_ref mimics (a subset of) std::string's interface. Since no
> existing container has an operator bool(), string_ref doesn't. If you
> want to change that, write a separate proposal.
>
Will do.
Was providing some of the functions as non-member functions discussed?
What happened to construction from contiguous containers (like array and
vector)?
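A rough sketch of what such a construction could look like, again with C++17's std::string_view standing in for string_ref. The helper as_string_view is hypothetical; whether the proposal should offer this as a constructor is exactly the open question.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper: view the characters of any contiguous container
// that exposes data() and size(), e.g. std::vector<char> or
// std::array<char, N>. The view does not own the characters, so the
// container must outlive it.
template <class Container>
std::string_view as_string_view(const Container& c) {
    return std::string_view(c.data(), c.size());
}
```

Usage would be along the lines of as_string_view(my_vector), with the usual lifetime caveat for non-owning views.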
What about the relation with array_ref and range?
Olaf
--
------=_Part_247_19959517.1353766259062--
.
Author: DeadMG <wolfeinstein@gmail.com>
Date: Sat, 24 Nov 2012 06:32:37 -0800 (PST)
Raw View
------=_Part_295_8614856.1353767557922
Content-Type: text/plain; charset=ISO-8859-1
Well, array_ref<T> can just be left as it is. As for string_ref, there's
nothing inherently wrong with basic_string_ref, but you would also need a
unicode_string_ref. In addition, the numerical conversion functions
proposed would need a complete refactoring, as you cannot for example
convert from a u16string to an integer.
But more specifically, this seems to me to be just a desire for ranges,
with a side order of not using the existing Standard facilities. I mean, I
love ranges and I think they're great, but I disagree with special-casing
only some Standard classes to support them. The rationale given in the
proposal is bad, too.
> The usual approach here is to have the client explicitly pass in a pointer
> and a length, as in:
> std::vector<int> my_vector;
> for (size_t i = 5; i > 0; --i) { my_vector.push_back(i); }
> MyOldRoutine(my_vector.data(), my_vector.size());
> std::array<int, 4> my_std_array = {{4, 3, 2, 1}};
> MyOldRoutine(my_std_array.data(), my_std_array.size());
> int my_array[10] = {10, 9, 8, 7, 6, 5, 4, 3, 2, 1};
> MyOldRoutine(my_array, 10);
> int* dyn_array = new int[3];
> for (size_t i = 0; i < 3; ++i) { dyn_array[i] = 3 - i; }
> MyOldRoutine(dyn_array, 3);
It's no surprise that this doesn't work well, because that's *not* the
usual approach and hasn't been since the Standard Template Library. The
correct approach is to pass a pair of pointers representing the range,
which can be trivially extracted from iterators and is not error-prone at
all.
std::vector<int> my_vector;
for (size_t i = 5; i > 0; --i) { my_vector.push_back(i); }
MyOldRoutine(&*std::begin(my_vector), &*std::end(my_vector));
std::array<int, 4> my_std_array = {{4, 3, 2, 1}};
MyOldRoutine(&*std::begin(my_std_array), &*std::end(my_std_array));
int my_array[10] = {10, 9, 8, 7, 6, 5, 4, 3, 2, 1};
MyOldRoutine(std::begin(my_array), std::end(my_array));
int* dyn_array = new int[3]; // This is hideously bad anyway- who still does this?
for (size_t i = 0; i < 3; ++i) { dyn_array[i] = 3 - i; }
MyOldRoutine(dyn_array, dyn_array + 3);
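One caveat about the pointer-pair calls above: &*std::end(c) dereferences the past-the-end iterator of a standard container, which is undefined behavior, and &*std::begin(c) has the same problem when the container is empty. A minimal sketch that forms the same pointer pair from data() and size() instead follows. MyOldRoutine is given a hypothetical body (it sums the elements) purely so the example is complete, and CallOldRoutine is a hypothetical helper.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a routine taking a [first, last) pointer pair.
inline int MyOldRoutine(const int* first, const int* last) {
    int sum = 0;
    for (; first != last; ++first) sum += *first;
    return sum;
}

// Forms the pointer pair from data() and size(), so no iterator is
// dereferenced and an empty container is handled correctly.
template <class Container>
int CallOldRoutine(const Container& c) {
    return MyOldRoutine(c.data(), c.data() + c.size());
}
```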
Now, I can appreciate the need for ranges as singular objects as opposed to
iterators, and I can also appreciate the need for pointers and arrays as
opposed to a template-based abstraction when dealing with binary
compatibility, but the only problem array_ref solves is ranges being better
than iterators. If you feel strongly, perhaps you should consider adding to
or creating a range proposal.
In addition, there is no rationale given for std::basic_string_ref. Why
does this class exist? The only thing it offers over array_ref is the
string operations that basic_string offers- the ones which, if they're even
necessary at all, should be generic algorithms and are held up as a
defining example of why std::basic_string violates many design principles-
and a conversion to std::basic_string, which is redundant in the face of
ranges. If you think functions with this functionality should exist, you
should offer them as freestanding algorithms, not as members of
basic_string_ref.
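As an illustration of the freestanding-algorithm style argued for here, a prefix test can be written once for any pair of ranges rather than as a string member function. The name begins_with is hypothetical; this is only a sketch of the design being suggested, not part of either proposal.

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical free function: true if `range` begins with `prefix`.
// Works for any two ranges with equality-comparable elements, not only
// for strings.
template <class Range, class Prefix>
bool begins_with(const Range& range, const Prefix& prefix) {
    auto r = std::begin(range);
    const auto r_end = std::end(range);
    auto p = std::begin(prefix);
    const auto p_end = std::end(prefix);
    for (; p != p_end; ++p, ++r) {
        // Fail if `range` runs out first or the elements differ.
        if (r == r_end || !(*r == *p)) return false;
    }
    return true;
}
```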
A partial specialization of array_ref for const char and similar which can
deal with null-terminated C strings in addition would be fine, at most,
since the array constructor can already take string literals- assuming
array_ref can justify its existence.
And I might add, on a more personal note, that the chosen libraries in the
rationale are a bad choice. Particularly, the Google C++ style guide is
about as far away from Modern C++ as you can get, and LLVM is almost as
bad. If you choose to stick to pre-C++98, then that's your choice, but also
your problem, and there's no reason to Standardise components which solve
problems already solved by the Standard because some libraries choose not
to use the existing Standard facilities.
--
------=_Part_295_8614856.1353767557922--
.
Author: Nicol Bolas <jmckesson@gmail.com>
Date: Sat, 24 Nov 2012 09:36:28 -0800 (PST)
Raw View
------=_Part_1104_14007735.1353778588622
Content-Type: text/plain; charset=ISO-8859-1
On Saturday, November 24, 2012 6:32:38 AM UTC-8, DeadMG wrote:
>
> Well, array_ref<T> can just be left as it is. As for string_ref, there's
> nothing inherently wrong with basic_string_ref, but you would also need a
> unicode_string_ref. In addition, the numerical conversion functions
> proposed would need a complete refactoring, as you cannot for example
> convert from a u16string to an integer.
>
> But more specifically, this seems to me to be just a desire for ranges,
> with a side order of not using the existing Standard facilities. I mean, I
> love ranges and I think they're great, but I disagree with special-casing
> only some Standard classes to support them. The rationale given in the
> proposal is bad, too.
>
>> The usual approach here is to have the client explicitly pass in a pointer
>> and a length, as in:
>> std::vector<int> my_vector;
>> for (size_t i = 5; i > 0; --i) { my_vector.push_back(i); }
>> MyOldRoutine(my_vector.data(), my_vector.size());
>> std::array<int, 4> my_std_array = {{4, 3, 2, 1}};
>> MyOldRoutine(my_std_array.data(), my_std_array.size());
>> int my_array[10] = {10, 9, 8, 7, 6, 5, 4, 3, 2, 1};
>> MyOldRoutine(my_array, 10);
>> int* dyn_array = new int[3];
>> for (size_t i = 0; i < 3; ++i) { dyn_array[i] = 3 - i; }
>> MyOldRoutine(dyn_array, 3);
>
>
> It's no surprise that this doesn't work well, because that's *not* the
> usual approach and hasn't been since the Standard Template Library. The
> correct approach is to pass a pair of pointers representing the range,
> which can be trivially extracted from iterators and is not error-prone at
> all.
>
The "correct approach" is in the eye of the beholder. STL may have created
a new paradigm, but the world did not adopt it as "standard" or "correct".
Some did, most didn't. And certainly no C APIs did. This generally makes it
much more difficult to interact with C-based APIs.
Iterators are a great abstraction of pointers, but when you have an API
that *takes pointers*, it's... pointless.
The purpose of the array_ref is that it is not merely an arbitrary range of
iterators. It is specifically a *pointer range*. It is an array: a
collection of elements of type T aggregated contiguously in memory. It also
has functions for trimming the range, which a more generic iterator range
would not have. It has these functions because it knows exactly what its
contents are.
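A minimal sketch of the idea, assuming nothing about the proposal's actual interface: the class name array_ref_sketch and the members remove_prefix and remove_suffix are hypothetical. It shows why trimming is cheap for a pointer range, since it only adjusts a pointer and a length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative non-owning view over contiguous elements of type T.
// The names here are hypothetical, not the proposal's wording.
template <class T>
class array_ref_sketch {
public:
    array_ref_sketch(const T* data, std::size_t size)
        : data_(data), size_(size) {}

    const T* data() const { return data_; }
    std::size_t size() const { return size_; }
    const T& operator[](std::size_t i) const { return data_[i]; }

    // Trimming is O(1) because the elements are known to be contiguous.
    void remove_prefix(std::size_t n) {
        assert(n <= size_);
        data_ += n;
        size_ -= n;
    }
    void remove_suffix(std::size_t n) {
        assert(n <= size_);
        size_ -= n;
    }

private:
    const T* data_;
    std::size_t size_;
};
```

Because data() and size() are always available, such a view can be handed directly to an API that takes a pointer and a length.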
>
> std::vector<int> my_vector;
> for (size_t i = 5; i > 0; --i) { my_vector.push_back(i); }
> MyOldRoutine(&*std::begin(my_vector), &*std::end(my_vector));
> std::array<int, 4> my_std_array = {{4, 3, 2, 1}};
> MyOldRoutine(&*std::begin(my_std_array), &*std::end(my_std_array));
> int my_array[10] = {10, 9, 8, 7, 6, 5, 4, 3, 2, 1};
> MyOldRoutine(std::begin(my_array), std::end(my_array));
> int* dyn_array = new int[3]; // This is hideously bad anyway- who still does this?
> for (size_t i = 0; i < 3; ++i) { dyn_array[i] = 3 - i; }
> MyOldRoutine(dyn_array, dyn_array + 3);
>
> Now, I can appreciate the need for ranges as singular objects as opposed
> to iterators, and I can also appreciate the need for pointers and arrays as
> opposed to a template-based abstraction when dealing with binary
> compatibility, but the only problem array_ref solves is ranges being better
> than iterators. If you feel strongly, perhaps you should consider adding to
> or creating a range proposal.
As someone stated in one of my threads, perfect is the enemy of good.
array_ref solves a legitimate problem with passing arrays to APIs that deal
in pointers. You may not like that such APIs exist, you may think that
they're bad design. That's irrelevant, because those APIs *do* exist,
people need to interact with them, and this helps them do that effectively.
array_ref is needed with or without the range proposal.
Ranges is a huge proposal, which includes a lot more than array_ref and
basic_string_ref. It includes range adapters, range algorithms, and all
kinds of other things. This is a simple proposal that solves a very
specific problem. Plus, it already interoperates well with Boost.Range, the
foundation of the current Range study-group, so there's no real need to
delay it.
> In addition, there is no rationale given for std::basic_string_ref. Why
> does this class exist? The only thing it offers over array_ref is the
> string operations that basic_string offers- the ones which, if they're even
> necessary at all, should be generic algorithms and are held up as a
> defining example of why std::basic_string violates many design principles-
> and a conversion to std::basic_string, which is redundant in the face of
> ranges. If you think functions with this functionality should exist, you
> should offer them as freestanding algorithms, not as members of
> basic_string_ref.
>
I understand the argument about basic_string's member functions, but the
thing you have to remember is that it is an *argument*, not an absolute
fact. It violates some design principles, yes. Whose? Yours perhaps, but
not necessarily someone else's.
There are some member functions that I believe basic_string shouldn't have
had. And there are some it should. Some of them
simply aren't useful outside the context of operating with strings. And as
much as some people don't like it, people *do* want to treat strings
differently from just "an array of things."
At the end of the day, basic_string has those functions. So
basic_string_ref should too (where appropriate). The purpose of the object
is to be able to take a constant reference to a basic_string without caring
about the *storage* of the data. This means that they should have access to
all of the const member functions of the string.
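A small sketch of that point, using C++17's std::string_view as a stand-in for basic_string_ref: one function, written against const string-like member functions, serves callers whose characters live in a std::string, a literal, or a raw buffer. The function extension_of is a hypothetical example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical example: returns the extension of a file name, or an
// empty view if there is none. It uses only const, string-like member
// functions (rfind, substr) and never copies the characters.
inline std::string_view extension_of(std::string_view name) {
    const std::size_t dot = name.rfind('.');
    if (dot == std::string_view::npos) return std::string_view();
    return name.substr(dot + 1);
}
```

The caller decides where the characters are stored; the function only needs them to stay alive for the duration of the call and for as long as the returned view is used.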
Like array_ref, it solves a real problem that C++ programmers face. It may
not solve it the way you want to, but there's a reason why it is frequently
used and why something like this was developed in multiple different places.
> A partial specialization of array_ref for const char and similar which can
> deal with null-terminated C strings in addition would be fine, at most,
> since the array constructor can already take string literals- assuming
> array_ref can justify its existence.
>
> And I might add, on a more personal note, that the chosen libraries in the
> rationale are a bad choice. Particularly, the Google C++ style guide is
> about as far away from Modern C++ as you can get, and LLVM is almost as
> bad. If you choose to stick to pre-C++98, then that's your choice, but also
> your problem, and there's no reason to Standardise components which solve
> problems already solved by the Standard because some libraries choose not
> to use the existing Standard facilities.
>
--
------=_Part_1104_14007735.1353778588622
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
<br><br>On Saturday, November 24, 2012 6:32:38 AM UTC-8, DeadMG wrote:<bloc=
kquote class=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8ex;border-l=
eft: 1px #ccc solid;padding-left: 1ex;">Well, array_ref<T> can just b=
e left as it is. As for string_ref, there's nothing inherently wrong with b=
asic_string_ref, but you would also need a unicode_string_ref. In addition,=
the numerical conversion functions proposed would need a complete refactor=
ing, as you cannot for example convert from a u16string to an integer.<div>=
<br></div><div>But more specifically, this seems to me to be just a desire =
for ranges, with a side order of not using the existing Standard facilities=
.. I mean, I love ranges and I think they're great, but I disagree with spec=
ial-casing only some Standard classes to support them. The rationale given =
in the proposal is bad, too.</div><div><br></div><blockquote class=3D"gmail=
_quote" style=3D"margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left=
-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span sty=
le=3D"color:rgb(0,0,0);font-family:'Times New Roman';font-size:medium">The =
usual approach here is to have the client explicitly pass in a pointer and =
a length, as in:<br></span><code style=3D"color:rgb(0,0,0)"><span>std::vect=
or<int> my_vector;<br></span></code><code style=3D"color:rgb(0,0=
,0)"><span>for</span><span> (</span><span>size_t</span><span> i&n=
bsp;=3D 5; i > 0; --i)<wbr> { my_vect=
or.push_back(i); }<br></span></code><code style=3D"color:rgb(0,0,0)"><=
span>MyOldRoutine(my_vector.data(),<wbr> my_vector.size());</span></co=
de><br style=3D"color:rgb(0,0,0);font-family:'Times New Roman';font-size:me=
dium"><code style=3D"color:rgb(0,0,0)"><span>std::array<int, 4>&=
nbsp;my_std_<wbr>array =3D {{4, 3, 2, 1}};<br></sp=
an></code><code style=3D"color:rgb(0,0,0)"><span>MyOldRoutine(my_std_array.=
<wbr>data(), my_std_array.size());</span></code><br style=3D"color:rgb=
(0,0,0);font-family:'Times New Roman';font-size:medium"><code style=3D"colo=
r:rgb(0,0,0)"><span>int</span><span> my_array[10] =3D {10,&n=
bsp;9, 8, <wbr>7, 6, 5, 4, 3, 2, 1}=
;<br></span></code><code style=3D"color:rgb(0,0,0)"><span>MyOldRoutine(my_a=
rray, 10);</span></code><br style=3D"color:rgb(0,0,0);font-family:'Tim=
es New Roman';font-size:medium"><code style=3D"color:rgb(0,0,0)"><span>int<=
/span><span>* dyn_array =3D </span><span>new</span><span>&nb=
sp;</span><span>int</span><span>[3];<br></span></code><code style=3D"color:=
rgb(0,0,0)"><span>for</span><span> (</span><span>size_t</span><span>&n=
bsp;i =3D 0; i < 3; ++i)<wbr> { =
dyn_array[i] =3D 3 - i; }<br></span></code><code s=
tyle=3D"color:rgb(0,0,0)"><span>MyOldRoutine(dyn_array, 3);</span></co=
de></blockquote><div><br></div><div>It's no surprise that this doesn't work=
well, because that's <i>not</i> the usual approach and hasn't be=
en since the Standard Template Library. The correct approach is to pass a p=
air of pointers representing the range, which can be trivially extracted fr=
om iterators and is not error-prone at all.</div></blockquote><div><br>The =
"correct approach" is in the eye of the beholder. STL may have created a ne=
w paradigm, but the world did not adopt it as "standard" or "correct". Some=
did, most didn't. And certainly no C APIs did. This generally makes it muc=
h more difficult to interact with C-based APIs.<br><br>Iterators are a grea=
t abstraction of pointers, but when you have an API that <i>takes pointers<=
/i>, it's... pointless.<br><br>The purpose of the array_ref is that it is n=
ot merely an arbitrary range of iterators. It is specifically a <i>pointer =
range</i>. It is an array: a collection of elements of type T aggregated co=
ntiguously in memory. It also has functions for trimming the range, which a=
more generic iterator range would not have. It has these functions because=
it knows exactly what it's contents are.<br> </div><blockquote class=
=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8ex;border-left: 1px #cc=
c solid;padding-left: 1ex;"><div><br></div><div><span style=3D"color:rgb(0,=
0,0)">std::vector<int> my_vector;</span></div><code style=3D"col=
or:rgb(0,0,0)"><span>for</span><span> (</span><span>size_t</span><span=
> i =3D 5; i > 0; --i)<wbr> {&nb=
sp;my_vector.push_back(i); }<br></span></code><code style=3D"color:rgb=
(0,0,0)"><span>MyOldRoutine(&*std::begin(my_<wbr>vector), &*std::en=
d(my_vector));</span></code><br style=3D"color:rgb(0,0,0);font-family:'Time=
s New Roman';font-size:medium"><code style=3D"color:rgb(0,0,0)"><span>std::=
array<int, 4> my_std_<wbr>array =3D {{4, 3,&=
nbsp;2, 1}};<br></span></code><code style=3D"color:rgb(0,0,0)"><span>M=
yOldRoutine(&*std::begin(my_<wbr>std_array), &*std::end(my_std=
_<wbr>array));</span></code><br style=3D"color:rgb(0,0,0);font-family:'Time=
s New Roman';font-size:medium"><code style=3D"color:rgb(0,0,0)"><span>int</=
span><span> my_array[10] =3D {10, 9, 8, <wbr>=
7, 6, 5, 4, 3, 2, 1};<br></span></code><code =
style=3D"color:rgb(0,0,0)"><span>MyOldRoutine(std::begin(my_<wbr>array), st=
d::end(my_array));</span></code><br style=3D"color:rgb(0,0,0);font-family:'=
Times New Roman';font-size:medium"><code style=3D"color:rgb(0,0,0)"><span>i=
nt</span><span>* dyn_array =3D </span><span>new</span><span>=
</span><span>int</span><span>[3]; // This is hideously bad anyway- wh=
o still does this?<br></span></code><code style=3D"color:rgb(0,0,0)"><span>=
for</span><span> (</span><span>size_t</span><span> i =3D&nbs=
p;0; i < 3; ++i)<wbr> { dyn_array[i] =
=3D 3 - i; }<br></span></code><code style=3D"color:rgb(=
0,0,0)"><span>MyOldRoutine(dyn_array, dyn_array + 3);</span></code><div><co=
de style=3D"color:rgb(0,0,0)"><span><br></span></code></div>Now, I can appr=
eciate the need for ranges as singular objects as opposed to iterators, and=
I can also appreciate the need for pointers and arrays as opposed to a tem=
plate-based abstraction when dealing with binary compatibility, but the onl=
y problem array_ref solves is ranges being better than iterators. If you fe=
el strongly, perhaps you should consider adding to or creating a range prop=
osal.</blockquote><div><br>As someone stated in one of my threads, perfect =
is the enemy of good. array_ref solves a legitimate problem with passing ar=
rays to APIs that deal in pointers. You may not like that such APIs exist, =
you may thing that they're bad design. That's irrelevant, because those API=
s <i>do</i> exist, people need to interact with them, and this helps them d=
o that effectively.<br><br>array_ref is needed with or without the range pr=
oposal.<br><br>Ranges is a huge proposal, which includes a lot more than ar=
ray_ref and basic_string_ref. It includes range adapters, range algorithms,=
and all kinds of other things. This is a simple proposal that solves a ver=
y specific problem. Plus, it already interoperates well with Boost.Range, t=
he foundation of the current Range study-group, so there's no real need to =
delay it.<br><br></div><blockquote class=3D"gmail_quote" style=3D"margin: 0=
;margin-left: 0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;"><div></=
div><div>In addition, there is no rationale given for std::basic_string_ref=
.. Why does this class exist? The only thing it offers over array_ref is the=
string operations that basic_string offers- the ones which, if they're eve=
n necessary at all, should be generic algorithms and are held up as a defin=
ing example of why std::basic_string violates many design principles- and a=
conversion to std::basic_string, which is redundant in the face of ranges.=
If you think functions with this functionality should exist, you should of=
fer them as freestanding algorithms, not as members of basic_string_ref.</d=
iv></blockquote><div><br>I understand the argument about basic_string's mem=
ber functions, but the thing you have to remember is that it is an <i>argum=
ent</i>, not an absolute fact. It violates some design principles, yes. Who=
's? Yours perhaps, but not necessarily someone else's.<br><br>There are som=
e member functions that basic_string that I believe basic_string shouldn't =
have had. And there are some it should. Some of them simply aren't useful o=
utside the context of operating with strings. And as much as some people do=
n't like it, people <i>do</i> want to treat strings specially from just "an=
array of things."<br><br>At the end of the day, basic_string has those fun=
ctions. So basic_string_ref should too (where appropriate). The purpose of =
the object is to be able to take a constant reference to a basic_string wit=
hout caring about the <i>storage</i> of the data. This means that they shou=
ld have access to all of the const member functions of the string.<br><br>L=
ike array_ref, it solves a real problem that C++ programmers face. It may n=
ot solve it the way you want to, but there's a reason why it is frequently =
used and why something like this was developed in multiple different places=
.<br><br></div><blockquote class=3D"gmail_quote" style=3D"margin: 0;margin-=
left: 0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;"><div></div><div=
>A partial specialization of array_ref for const char and similar which can=
deal with null-terminated C strings in addition would be fine, at most, si=
nce the array constructor can already take string literals- assuming array_=
ref can justify its existence.</div><div><br></div><div>And I might add, o=
n a more personal note, that the chosen libraries in the rationale are a ba=
d choice. Particularly, the Google C++ style guide is about as far away fro=
m Modern C++ as you can get, and LLVM is almost as bad. If you choose to st=
ick to pre-C++98, then that's your choice, but also your problem, and there=
's no reason to Standardise components which solve problems already solved =
by the Standard because some libraries choose not to use the existing Stand=
ard facilities.</div></blockquote>
------=_Part_1104_14007735.1353778588622--
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Sat, 24 Nov 2012 10:05:54 -0800 (PST)
------=_Part_468_24290305.1353780354898
Content-Type: text/plain; charset=ISO-8859-1
On Saturday, 24 November 2012 18:36:28 UTC+1, Nicol Bolas wrote:
>
> The purpose of the array_ref is that it is not merely an arbitrary range
> of iterators. It is specifically a *pointer range*. It is an array: a
> collection of elements of type T aggregated contiguously in memory. It also
> has functions for trimming the range, which a more generic iterator range
> would not have. It has these functions because it knows exactly what its
> contents are.
>
What functions would that be? boost::iterator_range supports pop_front()
and pop_back().
> array_ref is needed with or without the range proposal.
>
What'd be the difference between array_ref<char> and range<const char*>?
Wouldn't string_ref basically be range<const char*> with extra constructors
and member functions?
--
------=_Part_468_24290305.1353780354898--
.
Author: Nicol Bolas <jmckesson@gmail.com>
Date: Sat, 24 Nov 2012 10:16:42 -0800 (PST)
------=_Part_1287_32651499.1353781002089
Content-Type: text/plain; charset=ISO-8859-1
On Saturday, November 24, 2012 10:05:55 AM UTC-8, Olaf van der Spek wrote:
>
> On Saturday, 24 November 2012 18:36:28 UTC+1, Nicol Bolas wrote:
>
>>
>> The purpose of the array_ref is that it is not merely an arbitrary range
>> of iterators. It is specifically a *pointer range*. It is an array: a
>> collection of elements of type T aggregated contiguously in memory. It also
>> has functions for trimming the range, which a more generic iterator range
>> would not have. It has these functions because it knows exactly what its
>> contents are.
>>
>
> What functions would that be? boost::iterator_range supports pop_front()
> and pop_back().
>
Does it support direct conversion into a std::vector (though personally I'm
a bit ambivalent about that function, especially since it doesn't take an
allocator type)? What about remove_prefix/suffix? What about *size*; does
it have one of those?
That's my point: iterator_range is just a generic range of iterators.
array_ref is an *array*; it knows it's an array and can thus have useful
functions like `size`.
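
A minimal sketch of what such an array-aware type might look like, assuming a hypothetical, simplified array_ref. The member names follow the ones used in this thread; the class body is only an illustration and is not taken from the proposal's wording.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified array_ref: a pointer plus a length.
// It can offer size() and trimming because it knows it refers to an array.
template <typename T>
class array_ref {
public:
    array_ref(const T* data, std::size_t size) : data_(data), size_(size) {}

    const T* data() const { return data_; }
    std::size_t size() const { return size_; }
    const T& operator[](std::size_t i) const { return data_[i]; }

    // Drop the first n elements; the caller guarantees n <= size().
    void remove_prefix(std::size_t n) { data_ += n; size_ -= n; }

    // Drop the last n elements; the caller guarantees n <= size().
    void remove_suffix(std::size_t n) { size_ -= n; }

private:
    const T* data_;
    std::size_t size_;
};
```

Given a five-element array, calling remove_prefix(1) and then remove_suffix(2) on such an object would leave a view of the second and third elements.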
Also, it should be noted that array_ref and basic_string_ref both have very
real possibilities of being standard-layout. If we explicitly require a
particular set of members, then inter-communication across DLL boundaries
with different versions and such is possible. Even better, C APIs can be
designed to take them.
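
The standard-layout point can be checked at compile time. The sketch below assumes a hypothetical string_ref with exactly two data members and a hypothetical C-side struct c_string_view; neither name nor the to_c helper comes from the proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical C-side view of a string: what a C header might declare.
struct c_string_view {
    const char* data;
    std::size_t size;
};

// Hypothetical C++ class with the same two members, one access level,
// no virtual functions and no base classes, so it is standard-layout.
class string_ref {
public:
    string_ref(const char* d, std::size_t n) : data_(d), size_(n) {}
    const char* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    const char* data_;
    std::size_t size_;
};

static_assert(std::is_standard_layout<string_ref>::value,
              "string_ref should be standard-layout");
static_assert(std::is_standard_layout<c_string_view>::value,
              "c_string_view should be standard-layout");

// Repack the two fields explicitly when crossing into C code.
inline c_string_view to_c(const string_ref& s) {
    c_string_view v = {s.data(), s.size()};
    return v;
}
```

If a member were added, or the class gained a virtual function, the first static_assert would fail at compile time.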
>> array_ref is needed with or without the range proposal.
>
> What'd be the difference between array_ref<char> and range<const char*>?
> Wouldn't string_ref basically be range<const char*> with extra constructors
> and member functions?
>
No. basic_string_ref uses char_traits. And I'm not sure why "with extra
constructors and member functions" should be discounted. Ease of use and
consistency issues are not unimportant.
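
To illustrate why char_traits matters here, the following is a sketch under stated assumptions: ci_char_traits is a hypothetical case-insensitive traits class, and basic_string_ref is reduced to a constructor and an equals member that is not part of the proposal. A plain range<const char*> would compare elements with operator== and could not be customised this way.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical traits class: compares characters without regard to case.
struct ci_char_traits : std::char_traits<char> {
    static bool eq(char a, char b) { return lower(a) == lower(b); }
    static bool lt(char a, char b) { return lower(a) < lower(b); }
    static int compare(const char* a, const char* b, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(a[i], b[i])) return -1;
            if (lt(b[i], a[i])) return 1;
        }
        return 0;
    }

private:
    static char lower(char c) {
        return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
};

// Simplified stand-in for basic_string_ref; only what the example needs.
template <typename CharT, typename Traits = std::char_traits<CharT> >
class basic_string_ref {
public:
    basic_string_ref(const CharT* s) : data_(s), size_(Traits::length(s)) {}

    std::size_t size() const { return size_; }

    // Equality is decided by Traits, not by operator== on CharT.
    bool equals(basic_string_ref other) const {
        return size_ == other.size_ &&
               Traits::compare(data_, other.data_, size_) == 0;
    }

private:
    const CharT* data_;
    std::size_t size_;
};
```

With the default traits, "Hello" and "hello" compare unequal; with ci_char_traits they compare equal.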
--
------=_Part_1287_32651499.1353781002089--
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Sat, 24 Nov 2012 10:42:03 -0800 (PST)
------=_Part_236_31585732.1353782523525
Content-Type: text/plain; charset=ISO-8859-1
On Saturday, 24 November 2012 19:16:42 UTC+1, Nicol Bolas wrote:
>> What functions would that be? boost::iterator_range supports pop_front()
>> and pop_back().
>
> Does it support direct conversion into a std::vector (though personally
> I'm a bit ambivalent about that function. Especially since it doesn't take
> an allocator type)?
>
Eh, what? Don't think it does, should it?
> What about remove_prefix/suffix?
>
Yes, via advance_begin and advance_end.
> What about *size*; does it have one of those?
>
Yes, it does. It has operator[] too, though no data() (yet).
> That's my point: iterator_range is just a generic range of iterators.
> array_ref is an *array*; it knows it's an array and can thus have useful
> functions like `size`.
>
> Also, it should be noted that array_ref and basic_string_ref both have
> very real possibilities of being standard-layout. If we explicitly require
> a particular set of members, then inter-communication across DLL
> boundaries with different versions and such is possible. Even better, C
> APIs can be designed to take them.
>
Standard-layout would be great, although the current proposal doesn't
require it.
But range for pointers could be standard layout too.
>
>>> array_ref is needed with or without the range proposal.
>>
>> What'd be the difference between array_ref<char> and range<const char*>?
>> Wouldn't string_ref basically be range<const char*> with extra
>> constructors and member functions?
>
> No. basic_string_ref uses char_traits. And I'm not sure why "with extra
> constructors and member functions" should be discounted. Ease of use and
> consistency issues are not unimportant.
>
I'm not discounting them and I'm not saying they're not important.
--
------=_Part_236_31585732.1353782523525--
.
Author: Nicol Bolas <jmckesson@gmail.com>
Date: Sat, 24 Nov 2012 10:54:04 -0800 (PST)
------=_Part_457_22343780.1353783244565
Content-Type: text/plain; charset=ISO-8859-1
On Saturday, November 24, 2012 10:42:03 AM UTC-8, Olaf van der Spek wrote:
>
> On Saturday, 24 November 2012 19:16:42 UTC+1, Nicol Bolas wrote:
>
>> What functions would that be? boost::iterator_range supports pop_front()
>> and pop_back().
>>
>> Does it support direct conversion into a std::vector (though personally
>> I'm a bit ambivalent about that function. Especially since it doesn't take
>> an allocator type)?
>>
>
> Eh, what? Don't think it does, should it?
>
Well, as I said, I'm not exactly convinced of the need for `operator
vector<T>`. But if array_ref is going to have one, then it needs to be a
`template<typename allocator> operator vector<T, allocator>` thing.
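
A sketch of what such an allocator-aware conversion could look like, assuming a hypothetical, simplified array_ref. simple_allocator is a made-up minimal allocator used only to show a non-default allocator type.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical, simplified array_ref; only what the conversion needs.
template <typename T>
class array_ref {
public:
    array_ref(const T* data, std::size_t size) : data_(data), size_(size) {}

    // Converts to a vector with any allocator type. The allocator is
    // default-constructed, since a conversion operator takes no arguments.
    template <typename Allocator>
    operator std::vector<T, Allocator>() const {
        return std::vector<T, Allocator>(data_, data_ + size_);
    }

private:
    const T* data_;
    std::size_t size_;
};

// Minimal allocator, present only to exercise a non-default allocator.
template <typename T>
struct simple_allocator {
    typedef T value_type;

    simple_allocator() {}
    template <typename U>
    simple_allocator(const simple_allocator<U>&) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const simple_allocator<T>&, const simple_allocator<U>&) {
    return true;
}

template <typename T, typename U>
bool operator!=(const simple_allocator<T>&, const simple_allocator<U>&) {
    return false;
}
```

With this, both `std::vector<int> a = ref;` and `std::vector<int, simple_allocator<int> > b = ref;` compile, because the allocator type is deduced from the destination. The limitation mentioned above remains: there is no way to pass an allocator instance through the conversion.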
>
> What about remove_prefix/suffix?
>>
>
> Yes, via advance_begin and advance_end.
>
>
>> What about *size*; does it have one of those?
>>
>
> Yes, it does. It has operator[] too, though no data() (yet).
>
>
>> That's my point: iterator_range is just a generic range of iterators.
>> array_ref is an *array*; it knows it's an array and can thus have useful
>> functions like `size`.
>>
>> Also, it should be noted that array_ref and basic_string_ref both have
>> very real possibilities of being standard-layout. If we explicitly require
>> a particular set of members, then inter-communication across DLL
>> boundaries with different versions and such is possible. Even better, C
>> APIs can be designed to take them.
>>
>
> Standard-layout would be great, although the current proposal doesn't
> require it.
> But range for pointers could be standard layout too.
>
There's also the fact that array_ref is specifically an array. It's not
generic; that's the point.
>
>
>>
>>
>>>> array_ref is needed with or without the range proposal.
>>>
>>> What'd be the difference between array_ref<char> and range<const char*>?
>>> Wouldn't string_ref basically be range<const char*> with extra
>>> constructors and member functions?
>>
>> No. basic_string_ref uses char_traits. And I'm not sure why "with extra
>> constructors and member functions" should be discounted. Ease of use and
>> consistency issues are not unimportant.
>>
>
> I'm not discounting them and I'm not saying they're not important.
>
--
------=_Part_457_22343780.1353783244565--
.
Author: Jeffrey Yasskin <jyasskin@googlers.com>
Date: Sat, 24 Nov 2012 11:07:14 -0800
On Sat, Nov 24, 2012 at 2:29 AM, Nicol Bolas <jmckesson@gmail.com> wrote:
>
>
> On Friday, November 23, 2012 11:29:41 PM UTC-8, Nevin ":-)" Liber wrote:
>>
>> On 23 November 2012 23:23, Nicol Bolas <jmck...@gmail.com> wrote:
>>>
>>>
>>> That's my point: the LWG has this loop of:
>>>
>>> 1: Accept proposals 1 and 2.
>>
>>
>> That is one possibility. There are, of course, three others.
>>
>>>
>>> 2: Discover that proposals 1 and 2 should interact in some way. Someone
>>> writes a paper detailing those interactions.
>>
>>
>> And if only one of the proposals passes? How is it any less time to
>> untangle the two? And if you decide to combine what would have been one
>> accepted and one rejected proposal into one big one, it is more likely than
>> not for the big proposal to get rejected.
>>
>>>
>>> 3: Accept proposal 3. Discuss and potentially accept integration of 1 and
>>> 2.
>>> 4: Discover that proposal 3 now needs interactions with 1 & 2. Get
>>> someone to write a paper on that.
>>> 5: Accept proposal 4. Discuss and potentially accept integration of 1, 2,
>>> and 3.
>>> 6: Discover that proposal 4 has interactions etc...
>>
>>
>> There are now 16 possibilities.
>>
>>>
>>> Steps 1, 3, and 5 can only happen at actual meetings of the committee,
>>> which only happen twice a year. And this doesn't include times when a
>>> proposal is sent back for improvements. So you're looking at a year and a
>>> half minimum to get this done.
>>
>>
>> In all but the "everything is accepted" case, how does your proposed
>> method of working reduce the time? Volunteers are not going to do 16x the
>> amount of work between meetings.
>
>
> I think you've misunderstood what I was getting at. Those were steps along
> the standardization process (each representing about 3 months of real-time).
> They're not individual scenarios.
>
> First, the standards committee accepts a pair of library proposals. Then,
> somebody realizes that these proposals should interact. So someone writes up
> a proposal to integrate them together. 6 months later, the LWG meets again
> and approves the integration proposal. But they also approve proposal 3
> independently of that. Again, someone realizes that proposal 3 needs to be
> integrated with the rest of the stuff already there, so someone now has to
> write another proposal. During that time, proposal 4 comes along to start
> the process over. And over. And over again.
>
> In the end, you eventually have to ship a standard. So you ship without
> integration, or worse, with poorly-done integration. And in C++, there are
> no take-backs; once you ship, it's in there forever. You might later ship a
> different proposal, but the mistake is still there.
>
>>>
>>> It would make more sense if there were someone specifically looking at
>>> all of these proposals with an eye towards integration and making certain
>>> that such integration were a priority during the development phase of the
>>> proposals.
>>>
>>>
>>> Once it becomes standard, major changes are not going to be accepted.
>>> Once Filesystem goes live, you're not going to be able to, for example,
>>> state that the Filesystem's path objects should use a Unicode string with
>>> some Unicode encoding (and thus translates to whatever each platform
>>> prefers). If it was defined to use a basic_string (or "arbitrary array of
>>> some type"), then that's what it will use in perpetuity.
>>
>>
>> Keeping with the example, how long should Filesystem wait for a Unicode
>> string library to be accepted? A year? Two years? C++17? What if we
>> don't get one by C++17? Is that really fair to the folks who proposed
>> Filesystem? Is it fair to the folks who want to have a standard Filesystem
>> library that they can use? Shouldn't Unicode also wait to make sure it
>> meets all the needs of Filesystem? How long should Unicode wait for
>> Filesystem? Etc., etc.
>>
>> Integration is painful. Unfortunately, the solutions to minimize the
>> integration pain are more painful.
>
>
> So, having a study group that exists to watch proposals in their formative
> stages and figure out ways for proposed library additions to
> intercommunicate is "painful"? The point is to have formal proposals that
> already have the integration stuff in them, before they get formally
> proposed to LWG. That way, we don't get the current situation of where we
> accept all of these proposals in a vacuum, and hope that somebody comes
> along and writes a proposal to actually make them work well together.
>
> As the standards committee does more frequent releases, that's going to fail
> more and more often. It would be nice if there were some mechanism in place
> that would at least attempt to resolve these kinds of issues, so that we
> don't get nonsense like being unable to create Filesystem paths with Unicode
> strings or vice-versa or somesuch. Assuming that someone, somewhere will
> write an oversight proposal... well, again, I point to the inability to
> create a simple ifstream with a u16string.
Please write that simple proposal about creating an ifstream from a
u16string before pointing to it as an example of a mistake. Having
seen the LWG in action and understanding a bit of the complexity
around filesystem encodings and Unicode (but not as much as I'd need
to write that proposal), I believe there were good reasons to
ignore it in the current standard. An actual proposal from you could
convince me otherwise, but simple assertions on a mailing list have no
chance.
Jeffrey
--
.
Author: DeadMG <wolfeinstein@gmail.com>
Date: Sat, 24 Nov 2012 11:11:34 -0800 (PST)
------=_Part_400_15130213.1353784294796
Content-Type: text/plain; charset=ISO-8859-1
>
> the world did not adopt it as "standard" or "correct".
No, but the C++ Standard clearly did, and that's what we're dealing with
here.
> And certainly no C APIs did. This generally makes it much more difficult
> to interact with C-based APIs.
I disagree. Not only is it clearly not a C API, else it couldn't take an
array_ref<T> from the C++ Standard, but it's quite trivial to go from pair
of pointers to pointer plus size.
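
As a rough illustration of that conversion, assuming a hypothetical C-style function sum_ints that takes a pointer and a count, and a hypothetical adapter sum_range:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Stand-in for a C-style API: pointer plus element count.
long sum_ints(const int* data, std::size_t count) {
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}

// Adapter: the half-open range [first, second) becomes pointer plus size.
long sum_range(std::pair<const int*, const int*> range) {
    return sum_ints(range.first,
                    static_cast<std::size_t>(range.second - range.first));
}
```

The only work in the adapter is one pointer subtraction.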
> The purpose of the array_ref is that it is not merely an arbitrary range of
> iterators. It is specifically a *pointer range*. It is an array: a
> collection of elements of type T aggregated contiguously in memory. It also
> has functions for trimming the range, which a more generic iterator range
> would not have. It has these functions because it knows exactly what its
> contents are.
You can simply subtract from the iterator- any random-access pair of
iterators, including the pair of pointers model, includes such
functionality.
> Ranges is a huge proposal, which includes a lot more than array_ref and
> basic_string_ref. It includes range adapters, range algorithms, and all
> kinds of other things.
And those things are necessary to make ranges a significant improvement
over pair<T*, T*>. array_ref<T> offers nothing over a pair of pointers. All
of its operations are trivially equivalent to some operation on that pair
of pointers. It doesn't actually introduce any new functionality, except
possibly a very slightly simpler calling syntax- and even that isn't much
when you could write a quick helper function to create the pairs.
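
A sketch of the helper-function argument, assuming a hypothetical make_ptr_range that builds the pair from a built-in array. Trimming is then ordinary pointer arithmetic on the two members.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper: build a [begin, end) pointer pair from an array.
template <typename T, std::size_t N>
std::pair<const T*, const T*> make_ptr_range(const T (&arr)[N]) {
    return std::pair<const T*, const T*>(arr, arr + N);
}

// Number of elements in the pair, equivalent to a size() member.
template <typename T>
std::size_t range_size(const std::pair<const T*, const T*>& r) {
    return static_cast<std::size_t>(r.second - r.first);
}
```

Dropping one element from the front is `r.first += 1;` and dropping two from the back is `r.second -= 2;`. As with any trimming function, the caller is responsible for staying within bounds.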
> I understand the argument about basic_string's member functions, but the
> thing you have to remember is that it is an *argument*, not an absolute
> fact. It violates some design principles, yes. Whose? Yours perhaps, but
> not necessarily someone else's.
But definitely the ones which are apparent in the design of the rest of the
Standard. There's no reason whatsoever that those functions cannot be
offered as freestanding to operate on any sequence of any type that meets
some simple requirements. Why would you introduce an interface that can do
the same thing but in a far less generic fashion?
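
For example, a prefix test could be written once as a free function template. begins_with below is a hypothetical name and a sketch of the idea, not something taken from the proposal or from the Standard.

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical free function: true if seq begins with prefix.
// Works for any two sequences whose elements compare with ==.
template <typename Seq, typename Prefix>
bool begins_with(const Seq& seq, const Prefix& prefix) {
    auto s = std::begin(seq);
    auto s_end = std::end(seq);
    for (auto p = std::begin(prefix); p != std::end(prefix); ++p, ++s) {
        if (s == s_end || !(*s == *p)) {
            return false;
        }
    }
    return true;
}
```

The same function serves std::string, std::vector<int> and any other sequence, which is the generality a member function on one class cannot offer.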
> And as much as some people don't like it, people *do* want to treat
> strings specially from just "an array of things."
And I'm one of them. But strings don't support *more* operations than
"array of things". There is not one operation on a std::basic_string which
is specific to strings. They support *fewer* operations than "array of
things", depending on the encoding.
The only reason basic_string still offers these functions is compatibility.
basic_string_ref is a new class and does not have compatibility. Therefore,
there's no reason to have them.
But finally, perhaps this is just my failing, but I didn't see you argue for
any reason that basic_string_ref should exist, over array_ref<char>.
--
------=_Part_400_15130213.1353784294796--
.
Author: Jeffrey Yasskin <jyasskin@googlers.com>
Date: Sat, 24 Nov 2012 11:13:18 -0800
Raw View
On Sat, Nov 24, 2012 at 6:32 AM, DeadMG <wolfeinstein@gmail.com> wrote:
> Well, array_ref<T> can just be left as it is.
You're reading an old version of the paper. array_ref<T> didn't
attract enough interest from the LWG, so it's on hold, and will likely
be replaced by the Ranges study group. string_ref did attract enough
interest, so, as I mentioned earlier in this thread, I updated its
paper to http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2012/n3442.html
and https://github.com/google/cxx-std-draft/compare/master...string-ref.
--
.
Author: Jeffrey Yasskin <jyasskin@googlers.com>
Date: Sat, 24 Nov 2012 11:28:53 -0800
Raw View
On Sat, Nov 24, 2012 at 6:10 AM, Olaf van der Spek <olafvdspek@gmail.com> wrote:
> Op vrijdag 23 november 2012 19:05:40 UTC+1 schreef Jeffrey Yasskin het
> volgende:
>
>> >> stoi(const string_ref & str, size_t * idx=0, int base=10);
>> >
>> > Wouldn't it be better to pass string_ref by value? It's only two
>> > pointers.
>>
>> (or a pointer+a length, at the implementer's discretion. ;)
>
>
> I was thinking about ABI and interoperability issues.
I was mostly joking about the big sub-thread you started along those
lines on the boost list. ;) In any case, the C++ standard doesn't
generally address ABI issues. The only exception I can think of is
complex<>. I think that at most the standard should say that
basic_string_ref is standard-layout, and let implementations figure
out the exact ABI.
>> > Why is stoui missing?
>>
>> stoui is "missing" because it's not in the current standard.
>> string_ref isn't intended to improve anything about numeric conversions,
>> so I'm not changing anything there.
>
>
> Understandable.
> I did notice this:
>> Returns:
> stox(string(str), idx, base) where x is the type suffix of the function
> called.
> I hope that's just semantics. std::string construction should be avoided.
Absolutely. The current wording is at
https://github.com/google/cxx-std-draft/compare/master...string-ref#L1R5253,
which explicitly says that it's just as-if it called sto<whatever> on
a temporary string.
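To illustrate what "as-if" permits, here is a minimal sketch of a conversion that reads the viewed characters directly and never builds a std::string. It is not the wording from the paper: std::string_view (C++17) stands in for the proposed string_ref, view_to_int is a hypothetical name, and std::from_chars is stricter than std::stoi (it does not skip leading whitespace or accept a leading '+').

```cpp
#include <cassert>
#include <charconv>      // std::from_chars (C++17)
#include <cstddef>
#include <stdexcept>
#include <string_view>   // stand-in for the proposed string_ref
#include <system_error>

// Hypothetical helper, not part of N3442: parses an int straight from the
// viewed characters, so no temporary std::string is constructed.
// Assumption: callers accept from_chars' stricter grammar (no leading
// whitespace, no '+'), which differs from std::stoi.
inline int view_to_int(std::string_view str, std::size_t* idx = nullptr,
                       int base = 10) {
    int value = 0;
    const char* first = str.data();
    const char* last = first + str.size();
    const std::from_chars_result r = std::from_chars(first, last, value, base);
    if (r.ec == std::errc::invalid_argument)
        throw std::invalid_argument("view_to_int: no conversion");
    if (r.ec == std::errc::result_out_of_range)
        throw std::out_of_range("view_to_int: out of range");
    // Report how many characters were consumed, as stoi does through idx.
    if (idx) *idx = static_cast<std::size_t>(r.ptr - first);
    return value;
}
```

The view is taken by value here, which matches the suggestion earlier in the thread that a two-word type is cheap to copy.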
>> > explicit operator bool() const is missing.
>>
>> string_ref mimics (a subset of) std::string's interface. Since no
>> existing container has an operator bool(), string_ref doesn't. If you
>> want to change that, write a separate proposal.
>
>
> Will do
Excellent.
> Was providing some of the functions as non-member functions discussed?
Yes. There's interest in getting an algorithms library that works on
strings, but we got general agreement that the interface in the paper
(all of the query methods, but without 'pos' and 'n' arguments) was a
decent tradeoff between cleaning up the interface and making it easy
to migrate between std::string and std::string_ref as code evolves. In
Portland, the LWG asked for const char* overloads on some methods
since they can fail without traversing the whole argument string,
making strlen() a possible waste of time, so that's what's in the
current draft proposal.
> What happened to construction from contiguous containers (like array and
> vector)?
There's no way in the current standard to identify contiguous ranges,
so string_ref can't be constructed from them, except by explicitly
passing in .data() and .size(). In theory, it would be possible to
give them outgoing conversion operators when their element type is
char-like. Do you think that happens enough to extend their interface?
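A minimal sketch of the explicit .data()/.size() route, for readers who want to see it spelled out. std::string_view (C++17) stands in for the proposed string_ref, and make_view is a hypothetical helper name, not something from the paper. It assumes the container really is contiguous, which the language cannot check for us.

```cpp
#include <array>
#include <string_view>   // stand-in for the proposed string_ref
#include <vector>

// Hypothetical helper: builds a view over any container that exposes
// data() and size() with a char element type. Nothing is copied, so the
// view is only valid while the container is alive and unmodified.
template <class Container>
std::string_view make_view(const Container& c) {
    return std::string_view(c.data(), c.size());
}
```

With this, make_view(v) works for a std::vector<char> or std::array<char, N> without adding conversion operators to either container.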
> What about the relation with array_ref and range?
That's mostly up to the range proposals. Certainly string_ref is a
range, so it interacts well with N3456.
Thanks for keeping the discussion on string_ref.
Jeffrey
--
.
Author: Nicol Bolas <jmckesson@gmail.com>
Date: Sat, 24 Nov 2012 11:32:44 -0800 (PST)
Raw View
------=_Part_109_2390594.1353785565006
Content-Type: text/plain; charset=ISO-8859-1
On Saturday, November 24, 2012 11:11:35 AM UTC-8, DeadMG wrote:
>
> the world did not adopt it as "standard" or "correct".
>
>
> No, but the C++ Standard clearly did, and that's what we're dealing with
> here.
>
>
>> And certainly no C APIs did. This generally makes it much more difficult
>> to interact with C-based APIs.
>
>
> I disagree. Not only is it clearly not a C API, else it couldn't take an
> array_ref<T> from the C++ Standard, but it's quite trivial to go from pair
> of pointers to pointer plus size.
>
> The purpose of the array_ref is that it is not merely an arbitrary range
>> of iterators. It is specifically a *pointer range*. It is an array: a
>> collection of elements of type T aggregated contiguously in memory. It also
>> has functions for trimming the range, which a more generic iterator range
>> would not have. It has these functions because it knows exactly what its
>> contents are.
>
>
> You can simply subtract from the iterator- any random-access pair of
> iterators, including the pair of pointers model, includes such
> functionality.
>
> Ranges is a huge proposal, which includes a lot more than array_ref and
>> basic_string_ref. It includes range adapters, range algorithms, and all
>> kinds of other things.
>
>
> And those things are necessary to make ranges a significant improvement
> over pair<T*, T*>. array_ref<T> offers nothing over a pair of pointers.
>
You can do this:
array_ref<T> arr = ...;
arr[4] = ...
arr.at(4) = ...
You can't do that with a pair of pointers. At least, not without
difficulty. Nor can you do that with an iterator range.
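For concreteness, a minimal sketch of the kind of class this refers to. It is an assumption-laden reduction, not the array_ref from the paper: only the constructor, size(), operator[] and at() are shown, and the element type is left mutable so that the assignments above compile.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Minimal sketch of a pointer-plus-size view. The member names follow the
// usual container conventions; the real proposal has a larger interface.
template <class T>
class array_ref {
public:
    array_ref(T* data, std::size_t size) : data_(data), size_(size) {}

    std::size_t size() const { return size_; }

    // Unchecked access; the assert documents the precondition in debug builds.
    T& operator[](std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

    // Checked access, throwing like the standard containers' at().
    T& at(std::size_t i) const {
        if (i >= size_) throw std::out_of_range("array_ref::at");
        return data_[i];
    }

private:
    T* data_;
    std::size_t size_;
};
```

The bounds check in at() is possible because the size travels with the pointer, which is the part a bare pointer does not give you.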
> All of its operations are trivially equivalent to some operation on that
> pair of pointers. It doesn't actually introduce any new functionality,
> except possibly a very slightly simpler calling syntax- and even that isn't
> much when you could write a quick helper function to create the pairs.
>
>
>> I understand the argument about basic_string's member functions, but the
>> thing you have to remember is that it is an *argument*, not an absolute
>> fact. It violates some design principles, yes. Whose? Yours perhaps, but
>> not necessarily someone else's.
>
>
> But definitely the ones which are apparent in the design of the rest of
> the Standard. There's no reason whatsoever that those functions cannot be
> offered as freestanding to operate on any sequence of any type that meets
> some simple requirements. Why would you introduce an interface that can do
> the same thing but in a far less generic fashion?
>
What "rest of the Standard?" basic_string is no more an outlier than
iostreams, locale, or any of the other various systems.
The Standard Library does not *have* a "design"; don't make the mistake of
thinking of STL as the Standard Library. Each of the major parts of the
standard library was designed independently and then some effort was put
into integrating them into a whole. If there had been more time, I'm sure
that basic_string would have lost some of its member functions. But it
didn't.
There is no overarching design of the standard library. There is one for
the components derived from the STL, but that's not everything in the
standard library.
>
> And as much as some people don't like it, people *do* want to treat
>> strings specially from just "an array of things."
>
>
> And I'm one of them. But strings don't support *more* operations than
> "array of things". There is not one operation on a std::basic_string which
> is specific to strings. They support *fewer* operations than "array of
> things", depending on the encoding.
>
> The only reason basic_string still offers these functions is
> compatibility. basic_string_ref is a new class and does not have
> compatibility.
>
But it *does* have compatibility. We want people to be able to go to their
function signatures and swap `const std::string &` with `std::string_ref`
without much trouble.
> Therefore, there's no reason to have them.
>
> But finally, perhaps this is just my fail, but I didn't see you argue for
> any reason that basic_string_ref should exist, over array_ref<char>.
>
--
------=_Part_109_2390594.1353785565006--
.
Author: rick@longbowgames.com
Date: Sat, 24 Nov 2012 12:04:25 -0800 (PST)
Raw View
------=_Part_1889_13408803.1353787465220
Content-Type: text/plain; charset=ISO-8859-1
On Saturday, November 24, 2012 2:32:45 PM UTC-5, Nicol Bolas wrote:
>
> But it *does* have compatibility. We want people to be able to go to
> their function signatures and swap `const std::string &` with
> `std::string_ref` without much trouble.
>
It's worth noting that most of those functions were written against a
string class that supports random access, and so take indices instead of
iterators. If you want to support anything as wild as, oh, UTF-8, you'll
need to rewrite the prototypes of those functions anyway.
If you take something like:
size_type basic_string::find(const basic_string& needle, size_type pos)
You'll have to replace it with something like:
string_ref::iterator string_ref::find(const string_ref& needle,
string_ref::iterator pos)
Since you're breaking the prototype anyway, you may as well take this
opportunity to make it more general and allow finding a subsequence in any
container type:
template <class T>
typename T::const_iterator find(const T& haystack, const T& needle,
typename T::const_iterator pos)
(Actually, you'll want to return a range, but that's a different argument.)
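A minimal sketch of how such a free function could be written on top of std::search. The name find_sub is hypothetical; it is used instead of find because an unqualified call with standard containers would also consider std::find through argument-dependent lookup. The contains() wrapper is likewise only an illustration.

```cpp
#include <algorithm>   // std::search
#include <string>
#include <vector>

// Hypothetical generic version: searches haystack for needle starting at pos
// and returns an iterator to the first match, or haystack.end() if none.
// pos is assumed to be a valid iterator into haystack.
template <class T>
typename T::const_iterator find_sub(const T& haystack, const T& needle,
                                    typename T::const_iterator pos) {
    return std::search(pos, haystack.end(), needle.begin(), needle.end());
}

// Usage sketch: the same template serves std::string, std::vector<int>,
// or any other container with forward iterators.
inline bool contains(const std::string& haystack, const std::string& needle) {
    return find_sub(haystack, needle, haystack.begin()) != haystack.end();
}
```

Because the result is an iterator and not an index, nothing here relies on random access into the haystack.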
--
------=_Part_1889_13408803.1353787465220--
.
Author: DeadMG <wolfeinstein@gmail.com>
Date: Sat, 24 Nov 2012 12:20:21 -0800 (PST)
Raw View
------=_Part_493_607989.1353788421598
Content-Type: text/plain; charset=ISO-8859-1
>
> You can't do that with a pair of pointers. At least, not without
> difficulty. Nor can you do that with an iterator range.
For a pair of pointers, you could write a trivial helper that returns
pair<T*, T*> for easy creation. Then
arr[4] = ...;
becomes
pair.first[4] = ...;
As for at(), you're right in that native random-access iterators don't
offer that, but again, a very brief helper here can certainly do that.
template<typename Iterator> auto at(std::pair<Iterator, Iterator> range,
std::size_t where) -> decltype(range.first[where]) {
    if (where >= static_cast<std::size_t>(range.second - range.first))
        throw std::out_of_range(...);
    return range.first[where];
}
The majority of operations on array_ref have direct equivalents in pointer
arithmetic, which is also valid on RA iterators. Only a couple of the more
complex ops like construction and at() require trivial helpers.
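A compilable sketch of the two helpers described here, under a few assumptions: the exception type is std::out_of_range, the message text is invented, and make_range is a hypothetical name for the "easy creation" helper, shown only for built-in arrays.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>

// Bounds-checked access on a pair of random-access iterators.
template <typename Iterator>
auto at(std::pair<Iterator, Iterator> range, std::size_t where)
    -> decltype(range.first[where]) {
    // second - first is a signed difference_type; cast before comparing.
    if (where >= static_cast<std::size_t>(range.second - range.first))
        throw std::out_of_range("at: index past end of range");
    return range.first[where];
}

// Hypothetical creation helper: turns a built-in array into a pointer pair.
template <class T, std::size_t N>
std::pair<T*, T*> make_range(T (&arr)[N]) {
    return std::pair<T*, T*>(arr, arr + N);
}
```

Unchecked access needs no helper at all, since range.first[i] already works on the pair.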
> What "rest of the Standard?" basic_string is no more an outlier than
> iostreams, locale, or any of the other various systems.
Both of which are also due for an overhaul. However, I'd suggest that "A
function that is exactly the same in every respect except more generic is
superior to another function" is a generally well-accepted design concept.
> But it *does* have compatibility. We want people to be able to go to their
> function signatures and swap `const std::string &` with `std::string_ref`
> without much trouble.
Why would people want to do that? There's no incentive for string_ref. They
can already gain practically every useful function from std::string_ref
with a pair of iterators, or a pair of char*.
And, to put it another way, the string interface - both the Standard one
and the ones written by users - already requires a fairly complete revamp
to support Unicode, and find_first_not_of is certainly not on the list of
functions that will survive the trip- it doesn't really make sense in a
Unicode world. Not having it is going to be the *least* of the concerns of
anyone dealing with strings.
Y'know, really, I guess it's just a bikeshed issue. You want a class, I
think it's fine with existing functionality and maybe a helper or two, at
most. But ultimately, that's bikeshedding. It's much more important to deal
with Unicode than to discuss whether or not string_ref deserves to be a
class on its own.
--
------=_Part_493_607989.1353788421598--
.
Author: Nicol Bolas <jmckesson@gmail.com>
Date: Sat, 24 Nov 2012 13:17:03 -0800 (PST)
Raw View
------=_Part_634_502553.1353791823118
Content-Type: text/plain; charset=ISO-8859-1
On Saturday, November 24, 2012 12:20:21 PM UTC-8, DeadMG wrote:
>
> You can't do that with a pair of pointers. At least, not without
>> difficulty. Nor can you do that with an iterator range.
>
>
> For a pair of pointers, you could write a trivial helper that returns
> pair<T*, T*> for easy creation. Then
>
> arr[4] = ...;
> pair.first[4] = ...;
>
> As for at(), you're right in that native random-access iterators don't
> offer that, but again, a very brief helper here can certainly do that.
>
> template<typename Iterator> auto at(std::pair<Iterator, Iterator> range,
> std::size_t where) -> decltype(range.first[where]) {
>     if (where >= static_cast<std::size_t>(range.second - range.first))
>         throw std::out_of_range(...);
>     return range.first[where];
> }
>
> The majority of operations on array_ref have direct equivalents in
> pointer arithmetic, which is also valid on RA iterators. Only a couple of
> the more complex ops like construction and at() require trivial helpers.
>
> What "rest of the Standard?" basic_string is no more an outlier than
>> iostreams, locale, or any of the other various systems.
>
>
> Both of which are also due for an overhaul. However, I'd suggest that "A
> function that is exactly the same in every respect except more generic is
> superior to another function" is a generally well-accepted design concept.
>
> But it *does* have compatibility. We want people to be able to go to
>> their function signatures and swap `const std::string &` with
>> `std::string_ref` without much trouble.
>
>
> Why would people want to do that? There's no incentive for string_ref.
> They can already gain practically every useful function from
> std::string_ref with a pair of iterators, or a pair of char*.
>
> And, to put it another way, the string interface - both the Standard one
> and the ones written by users - already requires a fairly complete revamp
> to support Unicode, and find_first_not_of is certainly not on the list of
> functions that will survive the trip- it doesn't really make sense in a
> Unicode world. Not having it is going to be the *least* of the concerns
> of anyone dealing with strings.
>
You're conflating two very different issues: Unicode support and string
interoperation.
basic_string_ref is for the latter. That's the only thing it does: it
allows strings from different sources (std::basic_string, MFC's CString,
C-standard string literals, etc) to interoperate with a single, coherent
interface.
If I write a function that takes a basic_string, you must copy your string
into it. If I write a function that takes a basic_string_ref, no copy is
needed, but I still get to use this single, coherent interface.
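A small sketch of that point. std::string_view (C++17) stands in for basic_string_ref, and count_spaces is a made-up function; the only claim is that one signature accepts several string sources without copying their characters.

```cpp
#include <cstddef>
#include <string>
#include <string_view>   // stand-in for the proposed basic_string_ref

// Hypothetical function written once against the view type. Callers holding
// a std::string, a string literal, or a raw buffer all pass a cheap
// pointer-and-length pair; no character data is copied.
inline std::size_t count_spaces(std::string_view text) {
    std::size_t n = 0;
    for (char c : text) {
        if (c == ' ') ++n;
    }
    return n;
}
```

A third-party string class would take part the same way, as long as it can hand out a pointer to its characters and a length.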
basic_string_ref is not intended to deal with Unicode issues *at all*. It's
there for interoperation between different kinds of strings. It's a common
denominator interface between strings that allows a set of useful
operations to be performed on them.
Even if a Unicode string proposal was accepted for C++14, the C++ world
would not instantly jump on it. There's a *lot* of legacy code that doesn't
use those and isn't going to be upgraded to use them. There are other
string classes that people will continue to use because they are convenient
for those users. And so forth.
basic_string_ref exists to serve their needs.
> Y'know, really, I guess it's just a bikeshed issue. You want a class, I
> think it's fine with existing functionality and maybe a helper or two, at
> most. But ultimately, that's bikeshedding. It's much more important to deal
> with Unicode than to discuss whether or not string_ref deserves to be a
> class on its own.
>
--
------=_Part_634_502553.1353791823118--
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Sun, 25 Nov 2012 05:16:51 -0800 (PST)
Raw View
------=_Part_946_33300505.1353849411959
Content-Type: text/plain; charset=ISO-8859-1
On Saturday, 24 November 2012 20:29:15 UTC+1, Jeffrey Yasskin wrote:
> > I was thinking about ABI and interoperability issues.
>
> I was mostly joking about the big sub-thread you started along those
> lines on the boost list. ;) In any case, the C++ standard doesn't
>
Yeah, that was fun. :p
> > Understandable.
> > I did notice this:
> >> Returns:
> > stox(string(str), idx, base) where x is the type suffix of the function
> > called.
> > I hope that's just semantics. std::string construction should be
> > avoided.
>
> Absolutely. The current wording is at
> https://github.com/google/cxx-std-draft/compare/master...string-ref#L1R5253,
>
> which explicitly says that it's just as-if it called sto<whatever> on
> a temporary string.
>
Doesn't having both const string& and string_ref overloads cause ambiguity
when called with "7"?
> > What happened to construction from contiguous containers (like array and
> > vector)?
>
> There's no way in the current standard to identify contiguous ranges,
>
True, didn't LWG think that should be fixed?
> so string_ref can't be constructed from them, except by explicitly
> passing in .data() and .size(). In theory, it would be possible to
> give them outgoing conversion operators when their element type is
> char-like. Do you think that happens enough to extend their interface?
>
void f(string_ref);
CString mfc;
QString qt;
f(mfc);
f(qt);
Wouldn't string_ref constructors for array and vector make more sense,
until contiguous ranges can be properly supported?
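
A minimal sketch of what such constructors might look like. The string_ref below is a hypothetical, simplified stand-in (a pointer plus a length), not the wording in the paper, and the std::array and std::vector constructors are the assumption being illustrated:

```cpp
#include <array>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical, simplified string_ref: a non-owning (pointer, length) pair.
class string_ref {
public:
    string_ref(const char* s) : data_(s), size_(std::strlen(s)) {}
    string_ref(const std::string& s) : data_(s.data()), size_(s.size()) {}

    // Assumed additions: array and vector store their elements contiguously,
    // so data() and size() describe the whole range.
    template <std::size_t N>
    string_ref(const std::array<char, N>& a) : data_(a.data()), size_(N) {}
    string_ref(const std::vector<char>& v) : data_(v.data()), size_(v.size()) {}

    const char* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    const char* data_;
    std::size_t size_;
};

// Takes the reference by value; no characters are copied.
std::size_t length_of(string_ref s) { return s.size(); }
```

With these in place, length_of(v) for a std::vector<char> v compiles without spelling out v.data() and v.size() at the call site.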
--
------=_Part_946_33300505.1353849411959--
.
Author: Jeffrey Yasskin <jyasskin@googlers.com>
Date: Sun, 25 Nov 2012 10:11:08 -0800
Raw View
On Sun, Nov 25, 2012 at 5:16 AM, Olaf van der Spek <olafvdspek@gmail.com> wrote:
> On Saturday, 24 November 2012 20:29:15 UTC+1, Jeffrey Yasskin wrote:
>> > I did notice this:
>> >> Returns:
>> > stox(string(str), idx, base) where x is the type suffix of the function
>> > called.
>> > I hope that's just semantics. std::string construction should be
>> > avoided.
>>
>> Absolutely. The current wording is at
>>
>> https://github.com/google/cxx-std-draft/compare/master...string-ref#L1R5253,
>> which explicitly says that it's just as-if it called sto<whatever> on
>> a temporary string.
>
>
> Doesn't having both const string& and string_ref overloads cause ambiguity
> when called with "7"?
Yes. I've got a note at
https://github.com/google/cxx-std-draft/compare/master...string-ref#L1R808
saying we either need to remove the string overloads or add const
char* overloads. I'll probably add the const char* overloads for the
next version of the paper because it matches what I've seen the LWG do
before. It would be nice to remove the string overloads instead, but
it could break users who have defined an implicit operator
std::string().
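
To make the ambiguity concrete, here is a small sketch. The string_ref type is a hypothetical two-constructor stand-in and `which` is an illustrative name, not one of the sto* functions. With only the first two overloads, which("7") does not compile, because each candidate needs one user-defined conversion from const char* and neither is better. The third overload is an exact match for a literal and wins:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical minimal stand-in for the proposed string_ref.
struct string_ref {
    const char* ptr;
    std::size_t len;
    string_ref(const char* s) : ptr(s), len(std::strlen(s)) {}
    string_ref(const std::string& s) : ptr(s.data()), len(s.size()) {}
};

enum class picked { via_std_string, via_string_ref, via_c_string };

// These two alone make which("7") ambiguous: a literal converts to either
// parameter type through exactly one user-defined conversion.
picked which(const std::string&) { return picked::via_std_string; }
picked which(string_ref)         { return picked::via_string_ref; }

// The extra overload: array-to-pointer conversion only, so it ranks as an
// exact match and beats both overloads above for string literals.
picked which(const char*)        { return picked::via_c_string; }
```

A std::string argument still selects the const std::string& overload, since binding the reference needs no user-defined conversion.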
>> > What happened to construction from contiguous containers (like array and
>> > vector)?
>>
>> There's no way in the current standard to identify contiguous ranges,
>
>
> True, didn't LWG think that should be fixed?
I think there's support for fixing that, but it edges into all the
other "Fix Iterators!!!" proposals, so it's not part of this paper.
Maybe one of the range proposals will have it. One possibility for
doing that, aside from the obvious contiguous_iterator_tag route,
would be to have contiguous iterators define an explicit operator
T*().
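
A rough sketch of the explicit operator T*() idea, under stated assumptions: char_iter is a hypothetical iterator type invented for illustration, and the string_ref constructor shown is one possible way to consume such a conversion, not something taken from the paper:

```cpp
#include <cstddef>

// Hypothetical contiguous iterator over const char. The explicit conversion
// advertises that its elements are laid out contiguously in memory.
class char_iter {
public:
    explicit char_iter(const char* p) : p_(p) {}
    const char& operator*() const { return *p_; }
    char_iter& operator++() { ++p_; return *this; }
    bool operator!=(const char_iter& other) const { return p_ != other.p_; }
    explicit operator const char*() const { return p_; }
private:
    const char* p_;
};

// Simplified string_ref: any iterator pair that can be static_cast to
// const char* is accepted. Iterators without such a conversion, for example
// those of std::list<char>, would fail to compile here.
class string_ref {
public:
    template <class Iterator>
    string_ref(Iterator first, Iterator last)
        : data_(static_cast<const char*>(first)),
          size_(static_cast<std::size_t>(static_cast<const char*>(last) - data_)) {}

    const char* data() const { return data_; }
    std::size_t size() const { return size_; }
private:
    const char* data_;  // declared before size_, so it is initialized first
    std::size_t size_;
};
```

Plain pointers work with the same constructor, because static_cast from const char* to const char* is an identity conversion.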
>> so string_ref can't be constructed from them, except by explicitly
>> passing in .data() and .size(). In theory, it would be possible to
>> give them outgoing conversion operators when their element type is
>> char-like. Do you think that happens enough to extend their interface?
>
>
> void f(string_ref);
>
> CString mfc;
> QString qt;
> f(mfc);
> f(qt);
>
> Wouldn't string_ref constructors for array and vector make more sense, until
> contiguous ranges can be properly supported?
I'm not sure what the CString and QString examples have to do with
array and vector. CString and QString, being string-like classes,
ought to define an operator string_ref() so that they can be passed to
these functions. Are you suggesting that string_ref should have an
implicit conversion from any type that defines .data() and .size()
with the right return types? I'm skeptical of adding implicit
conversions in general, but if you want it proposed, I'll include it
as an option in the paper so people know to discuss it.
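
For discussion purposes, an implicit conversion of that kind could be sketched roughly as below. This is an assumption about how it might be constrained, using C++11 expression SFINAE; the string_ref shown is a simplified stand-in, and a real class would still need a const char* constructor for literals:

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

class string_ref {
public:
    // Participates in overload resolution only when c.data() converts to
    // const char* and c.size() converts to std::size_t.
    template <class C,
              class = typename std::enable_if<
                  std::is_convertible<decltype(std::declval<const C&>().data()),
                                      const char*>::value &&
                  std::is_convertible<decltype(std::declval<const C&>().size()),
                                      std::size_t>::value>::type>
    string_ref(const C& c) : data_(c.data()), size_(c.size()) {}

    const char* data() const { return data_; }
    std::size_t size() const { return size_; }
private:
    const char* data_;
    std::size_t size_;
};

std::size_t length_of(string_ref s) { return s.size(); }
```

std::string and std::vector<char> convert implicitly here, while std::vector<int> and int do not, because the constraint removes the constructor for them.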
Thanks,
Jeffrey
--
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Mon, 26 Nov 2012 13:14:58 +0100
Raw View
On Sun, Nov 25, 2012 at 7:11 PM, Jeffrey Yasskin <jyasskin@googlers.com> wrote:
>> Doesn't having both const string& and string_ref overloads cause ambiguity
>> when called with "7"?
>
> Yes. I've got a note at
> https://github.com/google/cxx-std-draft/compare/master...string-ref#L1R808
> saying we either need to remove the string overloads or add const
> char* overloads. I'll probably add the const char* overloads for the
> next version of the paper because it matches what I've seen the LWG do
string_ref has the potential to clean up interfaces, but having 3
overloads would be a seriously unclean interface.
3 similar overloads might have even more potential for ambiguity.
> before. It would be nice to remove the string overloads instead, but
> it could break users who have defined an implicit operator
> std::string().
That's a bit unfortunate.
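
The breakage can be sketched as follows. legacy_name, old_api and new_api are hypothetical names, and string_ref is a simplified stand-in. The point is that an implicit conversion may use at most one user-defined conversion, so a type that converts to std::string does not also convert to string_ref:

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical minimal stand-in for the proposed string_ref.
struct string_ref {
    const char* ptr;
    std::size_t len;
    string_ref(const char* s) : ptr(s), len(std::strlen(s)) {}
    string_ref(const std::string& s) : ptr(s.data()), len(s.size()) {}
};

// Hypothetical user type with an implicit conversion to std::string.
struct legacy_name {
    std::string value;
    operator std::string() const { return value; }
};

std::size_t old_api(const std::string& s) { return s.size(); }  // accepts legacy_name
std::size_t new_api(string_ref s)         { return s.len; }     // does not, implicitly

static_assert(std::is_convertible<legacy_name, std::string>::value,
              "one user-defined conversion is allowed");
static_assert(!std::is_convertible<legacy_name, string_ref>::value,
              "two chained user-defined conversions are not");
```

Callers of a string_ref-only interface would have to convert to std::string explicitly first, which is the source-level break being discussed.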
>>> > What happened to construction from contiguous containers (like array and
>>> > vector)?
>>>
>>> There's no way in the current standard to identify contiguous ranges,
Array and vector are contiguous by definition.
>>
>> True, didn't LWG think that should be fixed?
>
> I think there's support for fixing that, but it edges into all the
> other "Fix Iterators!!!" proposals, so it's not part of this paper.
string_ref would greatly benefit from it, though. Is there no way to
use data() and size() implicitly instead?
> Maybe one of the range proposals will have it. One possibility for
> doing that, aside from the obvious contiguous_iterator_tag route,
> would be to have contiguous iterators define an explicit operator
> T*().
That's a nice and clean idea!
>>> so string_ref can't be constructed from them, except by explicitly
>>> passing in .data() and .size(). In theory, it would be possible to
>>> give them outgoing conversion operators when their element type is
>>> char-like. Do you think that happens enough to extend their interface?
>>
>>
>> void f(string_ref);
>>
>> CString mfc;
>> QString qt;
>> f(mfc);
>> f(qt);
>>
>> Wouldn't string_ref constructors for array and vector make more sense, until
>> contiguous ranges can be properly supported?
>
> I'm not sure what the CString and QString examples have to do with
> array and vector. CString and QString, being string-like classes,
They're all contiguous (char) ranges.
> ought to define an operator string_ref() so that they can be passed to
> these functions. Are you suggesting that string_ref should have an
That'd work, but it's not as generic.
> implicit conversion from any type that defines .data() and .size()
> with the right return types? I'm skeptical of adding implicit
> conversions in general,
Why? Due to being based on data and size instead of begin and end?
I'd be more concerned about the implicit conversion from const char*,
though that's necessary to support string literals.
> but if you want it proposed, I'll include it
> as an option in the paper so people know to discuss it.
--
Olaf
--
.
Author: David Oliver <google@davidandpenny.net>
Date: Wed, 28 Nov 2012 11:54:51 -0800 (PST)
Raw View
------=_Part_40_5676120.1354132491939
Content-Type: text/plain; charset=ISO-8859-1
On Sunday, November 25, 2012 12:11:30 PM UTC-6, Jeffrey Yasskin wrote:
> On Sun, Nov 25, 2012 at 5:16 AM, Olaf van der Spek <olafv...@gmail.com>
> wrote:
> > On Saturday, 24 November 2012 20:29:15 UTC+1, Jeffrey Yasskin wrote:
> >> > I did notice this:
> >> >> Returns:
> >> > stox(string(str), idx, base) where x is the type suffix of the
> >> > function called.
> >> > I hope that's just semantics. std::string construction should be
> >> > avoided.
> >>
> >> Absolutely. The current wording is at
> >> https://github.com/google/cxx-std-draft/compare/master...string-ref#L1R5253,
> >> which explicitly says that it's just as-if it called sto<whatever> on
> >> a temporary string.
> >
> > Doesn't having both const string& and string_ref overloads cause
> > ambiguity when called with "7"?
>
> Yes. I've got a note at
> https://github.com/google/cxx-std-draft/compare/master...string-ref#L1R808
>
> saying we either need to remove the string overloads or add const
> char* overloads. I'll probably add the const char* overloads for the
> next version of the paper because it matches what I've seen the LWG do
> before. It would be nice to remove the string overloads instead, but
> it could break users who have defined an implicit operator
> std::string().
>
It can also break string classes with implicit operator const char*(), such
as the Visual C++ CString, but that is a Bad Idea anyway.
My experience with a similar string reference class is that classes with
string members (standard or custom) which need to be initialized in the
constructor often benefit from overloads for a string (rvalue) reference to
take advantage of move (or CoW) semantics and a string_ref for generality.
They end up needing a const char* overload to allow for string literals,
especially to disambiguate unit tests. I suspect that in most
non-time-critical cases, using string_ref as the sole overload is sufficient.
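
As an illustration of that overload set, here is a sketch with a hypothetical person class and a simplified stand-in string_ref. The source() member exists only to show which constructor was selected:

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical minimal stand-in for the proposed string_ref.
struct string_ref {
    const char* ptr;
    std::size_t len;
    string_ref(const char* s) : ptr(s), len(std::strlen(s)) {}
    string_ref(const std::string& s) : ptr(s.data()), len(s.size()) {}
};

enum class source { rvalue_string, string_ref, c_string };

class person {
public:
    // Rvalue strings are moved into the member, avoiding a copy.
    explicit person(std::string&& name)
        : name_(std::move(name)), source_(source::rvalue_string) {}
    // Everything string-like goes through string_ref and is copied once.
    explicit person(string_ref name)
        : name_(name.ptr, name.len), source_(source::string_ref) {}
    // Needed for literals: without it, person("x") is ambiguous between the
    // two constructors above (each needs one user-defined conversion).
    explicit person(const char* name)
        : name_(name), source_(source::c_string) {}

    const std::string& name() const { return name_; }
    source from() const { return source_; }
private:
    std::string name_;
    source source_;
};
```

An lvalue std::string cannot bind to std::string&&, so it falls through to the string_ref constructor.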
> >> > What happened to construction from contiguous containers (like array
> >> > and vector)?
> >>
> >> There's no way in the current standard to identify contiguous ranges,
> >
> > True, didn't LWG think that should be fixed?
>
> I think there's support for fixing that, but it edges into all the
> other "Fix Iterators!!!" proposals, so it's not part of this paper.
> Maybe one of the range proposals will have it. One possibility for
> doing that, aside from the obvious contiguous_iterator_tag route,
> would be to have contiguous iterators define an explicit operator
> T*().
>
> >> so string_ref can't be constructed from them, except by explicitly
> >> passing in .data() and .size(). In theory, it would be possible to
> >> give them outgoing conversion operators when their element type is
> >> char-like. Do you think that happens enough to extend their interface?
> >
> > void f(string_ref);
> >
> > CString mfc;
> > QString qt;
> > f(mfc);
> > f(qt);
> >
> > Wouldn't string_ref constructors for array and vector make more sense,
> > until contiguous ranges can be properly supported?
>
> I'm not sure what the CString and QString examples have to do with
> array and vector. CString and QString, being string-like classes,
> ought to define an operator string_ref() so that they can be passed to
> these functions. Are you suggesting that string_ref should have an
> implicit conversion from any type that defines .data() and .size()
> with the right return types? I'm skeptical of adding implicit
> conversions in general, but if you want it proposed, I'll include it
> as an option in the paper so people know to discuss it.
>
If we had the moral equivalent of std::is_contiguous_iterator<T>, it would
be convenient to provide a range constructor:
template<class T>
string_ref(T b,
           typename std::enable_if<std::is_contiguous_iterator<T>::value &&
               std::is_same<typename std::iterator_traits<T>::value_type, char>::value,
               T>::type e)
    : begin_(b) // possible implementation; is_contiguous_iterator is hypothetical
    , end_(e)
{}
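A special case of this is constructing from a substring of a string_ref,
which supports standard and user algorithms, e.g.:
std::string_ref skip_ws(std::string_ref r) {
    // Wrap std::isspace: it is overloaded and must not receive a negative char.
    auto is_space = [](unsigned char c) { return std::isspace(c) != 0; };
    return std::string_ref(std::find_if_not(r.begin(), r.end(), is_space), r.end());
}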
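
A self-contained version of the substring idea, assuming a simplified pointer-based string_ref (a hypothetical stand-in, with to_string() added only so the result is easy to inspect). The range constructor takes both ends of the range:

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical, simplified string_ref holding a [begin, end) pointer pair.
class string_ref {
public:
    typedef const char* const_iterator;

    string_ref(const char* s) : begin_(s), end_(s + std::strlen(s)) {}
    string_ref(const_iterator b, const_iterator e) : begin_(b), end_(e) {}

    const_iterator begin() const { return begin_; }
    const_iterator end() const { return end_; }
    std::size_t size() const { return static_cast<std::size_t>(end_ - begin_); }
    std::string to_string() const { return std::string(begin_, end_); }
private:
    const char* begin_;
    const char* end_;
};

// Returns a view of r with leading whitespace removed; nothing is copied.
string_ref skip_ws(string_ref r) {
    // std::isspace is overloaded and takes int, so wrap it in a lambda that
    // converts through unsigned char before calling it.
    auto is_space = [](unsigned char c) { return std::isspace(c) != 0; };
    return string_ref(std::find_if_not(r.begin(), r.end(), is_space), r.end());
}
```

The result refers to the same underlying characters as the argument, so it is only valid while that storage is alive.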
Extending this to constructing from contiguous containers is
straightforward. Making the constructors implicit is potentially
controversial, but IMO desirable, as string_ref can be used as a universal
replacement for const std::string& in its typical usage as a function
parameter, with better interworking with vector and array character buffers
and string literals, and without surprising temporary std::string objects.
Cheers!
David
--
------=_Part_40_5676120.1354132491939--
.
Author: Olaf van der Spek <olafvdspek@gmail.com>
Date: Fri, 28 Dec 2012 03:44:53 -0800 (PST)
Raw View
------=_Part_480_24302816.1356695093829
Content-Type: text/plain; charset=ISO-8859-1
On Monday, 26 November 2012 13:14:59 UTC+1, Olaf van der Spek wrote:
>
> On Sun, Nov 25, 2012 at 7:11 PM, Jeffrey Yasskin <jyasskin@googlers.com>
> wrote:
> >> Doesn't having both const string& and string_ref overloads cause
> >> ambiguity when called with "7"?
> >
> > Yes. I've got a note at
> >
> https://github.com/google/cxx-std-draft/compare/master...string-ref#L1R808
> > saying we either need to remove the string overloads or add const
> > char* overloads. I'll probably add the const char* overloads for the
> > next version of the paper because it matches what I've seen the LWG do
>
> string_ref has the potential to clean up interfaces, but having 3
> overloads would be a seriously unclean interface.
> 3 similar overloads might have even more potential for ambiguity.
>
> > before. It would be nice to remove the string overloads instead, but
> > it could break users who have defined an implicit operator
> > std::string().
>
> That's a bit unfortunate.
>
> >>> > What happened to construction from contiguous containers (like array
> >>> > and vector)?
> >>>
> >>> There's no way in the current standard to identify contiguous ranges,
>
> Array and vector are contiguous by definition.
>
> >>
> >> True, didn't LWG think that should be fixed?
> >
> > I think there's support for fixing that, but it edges into all the
> > other "Fix Iterators!!!" proposals, so it's not part of this paper.
>
> string_ref would greatly benefit from it, though. Is there no way to
> use data() and size() implicitly instead?
>
> > Maybe one of the range proposals will have it. One possibility for
> > doing that, aside from the obvious contiguous_iterator_tag route,
> > would be to have contiguous iterators define an explicit operator
> > T*().
>
> That's a nice and clean idea!
>
> >>> so string_ref can't be constructed from them, except by explicitly
> >>> passing in .data() and .size(). In theory, it would be possible to
> >>> give them outgoing conversion operators when their element type is
> >>> char-like. Do you think that happens enough to extend their interface?
> >>
> >>
> >> void f(string_ref);
> >>
> >> CString mfc;
> >> QString qt;
> >> f(mfc);
> >> f(qt);
> >>
> >> Wouldn't string_ref constructors for array and vector make more sense,
> >> until contiguous ranges can be properly supported?
> >
> > I'm not sure what the CString and QString examples have to do with
> > array and vector. CString and QString, being string-like classes,
>
> They're all contiguous (char) ranges.
>
> > ought to define an operator string_ref() so that they can be passed to
> > these functions. Are you suggesting that string_ref should have an
>
> That'd work, but it's not as generic.
>
> > implicit conversion from any type that defines .data() and .size()
> > with the right return types? I'm skeptical of adding implicit
> > conversions in general,
>
> Why? Due to being based on data and size instead of begin and end?
> I'd be more concerned about the implicit conversion from const char*,
> though that's necessary to support string literals.
>
> > but if you want it proposed, I'll include it
> > as an option in the paper so people know to discuss it.
>
>
^
Jeffrey?
--
------=_Part_480_24302816.1356695093829--
.