Topic: Support portable getline on different OS:
Author: Francis ANDRE <francis.andre.kampbell@orange.fr>
Date: Tue, 5 Apr 2016 11:24:34 +0200
Hi
A/ The current specification of std::getline can be summarized as:
----------------------------------------------------------------------------------------------------------------------------------------
(1)
istream& getline (istream& is, string& str, char delim);
istream& getline (istream&& is, string& str, char delim);
(2)
istream& getline (istream& is, string& str);
istream& getline (istream&& is, string& str);
Get line from stream into string: Extracts characters from is and stores
them into str until the delimitation character delim is found (or the
newline character, '\n', for (2)). The extraction also stops if the end
of file is reached in is or if some other error occurs during the input
operation. If the delimiter is found, it is extracted and discarded
(i.e. it is not stored and the next input operation will begin after it).
----------------------------------------------------------------------------------------------------------------------------------------
B/ Portability issue
This specification does not make it possible to write portable code across
operating systems where the end-of-line delimiter differs from '\n' (a lone
CR on classic Mac OS) or consists of more than one control character (CR + LF
on Windows). See https://en.wikipedia.org/wiki/Newline.
C/ Proposal
I therefore propose adding new getline overloads:
istream& getline (istream& is, string& str, const char* delim);
istream& getline (istream&& is, string& str, const char* delim);
so that any end-of-line delimiter is discarded and the resulting string
is the same on whatever OS the code runs. This could also be useful when
reading a foreign file that uses a different end-of-line convention from
the native one.
Looking forward to feedback or interest.
--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/57038452.4080405%40orange.fr.
.
Author: Nicol Bolas <jmckesson@gmail.com>
Date: Tue, 5 Apr 2016 06:02:49 -0700 (PDT)
Newline conversion is something that gets done by the stream itself when in
text translation mode. So if you want platform-neutral text reading, open
the stream as a text stream.
.
Author: Francis ANDRE <francis.andre.kampbell@orange.fr>
Date: Tue, 5 Apr 2016 15:36:55 +0200
On 05/04/2016 15:02, Nicol Bolas wrote:
> Newline conversion is something that gets done by the stream itself
> when in text translation mode. So if you want platform-neutral text
> reading, open the stream as a text stream.
I would also like to call getline on a foreign stream (a Windows file
processed on a Linux machine), which is not formatted as the stream itself
expects...
.
Author: Matthew Woehlke <mwoehlke.floss@gmail.com>
Date: Tue, 05 Apr 2016 10:29:31 -0400
On 2016-04-05 09:02, Nicol Bolas wrote:
> Newline conversion is something that gets done by the stream itself when in
> text translation mode.
...only for *native* streams. As I understand it, Francis wants to open
files with *non*-native line endings and read them "as text".
Actually, I would take this a step further and accept a regular
expression as the line delineator. I think this is necessary to
automatically handle all three common styles of line endings. (You can't
just say "any of CR or LF", because then you will get empty lines
between every real line with CRLF endings, but you also want to be able
to handle both CR and LF by themselves.)
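The parenthetical point can be checked with a small sketch. It assumes an "any of" reading in which every character of the delimiter set ends a line; split_any_of is a hypothetical helper working on an in-memory buffer, not an existing API.

```cpp
#include <string>
#include <vector>

// Hypothetical "any of" splitter: each character found in `delims`
// terminates the current line on its own.
std::vector<std::string> split_any_of(const std::string& text, const std::string& delims)
{
    std::vector<std::string> lines;
    std::string::size_type start = 0, pos;
    while ((pos = text.find_first_of(delims, start)) != std::string::npos) {
        lines.push_back(text.substr(start, pos - start));
        start = pos + 1;
    }
    if (start < text.size())
        lines.push_back(text.substr(start));  // last line without a terminator
    return lines;
}
```

With the input "a\r\nb\r\n" and the delimiter set "\r\n", this yields "a", "", "b", "": the CR and the LF of each pair are treated as two separate terminators, so an empty line appears after every real one.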
Note that in either case, the suggestion would also allow handling of
non-standard delineators, e.g. ';', '\0', ...
--
Matthew
.
Author: Dietmar Kühl <dietmar.kuehl@gmail.com>
Date: Tue, 5 Apr 2016 16:37:25 +0100
The parsing level is the wrong place to deal with line-ending differences. The stream buffer level is the correct place: just create a suitable filtering stream buffer and move on.
> On 5 Apr 2016, at 15:29, Matthew Woehlke <mwoehlke.floss@gmail.com> wrote:
>
>> On 2016-04-05 09:02, Nicol Bolas wrote:
>> Newline conversion is something that gets done by the stream itself when in
>> text translation mode.
>
> ...only for *native* streams. As I understand it, Francis wants to open
> files with *non*-native line endings and read them "as text".
>
> Actually, I would take this a step further and accept a regular
> expression as the line delineator. I think this is necessary to
> automatically handle all three common styles of line endings. (You can't
> just say "any of CR or LF", because then you will get empty lines
> between every real line with CRLF endings, but you also want to be able
> to handle both CR and LF by themselves.)
>
> Note that in either case, the suggestion would also allow handling of
> non-standard delineators, e.g. ';', '\0', ...
>
> --
> Matthew
>
.
Author: Francis ANDRE <francis.andre.kampbell@orange.fr>
Date: Tue, 5 Apr 2016 21:34:38 +0200
On 05/04/2016 16:29, Matthew Woehlke wrote:
> On 2016-04-05 09:02, Nicol Bolas wrote:
>> Newline conversion is something that gets done by the stream itself when in
>> text translation mode.
> ...only for *native* streams. As I understand it, Francis wants to open
> files with *non*-native line endings and read them "as text".
>
> Actually, I would take this a step further and accept a regular
> expression as the line delineator. I think this is necessary to
> automatically handle all three common styles of line endings. (You can't
> just say "any of CR or LF", because then you will get empty lines
> between every real line with CRLF endings, but you also want to be able
> to handle both CR and LF by themselves.)
Using a regular expression would be overkill... The main standard
end-of-line sequences are CR/LF, LF/CR, CR, LF and some exotic single or double
control characters... thus I would say that const char* does the job.
>
> Note that in either case, the suggestion would also allow handling of
> non-standard delineators, e.g. ';', '\0', ...
>
.
Author: Francis ANDRE <francis.andre.kampbell@orange.fr>
Date: Tue, 5 Apr 2016 21:37:21 +0200
So explain to me why those functions are in the ISO standard and why
the default EOL in the second set is '\n'??? A Linux bias from the
standard committee??
(1)
istream& getline (istream& is, string& str, char delim);
istream& getline (istream&& is, string& str, char delim);
(2)
istream& getline (istream& is, string& str);
istream& getline (istream&& is, string& str);
On 05/04/2016 17:37, Dietmar Kühl wrote:
> Dealing with line-ending differences on parsing level is the wrong place to deal with it. The stream buffer level is the correct place: just create a suitable filtering stream buffer and move on.
>
>> On 5 Apr 2016, at 15:29, Matthew Woehlke <mwoehlke.floss@gmail.com> wrote:
>>
>>> On 2016-04-05 09:02, Nicol Bolas wrote:
>>> Newline conversion is something that gets done by the stream itself when in
>>> text translation mode.
>> ...only for *native* streams. As I understand it, Francis wants to open
>> files with *non*-native line endings and read them "as text".
>>
>> Actually, I would take this a step further and accept a regular
>> expression as the line delineator. I think this is necessary to
>> automatically handle all three common styles of line endings. (You can't
>> just say "any of CR or LF", because then you will get empty lines
>> between every real line with CRLF endings, but you also want to be able
>> to handle both CR and LF by themselves.)
>>
>> Note that in either case, the suggestion would also allow handling of
>> non-standard delineators, e.g. ';', '\0', ...
>>
>> --
>> Matthew
.
Author: Matthew Woehlke <mwoehlke.floss@gmail.com>
Date: Tue, 05 Apr 2016 15:55:40 -0400
On 2016-04-05 15:34, Francis ANDRE wrote:
> On 05/04/2016 16:29, Matthew Woehlke wrote:
>> Actually, I would take this a step further and accept a regular
>> expression as the line delineator. I think this is necessary to
>> automatically handle all three common styles of line endings. (You can't
>> just say "any of CR or LF", because then you will get empty lines
>> between every real line with CRLF endings, but you also want to be able
>> to handle both CR and LF by themselves.)
>
> Using a regular expression would be overkill... The main standard
> end-of-line sequences are CR/LF, LF/CR, CR, LF and some exotic single or double
> control characters... thus I would say that const char* does the job.
What does that parameter mean? "Any of"? "Exactly"?
If "any of", you will get a wrong result for CRLF line endings (every
other line empty). If "exactly", you have to know the endings
beforehand. (Simply suppressing blank lines is not an option either, as
some users will need to know about those.)
With a regular expression, you could write "\r?\n?" and it will handle
any of CR, LF or CRLF, and even many cases of mixed line endings.
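A rough sketch of regex-based line splitting, with some assumptions stated up front. std::regex needs bidirectional iterators, so the hypothetical helper split_lines works on a buffer already held in memory rather than directly on an istream. It also uses the alternation \r\n|\r|\n instead of a pattern made only of optional parts, because such a pattern can also match the empty string, which makes it awkward to use as a separator.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits an in-memory buffer into lines, treating
// CRLF, a lone CR, or a lone LF as a single line terminator.
std::vector<std::string> split_lines(const std::string& text)
{
    // CRLF is listed first so that the pair is consumed as one terminator.
    static const std::regex eol(R"(\r\n|\r|\n)");
    std::vector<std::string> lines;
    // Submatch index -1 selects the text between successive matches.
    std::sregex_token_iterator it(text.begin(), text.end(), eol, -1), end;
    for (; it != end; ++it)
        lines.push_back(it->str());
    return lines;
}
```

For the mixed input "a\r\nb\rc\n\nd" this gives "a", "b", "c", "", "d": the genuinely empty line is kept, and no spurious ones are introduced by the CRLF pair.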
If you don't mind the API requiring the user to know the line endings
beforehand, then sure, a string_view¹ matched exactly would suffice.
(¹ ...or is it string_span these days? I forget; haven't been paying
attention to those proposals...)
--
Matthew
.
Author: Tony V E <tvaneerd@gmail.com>
Date: Tue, 05 Apr 2016 15:59:50 -0400
Why \n?
- because it is the newline character?
Linux bias?
-do you understand the history of C++, C, and Unix? And the history of Linux?
Sent from my BlackBerry portable Babbage Device
Original Message
From: Francis ANDRE
Sent: Tuesday, April 5, 2016 3:37 PM
To: std-proposals@isocpp.org
Reply To: std-proposals@isocpp.org
Subject: Re: [std-proposals] Re: Support portable getline on different OS: std::getline(istream& is, string& str, const char* delim) ;
So explain to me why those functions are in the ISO standard and why
the default EOL in the second set is '\n'??? A Linux bias from the
standard committee??
(1)
istream& getline (istream& is, string& str, char delim);
istream& getline (istream&& is, string& str, char delim);
(2)
istream& getline (istream& is, string& str);
istream& getline (istream&& is, string& str);
.
Author: Francis ANDRE <francis.andre.kampbell@orange.fr>
Date: Tue, 5 Apr 2016 22:03:08 +0200
On 05/04/2016 16:29, Matthew Woehlke wrote:
> On 2016-04-05 09:02, Nicol Bolas wrote:
>> Newline conversion is something that gets done by the stream itself when in
>> text translation mode.
Hold on... the issue also affects a *native* stream opened in binary mode...
On Windows, the OS end-of-line is CRLF. Reading a native file that is opened
in binary mode for functional reasons with getline() returns a string with a
trailing '\r'... This proposal would avoid this issue.
> ...only for *native* streams. As I understand it, Francis wants to open
> files with *non*-native line endings and read them "as text".
>
> Actually, I would take this a step further and accept a regular
> expression as the line delineator. I think this is necessary to
> automatically handle all three common styles of line endings. (You can't
> just say "any of CR or LF", because then you will get empty lines
> between every real line with CRLF endings, but you also want to be able
> to handle both CR and LF by themselves.)
>
> Note that in either case, the suggestion would also allow handling of
> non-standard delineators, e.g. ';', '\0', ...
>
.
Author: gmisocpp@gmail.com
Date: Tue, 5 Apr 2016 13:46:59 -0700 (PDT)
Hi
On Wednesday, April 6, 2016 at 8:03:10 AM UTC+12, Francis Andre wrote:
>
> On 05/04/2016 16:29, Matthew Woehlke wrote:
> > On 2016-04-05 09:02, Nicol Bolas wrote:
> >> Newline conversion is something that gets done by the stream itself when in
> >> text translation mode.
> Hold on... the issue also affects a *native* stream opened in binary mode...
>
> On Windows, the OS end-of-line is CRLF. Reading a native file that is opened
> in binary mode for functional reasons with getline() returns a string with a
> trailing '\r'... This proposal would avoid this issue.
> > ...only for *native* streams. As I understand it, Francis wants to open
> > files with *non*-native line endings and read them "as text".
> >
> > Actually, I would take this a step further and accept a regular
> > expression as the line delineator. I think this is necessary to
> > automatically handle all three common styles of line endings. (You can't
> > just say "any of CR or LF", because then you will get empty lines
> > between every real line with CRLF endings, but you also want to be able
> > to handle both CR and LF by themselves.)
> >
> > Note that in either case, the suggestion would also allow handling of
> > non-standard delineators, e.g. ';', '\0', ...
> >
I think I got this originally from Stack Overflow somewhere and it seems to
work?
#include <cstdio>     // EOF
#include <istream>
#include <streambuf>
#include <string>

std::istream& readline(std::istream& is, std::string& t)
{
    t.clear();
    // The characters in the stream are read one-by-one using a std::streambuf.
    // That is faster than reading them one-by-one using the std::istream.
    // Code that uses streambuf this way must be guarded by a sentry object.
    // The sentry object performs various tasks,
    // such as thread synchronization and updating the stream state.
    std::istream::sentry se(is, true);
    std::streambuf* sb = is.rdbuf();
    for(;;)
    {
        int c = sb->sbumpc();
        switch (c) {
        case '\n':
            return is;
        case '\r':
            // Treat CRLF as one line ending; a lone CR also ends the line.
            if(sb->sgetc() == '\n')
                sb->sbumpc();
            return is;
        case EOF:
            // Also handle the case when the last line has no line ending.
            // failbit is set together with eofbit so that a loop such as
            // while (readline(is, t)) terminates, as it does with std::getline.
            if(t.empty())
                is.setstate(std::ios::eofbit | std::ios::failbit);
            return is;
        default:
            t += (char)c;
        }
    }
}
.
Author: Zhihao Yuan <zy@miator.net>
Date: Tue, 5 Apr 2016 16:33:00 -0500
On Tue, Apr 5, 2016 at 3:03 PM, Francis ANDRE
<francis.andre.kampbell@orange.fr> wrote:
> Hold on... it is also for *native* stream opened in binary mode...
>
> On Windows, the OS end-of-line is CRLF. Reading a native file opened in
> binary mode for functional reason with getline() returns a string with
> '\r' by the end... This proposal would avoid this issue.
So what? With msvcrt, fgets reading a line in binary mode
also gives you '\r\n' at the end. What's the difference?
--
Zhihao Yuan, ID lichray
The best way to predict the future is to invent it.
___________________________________________________
4BSD -- http://blog.miator.net/
.