Topic: Stricter ordering requirements for fpos? (2nd trial)
Author: pierrebai@hotmail.com (Pierre Baillargeon)
Date: Wed, 22 Jan 2003 16:41:28 +0000 (UTC)
dietmar_kuehl@yahoo.com (Dietmar Kuehl) wrote in message news:<5b15f8fd.0301210240.33619340@posting.google.com>...
>
> The number of [external] bytes per [internal] character depends on the
> encoding used. Streams are always concerned with numbers of internal
> characters and never with the number of external bytes: this is what I said,
> and it does not matter whether you are using formatted or unformatted
> functions. There is no way to access the bytes directly using the stream
> interface (unless, of course, you have a non-converting locale, but this is a
> very special case).
My mistake then: I always assumed that unformatted meant that no
translation was taking place! I must have always worked with a
non-converting locale then, since I never had any problems. Time to
revise some code...
So how is one supposed to read raw bytes from a file using C++
functionality (vs. fread()) ?
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: dietmar_kuehl@yahoo.com (Dietmar Kuehl)
Date: Wed, 22 Jan 2003 21:21:38 +0000 (UTC)
Pierre Baillargeon wrote:
> So how is one supposed to read raw bytes from a file using C++
> functionality (vs. fread()) ?
Well, you just make sure that you are using the "C" locale, 'char' as
character type, and open your file in 'std::ios_base::binary' mode.
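A minimal sketch of that recipe, assuming the whole file fits in memory. The helper name read_raw_bytes is hypothetical and not part of any library; the classic "C" locale is imbued before the file is opened so that its non-converting codecvt facet applies from the first read.

```cpp
#include <fstream>
#include <iterator>
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper: returns the bytes of the file unchanged, or an
// empty vector if the file cannot be opened.
std::vector<char> read_raw_bytes(const std::string& path)
{
    std::ifstream in;
    // The classic "C" locale performs no code conversion for 'char'.
    in.imbue(std::locale::classic());
    // Binary mode suppresses platform-specific newline translation.
    in.open(path.c_str(), std::ios_base::in | std::ios_base::binary);
    if (!in)
        return std::vector<char>();
    // Copy straight from the stream buffer, bypassing formatted input.
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}
```

With these three conditions in place, each 'char' delivered by the stream corresponds to one byte of the file, so the size of the returned vector is the byte count.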
--
<mailto:dietmar_kuehl@yahoo.com> <http://www.dietmar-kuehl.de/>
Phaidros eaSE - Easy Software Engineering: <http://www.phaidros.com/>
---
Author: dietmar_kuehl@yahoo.com (Dietmar Kuehl)
Date: Tue, 21 Jan 2003 16:25:49 +0000 (UTC)
pierrebai@hotmail.com (Pierre Baillargeon) wrote in message
news:<6df0c6a8.0301201107.7ec3790f@posting.google.com>...
> dietmar_kuehl@yahoo.com (Dietmar Kuehl) wrote in message news:<b094d9$lkv0i$1@ID-86292.news.dfncis.de>...
> > Since the stream classes are never concerned with the number of bytes in
> > the underlying file but only with the number of internal characters it
> > would be illogical to start using the underlying bytes in this case.
>
> That would explain why istream and ostream do not support functions
> like read() or write()... hum, wait a minute here... they do!
> Seriously, streams have both formatted and unformatted functions, so
> using the unformatted position is a valid usage pattern.
Note the declaration of the 'read()' and 'write()' members, however:
    template <typename charT, typename traits = ...>
    class basic_istream {
    public:
        typedef charT char_type;
        // ...
        basic_istream<charT,traits>& read(char_type* s, streamsize n);
        // ...
    };
Likewise 'write()' for the class template 'basic_ostream':
    basic_ostream<charT,traits>& write(const char_type* s, streamsize n);
The number of [external] bytes per [internal] character depends on the
encoding used. Streams are always concerned with numbers of internal
characters and never with the number of external bytes: this is what I said,
and it does not matter whether you are using formatted or unformatted
functions. There is no way to access the bytes directly using the stream
interface (unless, of course, you have a non-converting locale, but this is a
very special case).
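To illustrate that the count passed to 'read()' is in units of 'char_type', here is a small sketch. The helper name elements_read is hypothetical, and a string stream is used so that no external encoding is involved; a file stream would additionally run the data through the locale's code conversion.

```cpp
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: requests n elements with the unformatted read()
// and returns how many elements of type charT were actually extracted.
template <typename charT>
std::streamsize elements_read(const std::basic_string<charT>& text,
                              std::streamsize n)
{
    if (n <= 0)
        return 0;
    std::basic_istringstream<charT> in(text);
    std::basic_string<charT> buffer(static_cast<std::size_t>(n), charT());
    in.read(&buffer[0], n);   // n counts char_type elements, not bytes
    return in.gcount();       // also reported in char_type elements
}
```

Calling it with a 'std::wstring' and n == 3 extracts three 'wchar_t' elements, which occupy 3 * sizeof(wchar_t) bytes in memory; the stream interface only ever reports the figure three.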
--
<mailto:dietmar_kuehl@yahoo.com> <http://www.dietmar-kuehl.de/>
Phaidros eaSE - Easy Software Engineering: <http://www.phaidros.com/>
---
Author: kanze@gabi-soft.de (James Kanze)
Date: Tue, 21 Jan 2003 19:05:28 +0000 (UTC)
pierrebai@hotmail.com (Pierre Baillargeon) wrote in message
news:<6df0c6a8.0301201107.7ec3790f@posting.google.com>...
> dietmar_kuehl@yahoo.com (Dietmar Kuehl) wrote in message
> news:<b094d9$lkv0i$1@ID-86292.news.dfncis.de>...
> > Since the stream classes are never concerned with the number of
> > bytes in the underlying file but only with the number of internal
> > characters it would be illogical to start using the underlying bytes
> > in this case.
> That would explain why istream and ostream do not support functions
> like read() or write()...
How do you figure this? Neither read nor write expose the underlying
bytes of the stream.
> hum, wait a minute here... they do! Seriously, streams have both
> formatted and unformatted functions, so using the unformatted position
> is a valid usage pattern.
But streams don't support untranslated input. The number of bytes read
is not necessarily directly related to the number of bytes in the file.
This was already true even in the old C style stdio. About the only
system I've seen where fpos corresponds systematically with the number
of bytes read or written is Unix.
C++ adds locale specific translations.
--
James Kanze mailto:jkanze@caicheuvreux.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
---
Author: pierrebai@hotmail.com (Pierre Baillargeon)
Date: Mon, 20 Jan 2003 20:51:14 +0000 (UTC)
dietmar_kuehl@yahoo.com (Dietmar Kuehl) wrote in message news:<b094d9$lkv0i$1@ID-86292.news.dfncis.de>...
> Since the stream classes are never concerned with the number of bytes in
> the underlying file but only with the number of internal characters it
> would be illogical to start using the underlying bytes in this case.
That would explain why istream and ostream do not support functions
like read() or write()... hum, wait a minute here... they do!
Seriously, streams have both formatted and unformatted functions, so
using the unformatted position is a valid usage pattern.
---
Author: dsp@bdal.de (Daniel Spangenberg)
Date: Fri, 17 Jan 2003 09:48:39 +0000 (UTC)
Hello Dietmar!
Dietmar Kuehl wrote:
> dsp@bdal.de (Daniel Spangenberg) wrote:
> > fileSize1 <= fileSize2
>
> But file sizes are not handled by the stream classes in other respects. What
> they handle, however, is character sequences, and even if you have a bigger
> file, the number of characters stored therein can be smaller! This is due to
> the implicit code conversions: a UTF-8 file with just two bytes can hold two
> characters (i.e. when the two bytes represent ASCII characters), while it may
> also hold just one character.
>
Maybe it was a bad idea to insist on the comparison of file sizes, but I think
that relative positional comparisons do allow useful expressions, e.g. if two
streams open the same file for reading, but each performs its own
analysis of the file data.
>
> Why should streams suddenly consider the number of underlying bytes while they
> normally only consider characters at a user level? ... and computing the
> number of characters in a file may involve inspecting every single byte
> therein. If you want to know the [relative] file size, use a direct interface
> to determine it from the file system (this is, at least currently, platform
> specific) or use 'std::distance()' with 'std::istreambuf_iterator'.
>
How does std::distance on 'std::istreambuf_iterator' help more than fpos? Can you
please explain this to me?
Thanks,
Daniel Spangenberg
---
Author: dietmar_kuehl@yahoo.com (Dietmar Kuehl)
Date: Fri, 17 Jan 2003 17:04:47 +0000 (UTC)
Daniel Spangenberg wrote:
> How does std::distance on 'std::istreambuf_iterator' help more than fpos?
> Can you please explain this to me?
Sure: the file gets analysed and the number of characters in the file is
counted when using 'std::istreambuf_iterator'. Using 'fpos', the best the
system can see is some form of byte count in the underlying file, which
has nothing to do with the number of characters in the file, at least
not in general: for a multi-byte encoding like UTF-8 you cannot determine
how many characters are in the file from the number of bytes therein. The
best you can say is that there are no more characters than there are bytes
and that there are at least 1/6 as many characters as there are bytes (and
I think at least 1/4 as many Unicode characters since UTF-8 can encode
planes currently unused by Unicode).
Since the stream classes are never concerned with the number of bytes in
the underlying file but only with the number of internal characters it
would be illogical to start using the underlying bytes in this case.
--
<mailto:dietmar_kuehl@yahoo.com> <http://www.dietmar-kuehl.de/>
Phaidros eaSE - Easy Software Engineering: <http://www.phaidros.com/>
---
Author: dsp@bdal.de (Daniel Spangenberg)
Date: Wed, 15 Jan 2003 18:27:12 +0000 (UTC)
Hello C++ gurus,
I posed the following question concerning the STL requirements on the
fpos template to the moderated lang.c++ group, but it seemed not to be
the appropriate place for the discussion. So I would like to start again
here:
Last night I stumbled across the requirements on the
class template fpos, as presented in 27.4.3.2 of Our Standard. I am
wondering why there are no stricter ordering requirements on fpos
than EqualityComparable. In particular I am missing Strict Weakly
Comparable (or at least LessThan Comparable), because the main
properties of fpos are somewhat similar to those of random access iterators.
One example is a standard-conforming method to compare the sizes of
fstreams. Via the sequence
    std::ifstream file("MyFile", std::ios_base::in | std::ios_base::binary);
    file.seekg(0, std::ios_base::end);
    std::ifstream::pos_type fileSize = file.tellg();
    file.seekg(0);
we have the opportunity to get the file position of the end of the file.
Interpreting this value as "the size" of a file, it would be quite
useful to allow comparisons like
fileSize1 <= fileSize2
which are not standard conforming according to the current C++98.
I think the requirements of 27.4.3.2 allow us to interpret
fpos as a scalar-like type with some characteristics of integral
numbers. Apart from the special case of an invalid position value
(fpos(streamoff(-1))), there seems to be little difficulty in adding the
requirement of an operator< for fpos. (Defining operator< here poses none
of the problems it would for std::complex, for example, which describes a
non-scalar type.)
I did not find any discussions concerning this point on
http://anubis.dkuug.dk/JTC1/SC22/WG21/
so I would like to start this discussion here. Any suggestions?
Yours,
Daniel Spangenberg
---
Author: dietmar_kuehl@yahoo.com (Dietmar Kuehl)
Date: Thu, 16 Jan 2003 20:28:05 +0000 (UTC)
dsp@bdal.de (Daniel Spangenberg) wrote:
> fileSize1 <= fileSize2
But file sizes are not handled by the stream classes in other respects. What
they handle, however, is character sequences, and even if you have a bigger
file, the number of characters stored therein can be smaller! This is due to
the implicit code conversions: a UTF-8 file with just two bytes can hold two
characters (i.e. when the two bytes represent ASCII characters), while it may
also hold just one character.
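The two-byte example can be checked with a small sketch that counts the characters in a UTF-8 byte sequence. The function name count_utf8_code_points is hypothetical, and the sketch assumes well-formed UTF-8: it simply counts every byte that is not a continuation byte.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, assuming well-formed UTF-8 input: every encoded
// character starts with exactly one byte that is NOT of the form
// 10xxxxxx, so counting those bytes counts the characters.
std::size_t count_utf8_code_points(const std::string& bytes)
{
    std::size_t count = 0;
    for (std::string::size_type i = 0; i < bytes.size(); ++i) {
        const unsigned char byte = static_cast<unsigned char>(bytes[i]);
        if ((byte & 0xC0) != 0x80)   // skip continuation bytes
            ++count;
    }
    return count;
}
```

The two bytes "ab" yield two characters, while the two bytes 0xC3 0xA9 (the UTF-8 encoding of U+00E9) yield one, even though both inputs have the same size in bytes.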
Why should streams suddenly consider the number of underlying bytes while they
normally only consider characters at a user level? ... and computing the
number of characters in a file may involve inspecting every single byte
therein. If you want to know the [relative] file size, use a direct interface
to determine it from the file system (this is, at least currently, platform
specific) or use 'std::distance()' with 'std::istreambuf_iterator'.
The sole legitimate use of 'std::fpos' I'm aware of is obtaining it from a
stream to restore the same position later. The computations applicable to this
type are essentially for the use of new stream buffer classes.
--
<mailto:dietmar_kuehl@yahoo.com> <http://www.dietmar-kuehl.de/>
Phaidros eaSE - Easy Software Engineering: <http://www.phaidros.com/>
---