Topic: Extended integer I/O in C++0x


Author: Richard Smith <richard@ex-parrot.com>
Date: Thu, 20 Jan 2011 09:55:50 CST
The current working draft [N3225] contains no facilities for formatted
I/O of extended integer types using iostreams, and it prohibits an
implementation from extending its iostream implementation to provide
it.  This is problematic because:

1) Simple code will compile but silently do the wrong thing.

E.g. assuming a 64-bit long long, an implementation is permitted to
parse the following literal as an extended integer type, but if so,
the long long inserter (operator<<) is called, resulting in an
implementation-defined (and likely unexpected) truncation of the
value:

   std::cout << 0x123456789ABCDEF00 << "\n";

2) The functionality already exists in the C library.

C99 and C++0x provide the %jd printf and scanf format specifiers for
the intmax_t type, and lower-level functions like strtoimax, so the
standard already requires implementations to support I/O on extended
integer types.  But the C-style functions are less type safe and less
convenient for use in templates.  Most authors recommend that
newcomers to the language should default to using iostreams instead of
stdio; having something as straightforward as this missing is likely
to be counterproductive to that aim.
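
For illustration, the C-library route described above might look like
the following minimal sketch.  The helper names format_intmax and
parse_intmax are invented for this example; only snprintf with %jd and
strtoimax come from the library.

```cpp
#include <cassert>
#include <cerrno>
#include <cinttypes>  // std::intmax_t, std::strtoimax
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format an intmax_t using the %jd conversion.
std::string format_intmax(std::intmax_t value) {
    char buf[64];  // ample for a sign plus the digits of a 128-bit value
    int n = std::snprintf(buf, sizeof buf, "%jd", value);
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
    return std::string(buf, static_cast<std::size_t>(n));
}

// Hypothetical helper: parse a decimal intmax_t using strtoimax.
// Returns false if no digits were read, if trailing characters remain,
// or if the value is out of range; 'out' is only written on success.
bool parse_intmax(const std::string& text, std::intmax_t& out) {
    errno = 0;
    char* end = nullptr;
    std::intmax_t v = std::strtoimax(text.c_str(), &end, 10);
    if (end == text.c_str() || *end != '\0' || errno == ERANGE) {
        return false;
    }
    out = v;
    return true;
}
```

Note that nothing stops a caller from passing the wrong argument type
to a %jd conversion, which is the type-safety problem referred to
above.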

3) The natural implementation is not permitted.

It involves extra basic_ostream::operator<< and
basic_istream::operator>> overloads.  It is probably best to have one
per extended type, as with the standard types.  Adding these overloads
is permitted under 17.6.4.5.  These would naturally call new overloads
of num_put::put and num_get::get, which can be sensibly reduced to one
for intmax_t and one for uintmax_t (instead of one per extended
type).  Again, permitted under 17.6.4.5.  But these should call new
virtual do_put and do_get overloads which are not permitted.
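
The layering can be sketched with a toy model.  This is not the real
std::num_put: toy_num_put, put_max, do_put_max and bracketing_put are
all hypothetical names, and the functions return a string rather than
writing through an output iterator.  The intmax_t members are given
distinct names here so that the sketch compiles whichever standard
type intmax_t happens to be a typedef for.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Toy stand-in for std::num_put (hypothetical, simplified interface).
class toy_num_put {
public:
    virtual ~toy_num_put() {}

    // Non-virtual entry points: the kind of addition the post says is
    // permitted under 17.6.4.5.
    std::string put(long long v) const { return do_put(v); }
    std::string put_max(std::intmax_t v) const { return do_put_max(v); }

protected:
    virtual std::string do_put(long long v) const {
        return std::to_string(v);
    }

    // The new virtual hook: the addition the post says is not permitted.
    virtual std::string do_put_max(std::intmax_t v) const {
        char buf[64];
        int n = std::snprintf(buf, sizeof buf, "%jd", v);
        assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
        return std::string(buf, static_cast<std::size_t>(n));
    }
};

// A user-defined facet customises formatting by overriding the hook.
class bracketing_put : public toy_num_put {
protected:
    std::string do_put_max(std::intmax_t v) const override {
        return "[" + toy_num_put::do_put_max(v) + "]";
    }
};
```

Without the virtual hook, a call through a toy_num_put reference could
not reach bracketing_put's formatting, which is why the non-virtual
overloads alone are not enough.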

4) Implementations will probably implement it anyway.

Experience with long long in C++98 suggests that implementations will
just add the aforementioned do_put and do_get overloads.  This causes
portability problems with user-defined num_put and num_get facets, as
different implementations require different collections of virtual
do_put and do_get overloads.  It makes it easier for the user if these
can be standardised, e.g. by mandating the extra overloads, except
when intmax_t is a typedef to long long, as that would cause a compile
error. (A user can test for that with a macro comparing INTMAX_MAX to
LLONG_MAX. If that's considered ugly, alternatives are to make a
gratuitous change to the signature or name so there's no conflict
between intmax_t and long long, or to remove the long long overloads
and have istream / ostream convert them to intmax_t.  The latter
seemingly conflicts with existing practice, though in practice the
conflict is minimal as almost all existing implementations with such
functionality have intmax_t as long long.)
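
A minimal sketch of the macro test mentioned above follows.  The macro
name HAS_WIDER_INTMAX and the function name are invented for this
example.  Note that the test compares ranges, not types: it cannot
tell whether intmax_t is a typedef for long long or for some other
standard type of the same width, such as long on some platforms.

```cpp
#include <climits>  // LLONG_MAX
#include <cstdint>  // INTMAX_MAX, std::intmax_t

// Hypothetical macro: 1 if intmax_t has a larger range than long long,
// in which case it must be an extended integer type.
#if INTMAX_MAX > LLONG_MAX
#  define HAS_WIDER_INTMAX 1
#else
#  define HAS_WIDER_INTMAX 0
#endif

// The same information as a constant expression, for use in templates.
constexpr bool intmax_is_wider_than_long_long() {
    return HAS_WIDER_INTMAX != 0;
}
```

A user-defined facet could wrap its extra do_put and do_get overrides
in #if HAS_WIDER_INTMAX so that they are only declared when they do
not collide with the long long versions.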

5) The required change is simple, safe and self-contained.

It's a pure library change, and we already know from experience with
long long how to add I/O support for new integer types.  The
specification will run to a few extra paragraphs at most and will not
impact other parts of the standard as very few parts of the standard
library do I/O or use extended integers.

6) C++0x is the right time to make this change.

The long long overloads added in C++0x already represent an
incompatible change to the num_put and num_get facets, albeit one that
codifies existing practice; if another incompatible change is likely
to be required in the near future, this is the right time to make it.
As the change breaks backwards compatibility, it's probably not
appropriate for a TR.  Deferring to C++1x means having one incompatible
change now and another ten years down the line.

Even though this is an extension (albeit a pretty minimal one), and
the time of adding extensions to C++0x is over, there's also a case
for viewing it as a defect.  Per my first point, it will result in
unexpected and counter-intuitive behaviour from very simple code.  (So
far as I can see, the implementation isn't permitted to add overloads
of operator<< that issue a diagnostic in this case.)  A user cannot
easily fix the problem themselves, and the obvious workarounds (cout
<< imaxtostr(x)) are clumsy in templates.
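
As a rough sketch of what such a workaround involves, a user could
write a conversion helper along the following lines.  The name
to_decimal is hypothetical, and the code assumes that T is a built-in
integer type (standard or extended) supporting the usual arithmetic
operators.

```cpp
#include <algorithm>  // std::reverse
#include <string>

// Hypothetical helper: decimal text for any built-in integer type T.
// Digits are extracted without negating the value, so the most
// negative value of a signed type is handled correctly.
template <typename T>
std::string to_decimal(T value) {
    const bool negative = value < 0;
    std::string digits;
    do {
        // In C++0x the remainder takes the sign of the dividend, so d
        // lies in [-9, 9]; take its magnitude for the digit.
        int d = static_cast<int>(value % 10);
        digits.push_back(static_cast<char>('0' + (d < 0 ? -d : d)));
        value /= 10;
    } while (value != 0);
    if (negative) {
        digits.push_back('-');
    }
    std::reverse(digits.begin(), digits.end());
    return digits;
}
```

Every insertion then has to be written as cout << to_decimal(x), and
the helper ignores the stream's locale and formatting flags (base,
width, showpos), which is what makes this approach clumsy.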

I would be interested in the group's opinion.  Are there any plans to
fix this?  Should there be?


--
[ comp.std.c++ is moderated.  To submit articles, try posting with your ]
[ newsreader.  If that fails, use mailto:std-cpp-submit@vandevoorde.com ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]