Topic: ctype virtual functions


Author: Matt McClure <matt.mcclure@ccur.com>
Date: 2000/01/19
On Mon Jan 17 2000, 18:20, Dietmar Kuehl
<dietmar.kuehl@claas-solutions.de> wrote:

> In article <vl33drwd5tu.fsf@amber2.ccur.com>,
>   Matt McClure <matt.mcclure@ccur.com> wrote:
> > > Does paragraph 4 mean that std::basic_istream<>::sentry<> should
> > > behave in _every_ aspect -- compile-time, link-time, run-time,
> > > etc. -- as though it used the above code?
>
> My guess is that you are not that much interested in my view as I have
> already expressed it...

Quite to the contrary, I am very much interested in your view, and I
apologize for apparently having offended you.  I really appreciate all
the time you must have put into posting your replies to my questions.

> but I definitely think that it has to behave in every aspect as if it
> used the code. In particular, it is expected to behave as if it uses
> the ctype facet.

I agree that this seems to be the intent of the standard.

> > Since there has been no response to these questions, would it be
> > appropriate to submit a library issue to the standard committee?
>
> Since I will be at the next meeting it might be reasonable to write
> both your name and mine as the submitter on the library issue: The
> people in the library working group will then ask me what this issue
> is about and I can tell them, explaining the concerns if necessary...

If I submit the issue, I will include your name, but it may not be
necessary, if people concur with my understanding below.

> > Or does someone have a definitive answer that can be backed up by the
> > current standard?
>
> I think it is covered by the "as if" rule.

Thank you for pointing me to that.  It seems that you're right, but just
for clarification, does the following make sense:

1.9(1) says, "... conforming implementations are required to emulate
(only) the observable behavior of the abstract machine as explained
below."

And 1.9(5) says, "A conforming implementation executing a well-formed
program shall produce the same observable behavior as one of the
possible execution sequences of the corresponding instance of the
abstract machine with the same program and the same input,"

where "observable behavior" is defined in 1.9(6): "The observable
behavior of the abstract machine is its sequence of reads and writes to
volatile data and calls to library I/O functions."

It seems clear that the "sequence of reads and writes" should include
program translation in addition to program execution, right?

--
Matt McClure
Concurrent Computer Corporation

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: Matt McClure <matt.mcclure@ccur.com>
Date: 2000/01/19
On Thu Jan 20 2000, 03:42, James Kuyper <kuyper@wizard.net> wrote:

> Matt McClure wrote:
> ....
> > where "observable behavior is defined in 1.9(6): "The observable
> > behavior of the abstract machine is its sequence of reads and writes to
> > volatile data and calls to library I/O functions."
> >
> > It seems clear that the "sequence of reads and writes" should include
> > program translation in addition to program execution, right?
>
> Yes - if you can figure out any way to read or write volatile data
> during program translation, then that rule would apply. I can't think of
> any way to do it, since the data storage location for the reads and
> writes hasn't even been allocated yet until the program starts
> executing.

Then I'm mistaken in my last step.

> The "behavior of the abstract machine" refers specifically to the
> execution of the program, not to its translation. The reservation of
> identifiers with particular names is a translation-time phenomenon.
> Except in debug mode, the identifiers cease to have any meaning by
> execution time - they have been translated into the appropriate machine
> code instructions.

In that case, I am still at a loss for a way to prove that the standard
does not implicitly require an implementation to (1) provide
specializations of std::ctype for an arbitrary type or (2) implement
std::basic_istream::sentry's constructor without using std::ctype,
despite the fact that the standard's _intent_ seems to be the opposite.

The standard clearly says that the only required specializations of
std::ctype are for char and wchar_t, so (1) is not required, and Dietmar
has shown that it is impossible to implement _working_ ctype member
functions for an arbitrary character type since the library implementor
knows nothing about user-defined types.

But this means that the library implementor either has to provide bogus
definitions for the ctype members, or circumvent the problem by finding
another way to implement sentry's constructor (a task that seems
impossible).

As Dietmar has written, the intent of the standard seems to be that
sentry's constructor must behave, during _translation_ and execution, as
if it were implemented with ctype.  But there doesn't seem to be enough
to _prove_ that the standard _requires_ its intent.

I'll submit an issue to the committee.

--
Matt McClure
Concurrent Computer Corporation








Author: James Kuyper <kuyper@wizard.net>
Date: 2000/01/20
Matt McClure wrote:
....
> where "observable behavior is defined in 1.9(6): "The observable
> behavior of the abstract machine is its sequence of reads and writes to
> volatile data and calls to library I/O functions."
>
> It seems clear that the "sequence of reads and writes" should include
> program translation in addition to program execution, right?

Yes - if you can figure out any way to read or write volatile data
during program translation, then that rule would apply. I can't think of
any way to do it, since the data storage location for the reads and
writes hasn't even been allocated yet until the program starts
executing.

The "behavior of the abstract machine" refers specifically to the
execution of the program, not to its translation. The reservation of
identifiers with particular names is a translation-time phenomenon.
Except in debug mode, the identifiers cease to have any meaning by
execution time - they have been translated into the appropriate machine
code instructions.

---






Author: Matt McClure <matt.mcclure@ccur.com>
Date: 2000/01/17
On Fri Jan 07 2000, 14:00, Matt McClure <matt.mcclure@ccur.com> wrote:

> > OK, it is necessary to construct the 'sentry' object. Is the use of
> > 'std::ctype' also required? Sure, according to 27.6.1.1.2
> > (lib.istream::sentry) paragraph 2 the constructor of 'sentry' skips
> > whitespace and paragraphs 3 and 4 make very explicit how this is going
> > to happen, namely by using 'std::ctype'.
>
> I'm with you until the above paragraph.  You imply that the standard
> requires an implementor to use std::ctype<> in order to define
> std::basic_istream<>::sentry<>, but paragraph 4 says:
>
>    4 To decide if the character c is a whitespace character, the
>    constructor performs ''as if'' it executes the following code
>    fragment:
>
>       const ctype<charT>& ctype = use_facet<ctype<charT> >(is.getloc());
>       if (ctype.is(ctype.space,c)!=0)
>          // c is a whitespace character.
>
> To me, it seems that the phrase "as if" means that the implementor can
> do anything he/she wants as long as it behaves the same as the above
> code.  But does that mean that if std::ctype<my_char> cannot be
> instantiated, then neither can the std::sentry<> constructor?  Or does
> it mean that the implementation should behave at _run-time_ as if it
> were executing the above code?
>
> On Wed Jan 05 2000, 03:51, James Kuyper <kuyper@wizard.net> wrote:
>
> > Keep in mind that the implementor is responsible for both std::sentry<>
> > and std::ctype<>; they can't use that typical implementation unless they
> > know that std::ctype<> was implemented in such a way that std::sentry<>
> > meets its requirements.
>
> Does paragraph 4 mean that std::basic_istream<>::sentry<> should behave
> in _every_ aspect -- compile-time, link-time, run-time, etc. -- as
> though it used the above code?
>
> If so, then I agree with you, Dietmar.  If not, then my question is
> really, how else could std::basic_istream<>::sentry<> possibly be
> implemented so that it could be instantiated correctly using a
> user-defined character type?

Since there has been no response to these questions, would it be
appropriate to submit a library issue to the standard committee?

Or does someone have a definitive answer that can be backed up by the
current standard?

--
Matt McClure
Concurrent Computer Corporation








Author: Dietmar Kuehl <dietmar.kuehl@claas-solutions.de>
Date: 2000/01/17
Hi,
In article <vl33drwd5tu.fsf@amber2.ccur.com>,
  Matt McClure <matt.mcclure@ccur.com> wrote:
> > Does paragraph 4 mean that std::basic_istream<>::sentry<> should
> > behave in _every_ aspect -- compile-time, link-time, run-time,
> > etc. -- as though it used the above code?

My guess is that you are not that much interested in my view as I have
already expressed it... but I definitely think that it has to behave
in every aspect as if it used the code. In particular, it is expected
to behave as if it uses the ctype facet.

> Since there has been no response to these questions, would it be
> appropriate to submit a library issue to the standard committee?

Since I will be at the next meeting it might be reasonable to write
both your name and mine as the submitter on the library issue: The
people in the library working group will then ask me what this issue
is about and I can tell them, explaining the concerns if necessary...

> Or does someone have a definitive answer that can be backed up by the
> current standard?

I think it is covered by the "as if" rule.
--
<mailto:dietmar.kuehl@claas-solutions.de>
homepage: <http://www.informatik.uni-konstanz.de/~kuehl>








Author: Matt McClure <matt.mcclure@ccur.com>
Date: 2000/01/04
Thanks for your help so far.

On Sat Dec 25 1999, 20:43, Dietmar Kuehl
<dietmar.kuehl@claas-solutions.de> wrote:

> In article <hp902pqwly.fsf@shell.faradic.net>,
>   Matt McClure <matthew.mcclure.es.99@aya.yale.edu> wrote:
> > Is an implementation required to provide definitions for the ctype
> > virtual member functions in section 22.2.1.1.2 of the standard?
>
> No, it is not. An implementation is only required to support the
> two specializations 'std::ctype<char>' and 'std::ctype<wchar_t>'. This
> is, of course, a portability problem: Neither can a user rely on any
> other specialization being present [...]

You say a user cannot rely on any other specialization being present,
but it appears that there is some example code in the standard which
does just that, namely the "typical implementation of the sentry
constructor" in 27.6.1.1.2(6).

Here is the scenario causing a problem with our implementation: we have
a test that defines a user-defined character type (including a
char_traits specialization), and then explicitly instantiates
std::getline() using basic_istream, char_traits, basic_string, and
allocator, all parameterized by the user-defined type.  std::getline()
calls the basic_istream::sentry constructor, which, similarly to the
standard's example code, initializes a ctype object.  Since
basic_istream was parameterized by the user-defined type, so is the
sentry, and therefore, the ctype object is as well.  But we have no
definitions for the ctype virtual member functions for the user-defined
type, so the implementation is unable to instantiate the ctype object,
causing link-time errors.

So what piece of the puzzle is broken?  Is it the test, the
implementation, or the standard?

Should a user be able to instantiate std::getline() using a user-defined
character type (ccvs_char) as a template parameter as in the following code:

   template std::basic_istream<ccvs_char, std::char_traits<ccvs_char> >&
   std::getline<>(
      std::basic_istream<ccvs_char, std::char_traits<ccvs_char> >&,
      std::basic_string<ccvs_char, std::char_traits<ccvs_char>,
                        std::allocator<ccvs_char> >&);

Thanks again.
--
Matt McClure
Concurrent Computer Corporation
954-973-5120








Author: James Kuyper <kuyper@wizard.net>
Date: 2000/01/05
Matt McClure wrote:
>
> Thanks for your help so far.
>
> On Sat Dec 25 1999, 20:43, Dietmar Kuehl
> <dietmar.kuehl@claas-solutions.de> wrote:
>
> > In article <hp902pqwly.fsf@shell.faradic.net>,
> >   Matt McClure <matthew.mcclure.es.99@aya.yale.edu> wrote:
> > > Is an implementation required to provide definitions for the ctype
> > > virtual member functions in section 22.2.1.1.2 of the standard?
> >
> > No, it is not. An implementation is only required to support the
> > two specializations 'std::ctype<char>' and 'std::ctype<wchar_t>'. This
> > is, of course, a portability problem: Neither can a user rely on any
> > other specialization being present [...]
>
> You say a user cannot rely on any other specialization being present,
> but it appears that there is some example code in the standard which
> does just that, namely the "typical implementation of the sentry
> constructor" in 27.6.1.1.2(6).

Keep in mind that the implementor is responsible for both std::sentry<>
and std::ctype<>; they can't use that typical implementation unless they
know that std::ctype<> was implemented in such a way that std::sentry<>
meets its requirements.

> Here is the scenario causing a problem with our implementation: we have
> a test that defines a user-defined character type (including a
> char_traits specialization), and then explicitly instantiates
> std::getline() using basic_istream, char_traits, basic_string, and
> allocator, all parameterized by the user-defined type.  Std::getline()
> calls the basic_istream::sentry constructor, which, similarly to the
> standard's example code, initializes a ctype object.  Since
> basic_istream was parameterized by the user-defined type, so is the
> sentry, and therefore, the ctype object is as well.  But we have no
> definitions for the ctype virtual member functions for the user-defined
> type, so the implementation is unable to instantiate the ctype object,
> causing link-time errors.

Well, the solution is clear: provide a specialization of std::ctype<>
for your user-defined character type.
---





Author: Dietmar Kuehl <dietmar.kuehl@claas-solutions.de>
Date: 2000/01/05
In article <vl3hfgt3kbz.fsf@amber2.ccur.com>,
  Matt McClure <matt.mcclure@ccur.com> wrote:
>You say a user cannot rely on any other specialization being present,
>but it appears that there is some example code in the standard which
>does just that, namely the "typical implementation of the sentry
>constructor" in 27.6.1.1.2(6).

The tricky part is that the standard does not spell out the requirements
a user has to fulfill to use the stream classes with a character type
different from 'char' and 'wchar_t'. These are, however, non-trivial
and, honestly, I haven't really tested what is necessary. Here is an
overview of what is at least required:

- You need to define a character type. Using a standard type as a
  character type is only portable for 'char' and 'wchar_t': The library
  might specialize the relevant templates for all standard types
  although it is only required to specialize for 'char' and 'wchar_t'.
  Thus, it is neither safe to specialize on any of the standard types
  nor is it safe to rely on the implementation to provide
  specializations.

- You need to provide a traits type for the character type. Whether this
  is a specialization of 'std::char_traits' or just some arbitrary
  traits type defining the corresponding members is up to you.
  However, specializing 'std::char_traits' is probably the most
  convenient approach because this type is used as default template
  argument and thus it is not necessary to provide this type all the
  time.

- For many operations you need to provide a specialization of
  'std::ctype' for your character type, install it in some 'std::locale'
  object and make sure this 'std::locale' object is installed in the
  relevant streams (either by making it the global locale object or by
  'imbue()'ing it to the relevant streams). Most notably, this is
  necessary if you are using any of the input functions because all of
  these construct a 'std::basic_istream::sentry' object which uses the
  corresponding 'std::ctype' object. The 'std::ctype' facet is also used
  by the functions for formatted numeric I/O, more specifically by the
  virtual functions of the 'std::num_put' and 'std::num_get' facets.
  There are a few other functions which also use the 'std::ctype' facet.

- If you want to use files with your character type, you need to provide
  a specialization of 'std::codecvt'. This facet is only used by the
  file streams and is not necessary if you can go without file streams.
  If you need this facet, it has to be installed into a 'std::locale'
  object which in turn is to be installed into a stream as outlined
  above for 'std::ctype'.

- If you want to use the functions for numeric formatting, you have to
  provide at least a specialization of 'std::numpunct' and install it
  into a 'std::locale' object. The default implementation of numeric
  formatting and parsing will use the corresponding 'std::ctype' facet
  from the same 'std::locale' object. To avoid this, you would have to
  specialize 'std::num_put' and 'std::num_get'. However, specialization
  is not really required. But what is required to use the numeric
  functions is that the corresponding facets are installed in the
  'std::locale' object. Although it is possible for the implementation
  to install all specializations which are actually used automatically
  (eg. using a little linker hackery to find out about the used
  specializations), an implementation is not required to do so and all
  implementations I have seen (including my own one) don't do it
  automatically.

This is the absolute minimum a user has to do! I think this should also
be sufficient but I'm not sure about this one. The reason why the user
has to provide specializations for 'std::ctype', 'std::codecvt', and
'std::numpunct' is simply that the implementation cannot know how
the user defined character type has to be used.

Basically, 'std::ctype' is used to tell the library how certain
characters, eg. digits and certain letters, are represented in the user
defined character. I'm pretty sure that the library only depends on the
following semantics to work correctly:

- the 'do_is()', 'do_scan_is()', and 'do_scan_not()' functions have to
  correctly detect white spaces, digits, and letters.
- the 'do_widen()' and 'do_narrow()' functions have to convert
  correctly the characters "0123456789abcdefABCDEFxX+-\n" between
  'char' and the character type.

I don't think that 'do_toupper()' and 'do_tolower()' are used by the
library. Correspondingly, only the character classes for digits, spaces,
and alphas are used. This might make it easier to implement a
specialization for 'std::ctype'.
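
[Illustrative sketch] A deliberately incomplete 'std::ctype'
specialization along the lines described above might look as follows.
The type my_char, its member 'code', and the ASCII-based classification
are assumptions invented for this sketch. Only 'is()', 'widen()' and
'narrow()' are provided, so this is not a conforming replacement for the
full facet interface (the scan, case conversion and range overloads are
left out):

```cpp
#include <cstddef>
#include <locale>

// Hypothetical character type, invented for this sketch.
struct my_char {
    unsigned short code;
};

namespace std {
// Incomplete specialization: just enough to classify spaces and digits
// and to convert single characters to and from 'char'.
template <>
class ctype<my_char> : public locale::facet, public ctype_base {
public:
    typedef my_char char_type;
    static locale::id id;  // each facet type needs its own id object

    explicit ctype(size_t refs = 0) : locale::facet(refs) {}

    bool is(mask m, my_char c) const { return do_is(m, c); }
    my_char widen(char c) const { return do_widen(c); }
    char narrow(my_char c, char dfault) const { return do_narrow(c, dfault); }

protected:
    // Assumes 'code' holds ASCII values; only space and digit are handled.
    virtual bool do_is(mask m, my_char c) const {
        const bool ws = c.code == ' ' || (c.code >= '\t' && c.code <= '\r');
        const bool dg = c.code >= '0' && c.code <= '9';
        return ((m & space) != 0 && ws) || ((m & digit) != 0 && dg);
    }
    virtual my_char do_widen(char c) const {
        my_char r;
        r.code = static_cast<unsigned char>(c);
        return r;
    }
    virtual char do_narrow(my_char c, char dfault) const {
        return c.code < 128 ? static_cast<char>(c.code) : dfault;
    }
};

locale::id ctype<my_char>::id;
}  // namespace std
```

The facet is installed with something like
'std::locale loc(std::locale::classic(), new std::ctype<my_char>);'
and then retrieved with 'std::use_facet<std::ctype<my_char> >(loc)'.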

The code conversion stuff is absolutely crucial when doing file I/O.
Basically what this does is to convert the character type to a
representation using 'char's because it is assumed that only 'char's
can be read and written to files. If it is possible to read and write
characters differently and probably more efficiently, it might be
reasonable to avoid the use of file streams and define a corresponding
stream buffer instead. Of course, the stream buffer can be a
specialization of 'std::basic_filebuf'. Well, I'm not really sure about
the restrictions for user defined specializations, ie. whether the user
can freely define the members of a specialization of
'std::basic_filebuf' without taking the standard semantics into account.

'std::numpunct' could have been defined in terms of 'std::ctype' but
it is not. Thus, a specialization for 'std::numpunct' is necessary to
provide the representations of the Boolean values, the thousands
separators, and the decimal point. The grouping can be identical to
the one for standard specializations.
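
[Illustrative sketch] To show which pieces 'std::numpunct' supplies, the
following example stays with plain 'char' and derives from
'std::numpunct<char>' instead of specializing for a user-defined type.
The names comma_punct and format_with_comma_punct are hypothetical. For
a user-defined character type the same members would have to come from
a user-provided specialization:

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: overrides the decimal point and the Boolean names,
// and inherits the default (empty) grouping.
struct comma_punct : std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
    std::string do_truename() const override { return "yes"; }
    std::string do_falsename() const override { return "no"; }
};

// Formats a number and a flag through a stream whose locale carries the
// facet above.
std::string format_with_comma_punct(double value, bool flag) {
    std::ostringstream os;
    os.imbue(std::locale(std::locale::classic(), new comma_punct));
    os << value << ' ' << std::boolalpha << flag;
    return os.str();
}
```

With this facet imbued, 1.5 is written with a comma as decimal point and
'true' is written as "yes".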

>So what piece of the puzzle is broken?  Is it the test, the
>implementation, or the standard?

If you are referring to the user code when you say "the test", it is the
test. The library implementation and the standard are fine: The library
because it is not required to do anything and the standard because
an implementation cannot guess the semantics of unknown types.
However, the user does not have to guess but can provide the
necessary pieces of information which are distributed over various
classes.
>
>Should a user be able to instantiate std::getline() using a
>user-defined character type (ccvs_char) as a template parameter as in
>the following code:
>
>   template std::basic_istream<ccvs_char, std::char_traits<ccvs_char> >&
>   std::getline<>(std::basic_istream<ccvs_char,
>   std::char_traits<ccvs_char> >&, std::basic_string<ccvs_char,
>   std::char_traits<ccvs_char>, std::allocator<ccvs_char> >&);

This code is part of the story and the user should be able to do
something like this (I haven't tested whether this code is correct but
it looks reasonable). However, the user has to provide at least also
a specialization of 'std::ctype<ccvs_char>' and install a 'std::locale'
object into any stream on which the above function is used. The
'std::ctype<ccvs_char>' facet is used to determine white spaces in the
'sentry' object (although this code is never executed when only using
'getline()'!) and to provide the end of line character using 'widen()'
on the character '\n'.
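
[Illustrative sketch] With plain 'char', both effects can be seen
directly: 'getline()' does not skip leading white space, and the
delimiter can be spelled as the widened '\n'. The function name
first_line is hypothetical:

```cpp
#include <sstream>
#include <string>

// Reads the first line of text, passing the delimiter explicitly as
// widen('\n'), which goes through the ctype facet of the stream's locale.
std::string first_line(const std::string& text) {
    std::istringstream in(text);
    std::string line;
    std::getline(in, line, in.widen('\n'));
    return line;
}
```

Leading blanks in the input survive in the returned line, which shows
that the white space skipping code of the sentry is not run here.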
--
<mailto:dietmar.kuehl@claas-solutions.de>
homepage: <http://www.informatik.uni-konstanz.de/~kuehl>








Author: Matt McClure <matt.mcclure@ccur.com>
Date: 2000/01/06
On Wed Jan 05 2000, 03:51, James Kuyper <kuyper@wizard.net> wrote:

> > You say a user cannot rely on any other specialization being present,
> > but it appears that there is some example code in the standard which
> > does just that, namely the "typical implementation of the sentry
> > constructor" in 27.6.1.1.2(6).
>
> Keep in mind that the implementor is responsible for both std::sentry<>
> and std::ctype<>; they can't use that typical implementation unless they
> know that std::ctype<> was implemented in such a way that std::sentry<>
> meets its requirements.

Is it required that std::sentry<> be instantiable for a user-defined
char-like type, for which the user has only provided std::char_traits<>?

It sounds as though you're saying std::sentry must be instantiable, but
Dietmar Kuehl <dietmar.kuehl@claas-solutions.de> has suggested that
this is not the case unless the user has provided std::ctype<> as well.
He sent me a very thorough personal reply to my last post that I believe
he is planning to post here as well.

> > Here is the scenario causing a problem with our implementation: we have
> > a test that defines a user-defined character type (including a
> > char_traits specialization), and then explicitly instantiates
> > std::getline() using basic_istream, char_traits, basic_string, and
> > allocator, all parameterized by the user-defined type.  Std::getline()
> > calls the basic_istream::sentry constructor, which, similarly to the
> > standard's example code, initializes a ctype object.  Since
> > basic_istream was parameterized by the user-defined type, so is the
> > sentry, and therefore, the ctype object is as well.  But we have no
> > definitions for the ctype virtual member functions for the user-defined
> > type, so the implementation is unable to instantiate the ctype object,
> > causing link-time errors.
>
> Well, the solution is clear: provide a specialization of std::ctype<>
> for your user-defined character type.

Just to clarify, I am the implementor, not the user.

I have received an email from the user (the company that writes the test
suite we use for our compiler) that says the standard only requires them
to provide a definition of the char-like type and a definition of
std::char_traits<> compliant with Table 37 in order to explicitly
instantiate std::getline<>.  Does the standard explicitly specify what
is required for a user-defined type?  If so, where?  Dietmar's
explanation _suggested_ what should be required, but didn't point to
specific sections of the standard.

--
Matt McClure
Concurrent Computer Corporation
---





Author: Dietmar Kuehl <dietmar.kuehl@claas-solutions.de>
Date: 2000/01/07
Hi,
In article <vl3r9fwh4j1.fsf@amber2.ccur.com>,
  Matt McClure <matt.mcclure@ccur.com> wrote:
> On Wed Jan 05 2000, 03:51, James Kuyper <kuyper@wizard.net> wrote:
> Is it required that std::sentry<> be instantiable for a user-defined
> char-like type, for which the user has only provided
> std::char_traits<>?

No. Let's see how this statement can be proved using the standard.

Setting the stage:

In 22.1.1.1.1 (lib.locale.category) are two tables describing groups
of facets. Basically, standard facets fall into three groups:

- Those provided by the standard C++ library which are also installed
  in all 'std::locale' objects. Examples are 'std::ctype<char>' and
  'std::num_put<char, std::ostreambuf_iterator<char> >'. All facets in
  this group are listed in the table in paragraph 2 (Locale Category
  Facets).

- Those provided by the standard C++ library which are not required to
  be installed in all 'std::locale' objects (although it is possible
  at least with some link time magic to install them when needed).
  An example is 'std::num_put<char, char*>'. All facets in this group
  are listed in the table in paragraph 4 (Required Instantiations)
  although some of them are only listed in terms of formal types, ie.
  this group of facets is actually infinite.

- Those not even provided by the standard C++ library. There is no
  listing of these facets but since all others are listed in one of
  two tables the facets in this group are the instantiations for all
  valid types not covered by the other two tables. An example of this
  group is 'std::ctype<my_character_type>' where 'my_character_type'
  is a type suitable as character type (I think it is required to be
  a POD but I'm not sure about this one).

In particular, the standard library is, according to 22.1.1.1.1, not
required to provide the instantiations 'std::ctype<my_character_type>',
'std::codecvt<my_character_type, char, mbstate_t>' and
'std::numpunct<my_character_type>' (unless 'my_character_type' is a
typedef for 'char' or 'wchar_t'; I will assume that this is not the case
in the rest of this article).

Although it could be possible to define 'std::numpunct' in terms of
'std::ctype', this is effectively impossible because the functions in
the 'std::numpunct' facet don't know which 'std::locale' object to use.
For all three class templates the implementation cannot define suitable
semantics for unknown character types: It does not know anything about
the character type except that it has default and copy constructors, a
copy assignment, and a destructor. Thus, if the user uses a user
defined character type, suitable instantiations of these class
templates also have to be provided.

In addition to the mere instantiation of the facet types, the user has
to make sure that corresponding objects are installed in the used
'std::locale' object because otherwise these objects will never be
used and a 'std::bad_cast' exception will be thrown instead (if
'std::use_facet()' does not find a facet in an 'std::locale' object it
throws this exception).
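
[Illustrative sketch] The 'std::bad_cast' behaviour can be reproduced
with any facet type that is not installed in a locale. The facet
demo_facet and the helper lookup_throws_bad_cast are hypothetical names
used only for this sketch:

```cpp
#include <cstddef>
#include <locale>
#include <typeinfo>

// Hypothetical user-defined facet with no behaviour of its own.
struct demo_facet : std::locale::facet {
    static std::locale::id id;
    explicit demo_facet(std::size_t refs = 0) : std::locale::facet(refs) {}
};
std::locale::id demo_facet::id;

// Returns true if looking up demo_facet in loc throws std::bad_cast.
bool lookup_throws_bad_cast(const std::locale& loc) {
    try {
        (void)std::use_facet<demo_facet>(loc);
        return false;
    } catch (const std::bad_cast&) {
        return true;
    }
}
```

The lookup throws for the classic locale and succeeds for a locale
built as 'std::locale(std::locale::classic(), new demo_facet)'.
'std::has_facet<demo_facet>(loc)' gives the same answer without an
exception.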

Now, let's bring streams into play. Can a user instantiate a
stream just with a user defined character type and a corresponding
traits class? Sure. For example:

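  #include <sstream>
  #include <string>   // declares the primary std::char_traits template

  struct my_char {
    my_char(): m_char(0) {}
    my_char(bool, char c): m_char(c) {}
    char m_char;
  };

  namespace std {
    // A specialization has to be declared in the template's namespace.
    template <>
    struct char_traits<my_char> {
      // lots of stuff...
    };
  }

  int main() {
    std::basic_stringstream<my_char> stream;
  }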

Although I think the standard guarantees that this should work, I doubt
that any current implementation allows it (at least those I checked,
ie. Dinkumware shipping with VC++6.0, STLport ??, and cxxrt, don't)!
The tricky parts are the functions 'init()' and 'fill()': After calling
'init()' there is a guarantee on the value of 'fill()', namely that it
returns "widen(' ')". Unfortunately, 'widen()' uses 'ctype<my_char>'.
However, since I never call 'fill()', I don't think that it is legal
to check for the presence of a suitable 'std::ctype<my_char>'. Anyone
with a better insight into the standard care to comment? I really
would like to know what is supposed to happen in such a case... I can
definitely hack around this problem, eg. by providing a non-functional
default implementation for 'std::ctype<C>' which only uses the
operations required for a character type in its implementation.
Currently SGI's implementation doesn't even provide a definition of
'std::ctype' for the general case and only specializes for 'char' and
'wchar_t'. Dinkumware and I provide a definition but these assume some
operations which are not present in all valid character types (eg. a
conversion to 'char'). I'm not sure whether any of these approaches is
really standard conforming.

Actually, it is relatively important that the fill character is *not*
initialized by the 'std::basic_ios<..>::init()' method: Doing it in the
'init()' method requires that an instantiation of 'std::ctype<my_char>'
is installed in the global 'std::locale' object. Hardly something which
is desired... How this is handled by the implementation is, of course,
up to the implementation (I have some ideas but I don't have to tell
everybody what these are; Howard once confirmed explicitly my suspicion
that the competition is listening), as long as it works correctly.
Deferring the initialization of 'fill()' until 'fill()' is used allows
setting of some locale object using 'imbue()'. However, if 'fill()' is
called prior to initialization of a suitable 'std::locale' object I
think some error handling, i.e. either an exception or setting of some
error bit, is in order. I don't think that the standard gives any clear
statement about what is going to happen in this case. Do you guys feel bored
at the committee meetings? In this case I think I can file some defect
reports on rather esoteric stuff.

Anyway... Once the library is "corrected", this should work (I just
handled most of the stuff for my implementation but there is still some
work to do...). Actually, there are even a few operations on the stream
which are legal, e.g. calling 'rdbuf()' and reading/writing of
characters using the stream buffer directly.
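
As a rough illustration of what "using the stream buffer directly"
means, here is a sketch which only touches the buffer obtained from
'rdbuf()'. It uses plain 'char' to stay portable; the helper name
'round_trip_via_rdbuf' is made up for this example:

```cpp
#include <sstream>
#include <string>

// Writes 'text' through the stream buffer only and reads it back the same
// way; neither formatted nor unformatted stream functions are called, so
// no sentry object is constructed.
std::string round_trip_via_rdbuf(const std::string& text) {
    typedef std::char_traits<char> traits;

    std::stringstream stream;
    std::streambuf* buf = stream.rdbuf();
    buf->sputn(text.data(), static_cast<std::streamsize>(text.size()));

    std::string result;
    for (traits::int_type c = buf->sbumpc();
         !traits::eq_int_type(c, traits::eof());
         c = buf->sbumpc()) {
        result += traits::to_char_type(c);
    }
    return result;
}
```

Only operations from the traits class are needed here, which is why this
part can be expected to work for a user defined character type, too.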

Of course, this is not particularly helpful. As it turns out, we can't
use many of the other functions in the streams. For example, we cannot
use any of the formatted output functions as these *might* call 'fill()'
which in turn requires at least a definition of the corresponding
'std::ctype' instantiation. I think it is possible to use the
unformatted output functions. But then, I first thought that this would
also be true for the formatted output functions until I thought about
the fill character...
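
To make the role of the fill character concrete, here is a small sketch
using the standard character types. The helper names are invented for
this illustration:

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// True if a freshly constructed stream reports widen(' ') as its fill
// character, which is the guarantee given for the state after init().
bool default_fill_is_widened_space() {
    std::wostringstream os;
    return os.fill() == os.widen(' ');
}

// Formats 'value' in a field of 'width' characters. The padding is done
// with fill(), so formatted output may need the fill character.
std::string padded(int value, int width, char fill_char) {
    std::ostringstream os;
    os.fill(fill_char);
    os << std::setw(width) << value;
    return os.str();
}
```

For example, 'padded(42, 6, '*')' yields "****42": four fill characters
followed by the two digits.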

Now what about input, which brings us back to the original question:
Does use of 'std::basic_istream<..>::sentry' require a user defined
'std::ctype' specialization? The obvious answer is: Yes, it requires
a user specialization of this class.

Let's bring the relevant sections together: In 27.6.1.2.1
(lib.istream.formatted.reqmts) for formatted input and in 27.6.1.3
(lib.istream.unformatted) it is stated that all input functions start
execution by constructing a 'sentry' object. That is, the
implementation can't get away with avoiding the construction of this
offender. However, it might pull some tricks to avoid the conditional
for the second constructor argument of the 'sentry' object for
unformatted input functions: It is always 'true'. Thereby an
implementation might avoid the need for the 'std::ctype' specialization
for unformatted input functions. However, it is not required to do
this optimization.

OK, it is necessary to construct the 'sentry' object. Is the use of
'std::ctype' also required? Sure, according to 27.6.1.1.2
(lib.istream::sentry) paragraph 2 the constructor of 'sentry' skips
whitespace and paragraphs 3 and 4 make very explicit how this is going
to happen, namely by using 'std::ctype'.
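
That the whitespace decision really goes through the 'std::ctype' facet
of the stream's locale can be observed with 'char' streams. The
following is a sketch only; the class 'comma_is_space' and the function
'split_words' are hypothetical names introduced for this example:

```cpp
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// ctype<char> variant which additionally classifies ',' as whitespace.
class comma_is_space : public std::ctype<char> {
public:
    comma_is_space() : std::ctype<char>(make_table()) {}

private:
    // Copies the classic classification table and marks ',' as space.
    static const mask* make_table() {
        static std::vector<mask> table(classic_table(),
                                       classic_table() + table_size);
        table[static_cast<unsigned char>(',')] |= space;
        return &table[0];
    }
};

// Extracts all words from 'text' with operator>>. Each extraction builds
// a sentry which skips whatever the imbued ctype facet reports as space.
std::vector<std::string> split_words(const std::string& text,
                                     const std::locale& loc) {
    std::istringstream in(text);
    in.imbue(loc);
    std::vector<std::string> words;
    std::string word;
    while (in >> word) {
        words.push_back(word);
    }
    return words;
}
```

With a locale built as 'std::locale(std::locale::classic(), new
comma_is_space)' the text "a,b c" is split into three words; with the
classic locale it is split into the two words "a,b" and "c".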

Now it is established that 'std::ctype<my_char>' is used by the input
functions (at least potentially). Unfortunately, there is no statement
about what is going to happen if this class is not present or if a
corresponding object is not installed in the corresponding locale
object. Since the instantiation is used in the code, the class has to
be at least defined. Since the standard library is only required to
provide the instantiations for 'char' and 'wchar_t' it is already
unclear whether it is required or even allowed to also provide a class
definition for the general case. In any case, the library implementation
is not required to provide an implementation of an instantiation for any
character type other than 'char' and 'wchar_t' (see above). As a result,
the user has to specialize 'std::ctype' if the input and output
functions are to be used. In addition to providing the specialization,
it is also necessary to install a corresponding object in the locale
object used.
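
A rough sketch of what such a user specialization and its installation
might look like for the 'my_char' type from above. Only 'is()' for a
single character and 'widen()' are shown; a specialization usable with
the streams would have to provide the complete interface from 22.2.1.1,
and delegating to 'std::ctype<char>' is just an assumption made for this
example:

```cpp
#include <cstddef>
#include <locale>

// The user defined character type from the example above.
struct my_char {
    my_char() : m_char(0) {}
    my_char(bool, char c) : m_char(c) {}
    char m_char;
};

namespace std {
// Partial sketch of a user specialization: a facet needs to derive from
// locale::facet and to provide a static member 'id'.
template <>
class ctype<my_char> : public locale::facet, public ctype_base {
public:
    typedef my_char char_type;
    explicit ctype(size_t refs = 0) : locale::facet(refs) {}

    bool is(mask m, my_char c) const { return do_is(m, c); }
    my_char widen(char c) const { return do_widen(c); }

    static locale::id id;

protected:
    // Assumption: classification simply follows the wrapped 'char'.
    virtual bool do_is(mask m, my_char c) const {
        return use_facet<ctype<char> >(locale::classic()).is(m, c.m_char);
    }
    virtual my_char do_widen(char c) const { return my_char(true, c); }
};

locale::id ctype<my_char>::id;
}  // namespace std
```

An object is installed with 'std::locale loc(std::locale::classic(), new
std::ctype<my_char>)', after which
'std::use_facet<std::ctype<my_char> >(loc)' returns the facet instead of
throwing.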

So far we have:
- It is sufficient to provide a character type and corresponding traits
  to create a stream and for basic output using the stream plus the
  operations, both input and output, on the stream buffer (as long as
  the stream buffer isn't a 'basic_filebuf').

- For formatted output and any form of input using a stream rather than
  a stream buffer it is necessary to provide a specialization of
  'std::ctype' for the user defined character type.

What is still open is formatted numeric I/O and file I/O. For numeric
I/O the situation is quite obvious: The stream classes use the classes
'std::num_put<my_char>' and 'std::num_get<my_char>'. These are provided
for all character types by the implementation. All that is necessary is
to install corresponding objects in the 'std::locale' object. Except
that the standard implementations of 'std::num_put' and 'std::num_get'
also use 'std::numpunct' which is only provided for the character types
'char' and 'wchar_t'. BTW, these classes also use 'std::ctype' to
obtain "widened" forms of the digits used. The relevant sections on
numeric formatting are:

- 27.6.1.2.2, lib.istream.formatted.arithmetic for formatted input
- 27.6.2.5.2, lib.ostream.inserters.arithmetic for formatted output
- 22.2.2.1.2, lib.facet.num.get.virtuals paragraph 1 for the use of
  ctype and numpunct in num_get (more details are in the following
  paragraphs of the same section eg. in paragraph 8)
- 22.2.2.2.2, lib.facet.num.put.virtuals paragraph 15 for the use of
  numpunct in num_put and paragraph 14 for the use of ctype in num_put

To get numeric formatting to work with a user defined character type,
the user has to provide a suitable instantiation of 'std::numpunct'.
I don't think that it would be sufficient to derive classes from the
corresponding 'std::num_get' and 'std::num_put' specializations and
install these in the locale object (it would be much harder anyway).
However, defining a suitable 'std::numpunct' is not hard if you know
the character type used.
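
To give an idea of how little a 'std::numpunct' facet has to provide,
here is a sketch for plain 'char' which derives from the existing
'std::numpunct<char>'. For a user defined character type a full
specialization would be needed instead; the names 'dotted_punct',
'dotted_locale', 'format_number' and 'parse_number' are made up for
this example:

```cpp
#include <locale>
#include <sstream>
#include <string>

// Illustrative facet: '.' separates groups of three digits and ',' is
// the decimal point.
class dotted_punct : public std::numpunct<char> {
protected:
    virtual char do_decimal_point() const { return ','; }
    virtual char do_thousands_sep() const { return '.'; }
    virtual std::string do_grouping() const { return "\3"; }
};

// Classic locale with the numpunct facet replaced.
std::locale dotted_locale() {
    return std::locale(std::locale::classic(), new dotted_punct);
}

// num_put consults the installed numpunct facet when writing.
std::string format_number(long value) {
    std::ostringstream os;
    os.imbue(dotted_locale());
    os << value;
    return os.str();
}

// num_get consults the same facet when reading.
long parse_number(const std::string& text) {
    std::istringstream is(text);
    is.imbue(dotted_locale());
    long value = 0;
    is >> value;
    return value;
}
```

With this facet 'format_number(1234567)' produces "1.234.567" and
'parse_number("1.234.567")' reads the value back.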

This leaves file I/O open. While implementing 'std::ctype' and
'std::numpunct' is fairly easy, file I/O using 'basic_filebuf' requires
an implementation of 'std::codecvt' (27.8.1.1, lib.filebuf, paragraph 5
and several other sections in 27.8) which is not that easy.

As for the ctype facet, the implementation is not required to provide
the numpunct and codecvt facets for any character type other than
'char' and 'wchar_t' (and for the code conversion facets it also
depends on the other two template arguments, of course). Thus, the
burden of implementing these is put on the user.
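
The facets which are required can be checked for directly. A small
sketch (the function names are invented here) which queries the classic
locale for the 'char' and 'wchar_t' versions of the three facets
discussed:

```cpp
#include <cwchar>
#include <locale>

// True if the classic locale holds the ctype, numpunct and codecvt
// facets for both standard character types.
bool classic_has_required_facets() {
    const std::locale& c = std::locale::classic();
    return std::has_facet<std::ctype<char> >(c)
        && std::has_facet<std::ctype<wchar_t> >(c)
        && std::has_facet<std::numpunct<char> >(c)
        && std::has_facet<std::numpunct<wchar_t> >(c)
        && std::has_facet<std::codecvt<char, char, std::mbstate_t> >(c)
        && std::has_facet<std::codecvt<wchar_t, char, std::mbstate_t> >(c);
}

// The char-to-char conversion facet used for narrow file streams is a
// degenerate conversion, i.e. it never needs to convert anything.
bool narrow_codecvt_is_noconv() {
    typedef std::codecvt<char, char, std::mbstate_t> cvt;
    return std::use_facet<cvt>(std::locale::classic()).always_noconv();
}
```

Both functions should return 'true' on a conforming implementation. The
same 'std::has_facet' query for a user defined character type is what
tells whether the user has installed the corresponding facet.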

> Dietmar Kuehl <dietmar.kuehl@claas-solutions.de>, has suggested that
> this is not the case unless the user has provided std::ctype<> as
> well. He sent me a very thorough personal reply to my last post that
> I believe he is planning to post here as well.

Yes, I have posted it and I think it is more than just a suggestion.
OK, I left out the relevant section numbers but these are not too hard
to find...

> Just to clarify, I am the implementor, not the user.

Interesting aspect: Helping out the competition again :-) For whom are
you implementing the standard C++ library ... and why? After all, there
are two completely free implementations of iostreams and locales
available (the one from SGI and mine).

> I have received an email from the user (the company that writes the
> test suite we use for our compiler) that says the standard only
> requires them to provide a definition of the char-like type and a
> definition of std::char_traits<> compliant with Table 37 in order to
> explicitly instantiate std::getline<>.  Does the standard explicitly
> specify what is required for a user-defined type?  If so, where?
> Dietmar's explanation _suggested_ what should be required, but didn't
> point to specific sections of the standard.

There are some explicit requirements on the types. However, there are
also some implicit ones. Since the operations of the IOStream library
are described, at least partially, in terms of the classes 'std::ctype',
'std::numpunct', and 'std::codecvt' these types are also used. There
is no requirement on the library implementation to provide
instantiations for character types other than 'char' and 'wchar_t'.
Since it is impossible to implement these instantiations without
knowing the character type, they are, in effect, not there. The library
implementation is not required to provide the instantiations (see the
tables in the locales chapter) but is required to use them (see the
cited sections in the iostreams chapter). If the user is not required to
provide them, who is? Since they are needed someone has to provide them.
--
<mailto:dietmar.kuehl@claas-solutions.de>
homepage: <http://www.informatik.uni-konstanz.de/~kuehl>




[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: Matt McClure <matt.mcclure@ccur.com>
Date: 2000/01/07
On Thu Jan 06 2000, 19:25, Dietmar Kuehl
<dietmar.kuehl@claas-solutions.de> wrote:

> In article <vl3r9fwh4j1.fsf@amber2.ccur.com>,
>   Matt McClure <matt.mcclure@ccur.com> wrote:
> > On Wed Jan 05 2000, 03:51, James Kuyper <kuyper@wizard.net> wrote:
> > Is it required that std::sentry<> be instantiable for a user-defined
> > char-like type, for which the user has only provided
> > std::char_traits<>?
>
> No. Let's see how this statement can be proved using the standard.

[...]

> Now what about input, which brings us back to the original question:
> Does use of 'std::basic_istream<..>::sentry' require a user defined
> 'std::ctype' specialization? The obvious answer is: Yes, it requires
> a user specialization of this class.
>
> Let's bring the relevant sections together: In 27.6.1.2.1
> (lib.istream.formatted.reqmts) for formatted input and in 27.6.1.3
> (lib.istream.unformatted) it is stated that all input functions start
> execution by constructing a 'sentry' object.

[...]


> OK, it is necessary to construct the 'sentry' object. Is the use of
> 'std::ctype' also required? Sure, according to 27.6.1.1.2
> (lib.istream::sentry) paragraph 2 the constructor of 'sentry' skips
> whitespace and paragraphs 3 and 4 make very explicit how this is going
> to happen, namely by using 'std::ctype'.

I'm with you until the above paragraph.  You imply that the standard
requires an implementor to use std::ctype<> in order to define
std::basic_istream<>::sentry<>, but paragraph 4 says:

   4 To decide if the character c is a whitespace character, the
   constructor performs ''as if'' it executes the following code
   fragment:

      const ctype<charT>& ctype = use_facet<ctype<charT> >(is.getloc());
      if (ctype.is(ctype.space,c)!=0)
         // c is a whitespace character.

To me, it seems that the phrase "as if" means that the implementor can
do anything he/she wants as long as it behaves the same as the above
code.  But does that mean that if std::ctype<my_char> cannot be
instantiated, then neither can the std::sentry<> constructor?  Or does
it mean that the implementation should behave at _run-time_ as if it
were executing the above code?
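
For reference, the quoted fragment can be turned into a small function
template. This is only a sketch of the "as if" code, not of how any
particular implementation writes its sentry; the name 'is_whitespace' is
made up:

```cpp
#include <istream>
#include <locale>
#include <sstream>

// Whitespace test written along the lines of the fragment quoted from
// paragraph 4: the facet is taken from the locale of the stream.
template <class charT, class traits>
bool is_whitespace(const std::basic_istream<charT, traits>& is, charT c) {
    const std::ctype<charT>& ct =
        std::use_facet<std::ctype<charT> >(is.getloc());
    return ct.is(std::ctype_base::space, c);
}
```

Called with a 'std::istringstream' and ' ' it reports whitespace, and it
works the same way for a 'std::wistringstream' and L'\t'. For a user
defined character type, instantiating it needs at least a definition of
the corresponding 'std::ctype' class, and calling it needs a facet
object in the stream's locale (otherwise 'std::use_facet' throws), which
is exactly the compile-time versus run-time distinction asked about
here.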

On Wed Jan 05 2000, 03:51, James Kuyper <kuyper@wizard.net> wrote:

> Keep in mind that the implementor is responsible for both std::sentry<>
> and std::ctype<>; they can't use that typical implementation unless they
> know that std::ctype<> was implemented in such a way that std::sentry<>
> meets its requirements.

Does paragraph 4 mean that std::basic_istream<>::sentry<> should behave
in _every_ aspect -- compile-time, link-time, run-time, etc. -- as
though it used the above code?

If so, then I agree with you, Dietmar.  If not, then my question is
really, how else could std::basic_istream<>::sentry<> possibly be
implemented so that it could be instantiated correctly using a
user-defined character type?

> > Just to clarify, I am the implementor, not the user.
>
> Interesting aspect: Helping out the competition again :-) For whom are
> you implementing the standard C++ library.

For Concurrent (http://www.ccur.com/).  I'm not implementing the library
from scratch, although someone else here did just that; I'm simply
maintaining it.

--
Matt McClure
Concurrent Computer Corporation








Author: Dietmar Kuehl <dietmar.kuehl@claas-solutions.de>
Date: 1999/12/26
Hi,
In article <hp902pqwly.fsf@shell.faradic.net>,
  Matt McClure <matthew.mcclure.es.99@aya.yale.edu> wrote:
> Is an implementation required to provide definitions for the ctype
> virtual member functions in section 22.2.1.1.2 of the standard?

No, it is not. An implementation is only required to support the
two specializations 'std::ctype<char>' and 'std::ctype<wchar_t>'. This
is, of course, a portability problem: Neither can a user rely on any
other specialization being present nor can he safely specialize for
built-in types because the implementation might already provide a
corresponding specialization. Thus, it is only safe to specialize this
type for user defined types.

> If so, is it acceptable for the implementation to define the functions
> for arbitrary template arguments by converting to and from char and
> using the ctype<char> specialized functions to do the actual work?

Since an implementation is not required to have a working version of
'std::ctype' for any type other than those mentioned above, any
definition is acceptable as long as a user can specialize the class for
user defined character types.
--
<mailto:dietmar.kuehl@claas-solutions.de>
homepage: <http://www.informatik.uni-konstanz.de/~kuehl>







Author: Matt McClure <matthew.mcclure.es.99@aya.yale.edu>
Date: 1999/12/22
Is an implementation required to provide definitions for the ctype
virtual member functions in section 22.2.1.1.2 of the standard?

If so, is it acceptable for the implementation to define the functions
for arbitrary template arguments by converting to and from char and
using the ctype<char> specialized functions to do the actual work?

--
Matt
http://www.faradic.net/~mmcclure/