Topic: std::ctype::do_is


Author: "abadura" <abadura@o2.pl>
Date: Mon, 2 Oct 2006 12:04:02 CST
   The Standard in {22.2.1.1.2 ctype virtual functions
[lib.locale.ctype.virtuals]} specifies:

>>>>>>>>>>
   bool do_is(mask m, charT c) const;
   const charT* do_is(const charT* low, const charT* high, mask* vec)
const;

1 Effects: Classifies a character or sequence of characters. For each
argument character, identifies a value M of type ctype_base::mask. The
second form identifies a value M of type ctype_base::mask for each *p
where (low<=p && p<high), and places it into vec[p-low].

2 Returns: The first form returns the result of the expression (M & m)
!= 0; i.e., true if the character has the characteristics specified.
The second form returns high.
>>>>>>>>>>

   Point 2 seems to be wrong. Let's consider a situation in which we ask
for a mask equal to 0 (we can, since it is a bitmask type) for a character
which actually has "type" 0 (which the Standard permits and which happens
in Cygwin, for example, for characters outside ASCII in the "classic"
locale for "char").
   The return value should be "true" since we asked for "0" and the
"character type" of examined character is "0". But "(M & m)" gives 0
since M = m = 0 and the result is "false" then.

   Is there an error in the Standard, or is my reasoning flawed?

   Adam Badura

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]





Author: "Greg Herlihy" <greghe@pacbell.net>
Date: Tue, 3 Oct 2006 11:50:37 CST
abadura wrote:
> The Standard in {22.2.1.1.2 ctype virtual functions
> [lib.locale.ctype.virtuals]} specifies:
>
> >>>>>>>>>>
>    bool do_is(mask m, charT c) const;
>    const charT* do_is(const charT* low, const charT* high, mask* vec)
> const;
> 2 Returns: The first form returns the result of the expression (M & m)
> != 0; i.e., true if the character has the characteristics specified.
> The second form returns high.
> >>>>>>>>>>
>
>    Point 2 seems to be wrong. Let's consider a situation in which we ask
> for a mask equal to 0 (we can, since it is a bitmask type) for a character
> which actually has "type" 0 (which the Standard permits and which happens
> in Cygwin, for example, for characters outside ASCII in the "classic"
> locale for "char").

The m parameter is of a bitmask type (see [lib.bitmask.types]), so
do_is() does not match characters by comparing their values
against m's. Instead, each individual bit in m stands for a
distinct boolean property that classifies a character. The program
sets those bits in "m" that correspond to the properties it wishes to
test for (uppercase, numeric, alphanumeric, printable and so forth;
see [lib.category.ctype]). Note that these properties need not be
mutually exclusive: a single character may have more than one
matching property.

So passing a zero-valued mask (one with no bits set) would mean there
are no characters of interest to the program - so none will be found.
In short, a zero mask leaves the caller with no reason even to call the
do_is() function.
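
The behaviour described above can be checked with a minimal sketch. It
assumes the "classic" locale and the public is() member of std::ctype<char>
(which, for charT other than char, forwards to do_is()). The names has_any
and demo_empty_mask are hypothetical helpers introduced only for this
illustration.

```cpp
#include <cassert>
#include <locale>

typedef std::ctype_base::mask mask;

// Hypothetical helper: true if c has at least one of the properties in m,
// i.e. the Standard's (M & m) != 0.
bool has_any(mask m, char c)
{
    const std::ctype<char>& ct =
        std::use_facet<std::ctype<char> >(std::locale::classic());
    return ct.is(m, c);
}

void demo_empty_mask()
{
    const mask none = mask();  // value-initialized: no bits set
    const mask upper_or_digit =
        static_cast<mask>(std::ctype_base::upper | std::ctype_base::digit);

    assert(has_any(std::ctype_base::digit, '7'));   // one bit, matching
    assert(has_any(upper_or_digit, '7'));           // any one bit suffices
    assert(!has_any(std::ctype_base::upper, '7'));  // one bit, not matching
    assert(!has_any(none, '7'));                    // empty mask never matches
}
```

With an empty mask there is no bit that could match, so the result is false
whatever classification the character has.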

Greg






Author: "kanze" <kanze@gabi-soft.fr>
Date: Tue, 3 Oct 2006 12:09:24 CST
abadura wrote:
> The Standard in {22.2.1.1.2 ctype virtual functions
> [lib.locale.ctype.virtuals]} specifies:

> >>>>>>>>>>
>    bool do_is(mask m, charT c) const;
>    const charT* do_is(const charT* low, const charT* high, mask* vec)
> const;

> 1 Effects: Classifies a character or sequence of characters. For each
> argument character, identifies a value M of type ctype_base::mask. The
> second form identifies a value M of type ctype_base::mask for each *p
> where (low<=p && p<high), and places it into vec[p-low].

> 2 Returns: The first form returns the result of the expression (M & m)
> != 0; i.e., true if the character has the characteristics specified.
> The second form returns high.
> >>>>>>>>>>

>    Point 2 seems to be wrong. Let's consider a situation in which we ask
> for a mask equal to 0 (we can, since it is a bitmask type) for a character
> which actually has "type" 0 (which the Standard permits and which happens
> in Cygwin, for example, for characters outside ASCII in the "classic"
> locale for "char").

If a character has "type" 0, any call to do_is must return
false.  Period.  That's the definition of the function; we don't
even have to look at the mask.

>    The return value should be "true" since we asked for "0" and the
> "character type" of examined character is "0". But "(M & m)" gives 0
> since M = m = 0 and the result is "false" then.

>    Is there an error in the Standard, or is my reasoning flawed?

The error is that you seem to be misunderstanding the function.
It returns true if the character corresponds to at least one of
the criteria we are asking about.  If we ask about zero
criteria, then by definition, it must return false, since there
can be no "at least one of" zero things.
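
If what is wanted is the exact classification M of a character, the second
form quoted from the Standard stores it rather than testing it. The following
is only a sketch under that reading: it assumes the "classic" locale and
std::ctype<char>, and classification_of and has_no_classification are
hypothetical names, not part of the library.

```cpp
#include <cassert>
#include <locale>

typedef std::ctype_base::mask mask;

// Hypothetical helper: retrieves M for a single character by using the
// range form of is(), which places the mask of each *p into vec[p - low].
mask classification_of(char c)
{
    const std::ctype<char>& ct =
        std::use_facet<std::ctype<char> >(std::locale::classic());
    mask m = mask();
    ct.is(&c, &c + 1, &m);
    return m;
}

// True if the character has "type" 0, i.e. no classification bit at all.
bool has_no_classification(char c)
{
    return classification_of(c) == mask();
}

void demo_classification()
{
    // 'a' is alphabetic in the classic locale, so its M is not empty.
    assert((classification_of('a') & std::ctype_base::alpha) != 0);
    assert(!has_no_classification('a'));
}
```

Comparing the retrieved value with an empty mask answers the question "is
the type exactly 0?" without changing what the single-character form means.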

A character whose value in the table is 0 can be treated as
an invalid character.  The way to ask if a character is invalid
is to call is() with a mask which is the bitwise or of all of the
possible mask values (a false result then means "invalid"), not
with a mask of zero.  Defining something like:

    static const mask valid = space | print | cntrl | upper
                            | lower | alpha | digit | punct
                            | xdigit ;

would seem reasonable to me.  (Strictly speaking, the or with
xdigit isn't necessary, since all characters for which xdigit is
true also have either digit or alpha true.)  For classical
ASCII, and for Unicode wide characters, it would even seem to be
useful.  (One could even add the requirement in char_traits that
if int_type is the same as char_type, is( valid, eof() ) must
return false, and of course that, if the types are different,
eof() must not be representable in a char_type.)
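
As a rough sketch of how such a "valid" mask might be used, assuming
std::ctype<char> and, by default, the "classic" locale: valid_mask and
is_valid are hypothetical helpers written for this illustration, not library
functions.

```cpp
#include <cassert>
#include <locale>

typedef std::ctype_base::mask mask;

// Hypothetical helper: the bitwise or of all the classification values
// listed in the post above.
mask valid_mask()
{
    return static_cast<mask>(
          std::ctype_base::space | std::ctype_base::print
        | std::ctype_base::cntrl | std::ctype_base::upper
        | std::ctype_base::lower | std::ctype_base::alpha
        | std::ctype_base::digit | std::ctype_base::punct
        | std::ctype_base::xdigit);
}

// True if c carries at least one classification in the given locale;
// false means its table entry is 0, i.e. the character is "invalid".
bool is_valid(char c, const std::locale& loc = std::locale::classic())
{
    return std::use_facet<std::ctype<char> >(loc).is(valid_mask(), c);
}

void demo_valid()
{
    assert(is_valid('a'));   // lower, alpha, print
    assert(is_valid(' '));   // space, print
    assert(is_valid('\n'));  // space, cntrl
}
```

Whether characters outside ASCII are reported as invalid depends on the
implementation's table for the locale in question.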

--
James Kanze                                           GABI Software
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

