Topic: basic execution character set, value of digit


Author: Francis Glassborow <francis@robinton.demon.co.uk>
Date: 1999/09/08
In article <7r53p0$1h2o@enews4.newsguy.com>, Darin Adler
<darin@bentspoon.com> writes
>You can write int valwxdigit(wchar_t wc) portably without using any library
>functions with a switch statement. For the decimal digits the standard has
>the further guarantee that the digits have consecutive values starting with
>L'0', but I can't find a corresponding guarantee for the alphabetic digits.

There isn't one because there is no such requirement.
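The switch-based approach mentioned in the quoted text could be sketched roughly as follows. The name valwxdigit comes from the original question; the return value of -1 for non-hex characters is an assumption made for this illustration, and nothing like this function exists in the standard library. Because every letter is listed explicitly, the sketch does not depend on the alphabetic digits being contiguous.

```cpp
// Hypothetical helper (not a standard function): returns the value 0..15 of a
// wide hexadecimal digit from the basic character set, or -1 (an assumed
// convention) if wc is not such a digit.
int valwxdigit(wchar_t wc)
{
    switch (wc) {
    case L'0': return 0;
    case L'1': return 1;
    case L'2': return 2;
    case L'3': return 3;
    case L'4': return 4;
    case L'5': return 5;
    case L'6': return 6;
    case L'7': return 7;
    case L'8': return 8;
    case L'9': return 9;
    // Letters are enumerated one by one: no ordering or contiguity is assumed.
    case L'a': case L'A': return 10;
    case L'b': case L'B': return 11;
    case L'c': case L'C': return 12;
    case L'd': case L'D': return 13;
    case L'e': case L'E': return 14;
    case L'f': case L'F': return 15;
    default:   return -1;
    }
}
```

With this sketch, valwxdigit(L'd') yields 13, matching the example in the original question.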


Francis Glassborow      Journal Editor, Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA          +44(0)1865 246490
All opinions are mine and do not represent those of any organisation


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: Adam Spragg <adam_spragg@novaclarion.com>
Date: 1999/09/06
Markus Mauhart wrote:
>
> The following sentence is taken from 22.2.1.1.2 ("ctype virtual functions"), par 13:
>   "In addition, for any digit character c, the expression (do_narrow(c,dfault)-'0')"
>   "evaluates to the digit value of the character."

On a related note, I have been messing about with a wide string parser
that parses standard 'C' escape sequences, and came across a problem.

Given a wide char, it is possible to figure out if it's a digit using
'iswdigit(wc)', and work out the value using the method described above.
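As a rough sketch of that method, the ctype<wchar_t> facet can be asked to classify the character and then narrow it, after which subtracting '0' gives the digit value. The helper name wdigit_value, the -1 result for non-digits, and the default of the classic "C" locale are assumptions made for this illustration only.

```cpp
#include <locale>

// Hypothetical helper: decimal value of wc according to the ctype<wchar_t>
// facet of loc, or -1 (an assumed convention) if wc is not a digit there.
int wdigit_value(wchar_t wc, const std::locale& loc = std::locale::classic())
{
    const std::ctype<wchar_t>& ct = std::use_facet<std::ctype<wchar_t> >(loc);
    if (!ct.is(std::ctype_base::digit, wc))
        return -1;
    // For a digit character, narrow(c, dfault) - '0' is the digit value;
    // the dfault argument is only used when no narrow equivalent exists.
    return ct.narrow(wc, '0') - '0';
}
```

Called as wdigit_value(L'7'), this should give 7 in the classic locale.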

Similarly, it's possible to work out if the character is a hex digit
using 'iswxdigit(wc)'.

However, is it possible to figure out the equivalent value of a wide hex
character if it's not [a-fA-F], e.g. cyrillic hex notation? If so, how?

Is there an 'int valwxdigit(wchar_t wc)' function or equivalent in the
standard that will return the value of a wide hex digit? E.g.

valwxdigit(L'd');

would return 13.

TIA

Adam

--
Why is it that the smaller and easier a bug is to fix, the less I want
to actually fix it?

----------------
The opinions expressed in this email are mine alone, and do not
necessarily represent those of my employer, my parents, or the people
who wrote the email software I use.





Author: James.Kanze@dresdner-bank.com
Date: 1999/09/07
In article <7qnlii$sct$2@fleetstreet.Austria.EU.net>,
  "Markus Mauhart" <mmauhart@ping.at> wrote:
>
> The following sentence is taken from 22.2.1.1.2 ("ctype virtual
> functions"), par 13:
>   "In addition, for any digit character c, the expression (do_narrow(c,dfault)-'0')"
>   "evaluates to the digit value of the character."

> Question:
> Does the cited sentence (together with some other parts of the
> standard) imply the truth of (('i'-'0' == i) && ('i'=='0'+i) for all
> i in {0,1,2,3,4,5,6,7,8,9}) ?
> In other words, do the members {0,1,2,3,4,5,6,7,8,9} of the basic
> execution character set have numeric values 'constant + i', which
> applies both to their corresponding character literals 'i' and to
> their values when stored in a char object?

> After reading 2.2 ("character sets") I always thought that the only
> thing we know about the encoding of the [basic] execution character
> set is that the representation of '\0' has all zero bits.

I'm not sure where it is all specified (perhaps partially only by
reference to the C standard), but we do know a little bit more:

  - '\0' has the numerical value 0, and thus, all bits set to zero.
  - The digits in the *basic* character set are contiguous and in order.
  - The 26 lower case alphabetic characters in the *basic* character set
    are in order (but not necessarily contiguous).
  - The 26 upper case alphabetic characters in the *basic* character set
    are in order (but not necessarily contiguous).
  - All characters in the basic character set have non negative codes
    and a non negative representation when stored in a char.

(I'm not actually 100% sure about the third and fourth, but I seem to
recall reading it somewhere.)

These guarantees are all present in C (and widely used).
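To make the digit guarantee concrete, here is a minimal sketch that checks it at compile time and then relies on it. The helper name digit_value and its -1 result for non-digits are assumptions made for this illustration; static_assert and constexpr also postdate this thread (they need C++11).

```cpp
// Compile-time check that the decimal digits are contiguous and in order.
// On a conforming implementation this should never fire.
static_assert('1' - '0' == 1 && '2' - '0' == 2 && '3' - '0' == 3 &&
              '4' - '0' == 4 && '5' - '0' == 5 && '6' - '0' == 6 &&
              '7' - '0' == 7 && '8' - '0' == 8 && '9' - '0' == 9,
              "decimal digits must be contiguous and in order");

// Hypothetical helper: value of a decimal digit character, or -1 (an assumed
// convention) if c is not one of '0'..'9'.
constexpr int digit_value(char c)
{
    return (c >= '0' && c <= '9') ? c - '0' : -1;
}
```

The same subtraction is not safe for letters, since they need not be contiguous; in EBCDIC, for example, there is a gap between 'i' and 'j'.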

--
James Kanze                   mailto: James.Kanze@dresdner-bank.com
Conseils en informatique orientée objet/
                  Beratung in objekt orientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627










Author: "Markus Mauhart" <mmauhart@ping.at>
Date: 1999/09/03
The following sentence is taken from 22.2.1.1.2 ("ctype virtual
functions"), par 13:
  "In addition, for any digit character c, the expression (do_narrow(c,dfault)-'0')"
  "evaluates to the digit value of the character."

Question:
Does the cited sentence (together with some other parts of the standard)
imply the truth of (('i'-'0' == i) && ('i'=='0'+i) for all i in
{0,1,2,3,4,5,6,7,8,9}) ?
In other words, do the members {0,1,2,3,4,5,6,7,8,9} of the basic execution
character set have numeric values 'constant + i', which applies both to
their corresponding character literals 'i' and to their values when stored
in a char object?

After reading 2.2 ("character sets") I always thought that the only thing
we know about the encoding of the [basic] execution character set is that
the representation of '\0' has all zero bits.










Author: Valentin Bonnard <Bonnard.V@wanadoo.fr>
Date: 1999/09/03
Markus Mauhart wrote:

> Does the cited sentence (together with some other parts of the standard)
> imply the truth of (('i'-'0' == i) && ('i'=='0'+i) for all i in
> {0,1,2,3,4,5,6,7,8,9}) ?

Yes, although we knew this before the cited sentence existed.

--

Valentin Bonnard

