Topic: locales
Author: James Kuyper <kuyper@wizard.net>
Date: 1999/04/02
denis bider wrote:
>
> Hello folks,
>
> so it's Standard C++ time and I've been trying to learn how to use the
> locale functionality. It stopped at the tolower() function.
>
> I would expect that it is somehow possible to use the locale that is
> installed on the system automatically. Ie, the program should, on
> runtime, automatically detect that it's running in Slovenia rather than
> in the UK, therefore it should use the appropriate rules for character
> conversion. How can this be done?
>
> I.e., when I call the tolower() function, I would like it to
> automatically determine whether it should use the mapping:
>
> ABCDEFGHIJKLMNOPQRSTUVWXYZ
>} z
>
> depending on the settings of the computer it runs on.
>
> How do I do this?
The locale named "" is defined as representing the native environment.
How that native environment is determined is implementation-specific. On
many unix-like systems, it's controlled by an environment variable with
a name that matches the corresponding category, such as LC_COLLATE for
the collate category.
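
A minimal sketch of the per-category idea, assuming a POSIX-like system. It asks the C library to adopt the native environment for one category and reports the name it settled on. The helper name native_category_name is made up for illustration; setlocale returns a null pointer when the environment names a locale that is not installed, so the sketch falls back to "C" in that case.

```cpp
#include <clocale>
#include <string>

// Hypothetical helper: select the native environment for one category
// (e.g. LC_COLLATE) and return the name the C library settled on.
std::string native_category_name(int category)
{
    // "" asks for the implementation-defined native environment.
    const char* name = std::setlocale(category, "");
    if (name == nullptr) {
        // The environment named a locale that is not available here.
        name = std::setlocale(category, "C");
    }
    return name ? std::string(name) : std::string("C");
}
```

Calling native_category_name(LC_COLLATE) with LC_COLLATE set in the environment should report that setting on systems that follow this convention, provided the named locale is installed.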
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html ]
Author: David R Tribble <dtribble@technologist.com>
Date: 1999/04/02
denis bider (spam@whitehouse.gov) wrote:
> : I would expect that it is somehow possible to use the locale that is
> : installed on the system automatically. Ie, the program should, on
> : runtime, automatically detect that it's running in Slovenia rather
> : than in the UK, therefore it should use the appropriate rules for
> : character conversion. How can this be done?
Dietmar Kuehl wrote:
> By default, the global locale used is the well-known "C" locale. This
> is more or less necessary from the standard's point of view: otherwise
> the behavior of old code would suddenly change.
Yes. At program startup, the equivalent of this is performed (in C):
setlocale(LC_ALL, "C");
> However, it is easily possible to change the global locale to some
> specific locale. For example, to install a German locale, you would
> use a call like this:
>
> std::locale::global(std::locale("de_DE"));
>
> This would replace both the global C and the global C++ locale.
> Whether the C locale is affected generally depends on whether the
> locale used in the call to 'global()' has a name or not.
All of which is true. In order to detect the locale settings for
the program, which presumably inherited the settings from the
user's environment settings, do this (in C):
setlocale(LC_ALL, "");
or its moral equivalent in C++, which I think is:
std::locale::global(std::locale(""));
Specifying LC_ALL means "set the entire locale of the program", and
specifying "" means "using the implementation-defined native
environment". On most POSIX-like systems, this deduces the locale
information from the program's environment variables (which vary
from system to system: "LANG", "NLS_INFO", etc.).
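
As a hedged sketch of the C++ form above: the std::locale constructor throws std::runtime_error when it cannot build the requested locale, so a program that wants the native environment may want a fallback. The helper name install_native_locale is an invention for this example.

```cpp
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: make the user's native locale the global one and
// return its name; fall back to the classic "C" locale on failure.
std::string install_native_locale()
{
    try {
        // "" requests the implementation-defined native environment.
        std::locale::global(std::locale(""));
    } catch (const std::runtime_error&) {
        // The environment asked for a locale this system cannot provide.
        std::locale::global(std::locale::classic());
    }
    // A default-constructed locale is a copy of the current global one.
    return std::locale().name();
}
```

Calling this once at the top of main(), before any streams are created or imbued, is the usual place for it; streams constructed earlier keep the locale they were created with.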
(As a side note, I have noticed that 'setlocale(LC_ALL, "")' on
HP-UX 10.2x causes a null pointer to be dereferenced within the
standard C/C++ library, but apparently without ill effects to
the execution of the program.)
-- David R. Tribble, dtribble@technologist.com --
Author: James.Kanze@dresdner-bank.com
Date: 1999/04/08
In article <370171C6.9941B463@wizard.net>,
James Kuyper <kuyper@wizard.net> wrote:
> denis bider wrote:
> >
> > so it's Standard C++ time and I've been trying to learn how to use the
> > locale functionality. It stopped at the tolower() function.
> >
> > I would expect that it is somehow possible to use the locale that is
> > installed on the system automatically. Ie, the program should, on
> > runtime, automatically detect that it's running in Slovenia rather than
> > in the UK, therefore it should use the appropriate rules for character
> > conversion. How can this be done?
> >
> > I.e., when I call the tolower() function, I would like it to
> > automatically determine whether it should use the mapping:
> >
> > ABCDEFGHIJKLMNOPQRSTUVWXYZ
> >} z
> >
> > depending on the settings of the computer it runs on.
> >
> > How do I do this?
>
> The locale named "" is defined as representing the native environment.
> How that native environment is determined is implementation-specific. On
> many unix-like systems, it's controlled by an environment variable with
> a name that matches the corresponding category, such as LC_COLLATE for
> the collate category.
Which is all fine and dandy, but the case handling in C/C++ really isn't
usable for internationalization, for several reasons:
- It supposes an upper/lower dichotomy. Regretfully, this dichotomy only
  holds for western European languages. The example, Slovenian,
  requires three cases, and most non-Roman alphabets don't have case.
- It supposes a one-to-one mapping between upper and lower. This
  fails even for the common western European languages, where German
  has a letter which only exists in lower case (and is replaced by the
  two-character sequence SS in upper case), and French frequently
  doesn't use accented characters in upper case (not really correct,
  but frequent practice).
Globally, I don't know of a good solution for handling this sort of
problem. Even locale-specific functions to convert entire strings
aren't sufficient (or rather, aren't possible), since the functions would
have to understand the semantics of the language. (E.g.: given the
French word "CREE", the lower case equivalent is either "crée" or
"créé", depending on whether the word is present tense or a past
participle.)
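
The one-to-one restriction is visible in the library interface itself. The sketch below (upper_copy is a name invented here) upper-cases a string through the ctype facet; because the facet maps each character to exactly one character, the result always has the same length as the input, so a mapping such as one lower case letter to "SS" cannot be expressed this way.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: upper-case a string with the ctype facet of the
// given locale. Each input character yields exactly one output
// character, so the length of the string never changes.
std::string upper_copy(const std::string& text, const std::locale& loc)
{
    const std::ctype<char>& ct = std::use_facet<std::ctype<char> >(loc);
    std::string result(text);
    if (!result.empty()) {
        // In-place conversion of the range [begin, end).
        ct.toupper(&result[0], &result[0] + result.size());
    }
    return result;
}
```

Whatever locale is passed in, upper_copy(s, loc).size() == s.size() holds, which is exactly why the German case cannot be handled at this level.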
--
James Kanze mailto: James.Kanze@dresdner-bank.com
Conseils en informatique orientée objet/
Beratung in objekt orientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49 (069) 63 19 86 27
Author: spam@whitehouse.gov (denis bider)
Date: 1999/03/30
Hello folks,
so it's Standard C++ time and I've been trying to learn how to use the
locale functionality. It stopped at the tolower() function.
I would expect that it is somehow possible to use the locale that is
installed on the system automatically. Ie, the program should, on
runtime, automatically detect that it's running in Slovenia rather than
in the UK, therefore it should use the appropriate rules for character
conversion. How can this be done?
I.e., when I call the tolower() function, I would like it to
automatically determine whether it should use the mapping:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
z
depending on the settings of the computer it runs on.
How do I do this?
denis
Author: kuehl@horn.fmi.uni-konstanz.de (Dietmar Kuehl)
Date: 1999/03/30
Hi,
denis bider (spam@whitehouse.gov) wrote:
: I would expect that it is somehow possible to use the locale that is
: installed on the system automatically. Ie, the program should, on
: runtime, automatically detect that it's running in Slovenia rather than
: in the UK, therefore it should use the appropriate rules for character
: conversion. How can this be done?
By default, the global locale used is the well-known "C" locale. This
is more or less necessary from the standard's point of view: otherwise
the behavior of old code would suddenly change. However, it is
easily possible to change the global locale to some specific locale.
For example, to install a German locale, you would use a call like
this:
std::locale::global(std::locale("de_DE"));
This would replace both the global C and the global C++ locale.
Whether the C locale is affected generally depends on whether the
locale used in the call to 'global()' has a name or not.
Of course there is also a problem with this statement: The name passed
to it (ie. "de_DE") is not standardized, at least not by the C++
standard. I think there is a different ISO standard on the way which
would standardize such names but I'm neither sure which standard this
would be nor whether it would have any relevance to the C++ community:
The names supported by a C++ implementation are implementation defined.
This also means that I can't tell the name of the locale you
request...
Anyway, to have the program automatically install the correct locale
without hardcoding any names of locales, you would determine the name
of the locale from an appropriately named environment variable,
something like "LC_ALL". This is then supposed to be set by the user
(or the administrator) to tell the programs about the location of the
machine or of the conventions preferred by the user.
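
A rough sketch of reading the name from the environment yourself. The lookup order (LC_ALL first, then LANG) and the helper name locale_name_from_environment are assumptions for illustration, and the value found is not guaranteed to be a name the std::locale constructor accepts.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: pick a locale name from the environment.
// Assumed order: LC_ALL, then LANG; default to "C" if neither is set.
std::string locale_name_from_environment()
{
    const char* const variables[] = { "LC_ALL", "LANG" };
    for (const char* variable : variables) {
        const char* value = std::getenv(variable);
        if (value != nullptr && *value != '\0') {
            return value;
        }
    }
    return "C";
}
```

The result would typically be passed to std::locale's constructor inside a try block, since an unrecognised name makes that constructor throw std::runtime_error.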
BTW, note that there are two versions of the function 'tolower()' in
the namespace 'std': the one inherited from C, which does not take a
'locale' object as an argument, and the one defined as a shortcut by the
C++ library, which takes a 'locale' object as an argument (in addition
to the character). For the C++ version you would pass a named
'locale' object or the result of 'locale's default constructor after
setting the global 'locale' appropriately. For better performance it is
most of the time reasonable to use the 'ctype' member directly instead
of the global C++ 'tolower()' function: the global shortcut is
basically implemented as
namespace std
{
    template <typename cT>
    cT tolower(cT c, locale const& l)
    {
        return use_facet<ctype<cT> >(l).tolower(c);
    }
}
Depending on the scheme used by 'use_facet' to look up the
facet, this can be too expensive to be used in a tight loop. And even
if 'use_facet' uses some clever approach, it would almost certainly be
more efficient to use the 'ctype' member directly.
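
Using the 'ctype' member directly might look like the sketch below, in which the facet is looked up once and then reused for every character. The function name lower_copy is made up for this example.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: lower-case a string, looking the ctype facet up
// once instead of once per character.
std::string lower_copy(const std::string& text, const std::locale& loc)
{
    // One facet lookup, hoisted out of the loop.
    const std::ctype<char>& ct = std::use_facet<std::ctype<char> >(loc);
    std::string result(text);
    for (std::string::size_type i = 0; i < result.size(); ++i) {
        result[i] = ct.tolower(result[i]);
    }
    return result;
}
```

The facet also has a range overload, ct.tolower(begin, end), which converts a whole character array in one call and removes the explicit loop.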
--
<mailto:dietmar.kuehl@claas-solutions.de>
<http://www.informatik.uni-konstanz.de/~kuehl/>
I am a realistic optimist - that's why I appear to be slightly pessimistic
Author: jcoffin@taeus.com (Jerry Coffin)
Date: 1999/03/31
In article <MPG.1169a3745048bab6989683@mensa1.uibk.ac.at>,
spam@whitehouse.gov says...
[ ... ]
> I would expect that it is somehow possible to use the locale that is
> installed on the system automatically. Ie, the program should, on
> runtime, automatically detect that it's running in Slovenia rather than
> in the UK, therefore it should use the appropriate rules for character
> conversion. How can this be done?
setlocale(LC_ALL, "");
or
std::locale::global(std::locale(""));
Setting the global locale in the C++ library automatically makes the
setlocale call with the same locale, which changes the behavior of the
parts of the library that came from C. If your implementation isn't
particularly complete and up-to-date, it may not support the latter
yet.
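
A small sketch of that coupling, with names invented for the example (CommaPunct, c_locale_after_global, make_unnamed_locale). It installs a locale globally and then queries the C library's setting. For a named locale the C locale follows; a locale built by adding a facet has no name (name() reports "*"), and what happens to the C locale in that case is left to the implementation.

```cpp
#include <clocale>
#include <locale>
#include <string>

// A facet used only to build a locale that has no name.
struct CommaPunct : std::numpunct<char> {
    char do_decimal_point() const override { return ','; }
};

// Install loc as the global C++ locale and report the C library's
// current LC_ALL setting afterwards.
std::string c_locale_after_global(const std::locale& loc)
{
    std::locale::global(loc);
    const char* current = std::setlocale(LC_ALL, nullptr);  // query only
    return current ? std::string(current) : std::string();
}

// Build an unnamed locale: classic() with one facet replaced.
std::locale make_unnamed_locale()
{
    return std::locale(std::locale::classic(), new CommaPunct);
}
```

The locale takes ownership of the facet passed to its constructor, so the new expression above does not need a matching delete.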
There are only two locales defined in the standard: the "C" locale,
which is basically US English, and the "" locale, which is (hopefully)
the default for the machine. Depending on your implementation, you
can probably plan on using various other strings to force particular
conventions. With a little luck, your implementation will accept (at
least some) three-letter country codes as defined in ISO/IEC 3166, but
I don't believe either the C or the C++ standard guarantees that.
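
Since the accepted names vary, one defensive approach is to try several spellings and keep the first that works. This is only a sketch; first_available_locale is a hypothetical helper, and any candidate names passed to it are guesses about what a given platform might accept.

```cpp
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: return the first locale in candidates that this
// implementation can construct, or the classic "C" locale if none of
// them is accepted.
std::locale first_available_locale(const std::vector<std::string>& candidates)
{
    for (const std::string& name : candidates) {
        try {
            return std::locale(name.c_str());
        } catch (const std::runtime_error&) {
            // Name not recognised here; try the next one.
        }
    }
    return std::locale::classic();
}
```

A caller might pass a list such as {"de_DE.UTF-8", "de_DE"}; those spellings are illustrative only and nothing in the standard promises that either exists.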