Topic: string class


Author: d96-mst@nada.kth.se (Mikael Ståldal)
Date: 1997/07/20
Suppose that I have a line of text stored in an instance of the string
class in <string>.

I want to compare the first part of the text line with another string. Is
there any easy and straightforward way to do that?

string LineOfText = "FOOBAR";
string Compare1 = "FOO";
string Compare2 = "BAR";

if (something(LineOfText,Compare1))
 cout << "This does get printed" << endl;

if (something(LineOfText,Compare2))
 cout << "This doesn't get printed" << endl;

What should something look like?
---
[ comp.std.c++ is moderated.  To submit articles: Try just posting with your
                newsreader.  If that fails, use mailto:std-c++@ncar.ucar.edu
  comp.std.c++ FAQ: http://reality.sgi.com/austern/std-c++/faq.html
  Moderation policy: http://reality.sgi.com/austern/std-c++/policy.html
  Comments? mailto:std-c++-request@ncar.ucar.edu
]





Author: James Kanze <james-albert.kanze@vx.cit.alcatel.fr>
Date: 1997/07/23
d96-mst@nada.kth.se (Mikael Ståldal) writes:

 |>  Suppose that I have a line of text stored in an instance of the string
 |>  class in <string>.
 |>
 |>  I want to compare the first part of the text line with another string. Is
 |>  there any easy and straightforward way to do that?
 |>
 |>  string LineOfText = "FOOBAR";
 |>  string Compare1 = "FOO";
 |>  string Compare2 = "BAR";
 |>
 |>  if (something(LineOfText,Compare1))
 |>   cout << "This does get printed" << endl;
 |>
 |>  if (something(LineOfText,Compare2))
 |>   cout << "This doesn't get printed" << endl;
 |>
 |>  What should something look like?

    if ( LineOfText.compare( 0 , Compare1.length() , Compare1 ) == 0 )
        cout << "String has prefix" << endl ;
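
A minimal sketch of the same test wrapped in a helper, so that the length is always taken from the prefix itself. The name has_prefix is an assumption made for illustration; it is not a member of the string class.

```cpp
#include <string>

// Hypothetical helper (not part of <string>): true when `text`
// begins with `prefix`.
bool has_prefix(const std::string& text, const std::string& prefix)
{
    // compare(pos, n, str) compares at most n characters of text,
    // starting at pos, against the whole of str. A text shorter than
    // the prefix therefore gives a non-zero result.
    return text.compare(0, prefix.length(), prefix) == 0;
}
```

With the strings from the question, has_prefix(LineOfText, Compare1) would be true and has_prefix(LineOfText, Compare2) false.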

--
James Kanze   home:   kanze@gabi-soft.fr       +33 (0)1 39 55 85 62
              office: kanze@vx.cit.alcatel.fr  +33 (0)1 69 63 14 54
GABI Software, 22 rue Jacques-Lemercier, F-78000 Versailles, France
           -- Conseils en informatique industrielle --





Author: "Bradd W. Szonye" <bradds@concentric.net>
Date: 1997/07/23
Mikael Ståldal <d96-mst@nada.kth.se> wrote in article
<qHPyzEpfh6LO092yn@nada.kth.se>...
> Suppose that I have a line of text stored in an instance of the string
> class in <string>.
>
> I want to compare the first part of the text line with another string. Is
> there any easy and straightforward way to do that?
>
> string LineOfText = "FOOBAR";
> string Compare1 = "FOO";
> string Compare2 = "BAR";
>
> if (something(LineOfText,Compare1))
>  cout << "This does get printed" << endl;
>
> if (something(LineOfText,Compare2))
>  cout << "This doesn't get printed" << endl;
>
> What should something look like?

One answer:

    bool is_prefix_equal(string const & text, string const & prefix)
    {
        return string(text, 0, prefix.length()) == prefix;
    }

Another answer:

    bool is_prefix_equal(string text, string const & prefix)
    {
        if (text.length() > prefix.length())
            text.resize(prefix.length());
        return text == prefix;
    }

I'm sure there are other tricks you can play with, for example, the
character traits, but it all comes down pretty much to considering only the
first 'prefix.length()' characters of the text. Perhaps there is a more
"elegant" answer among the standard algorithms, but the above are workable
enough.
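
One possible answer among the standard algorithms is std::equal, sketched below. It avoids the temporary string of the first version; the explicit length check is needed because std::equal itself does not know where the text ends. This is only one way to do it, not necessarily the most elegant.

```cpp
#include <algorithm>
#include <string>

bool is_prefix_equal(const std::string& text, const std::string& prefix)
{
    // The size test must come first: without it std::equal would read
    // past the end of a text that is shorter than the prefix.
    return text.size() >= prefix.size()
        && std::equal(prefix.begin(), prefix.end(), text.begin());
}
```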

This is more a programming topic than a standards topic; consider
submitting it to comp.lang.c++.moderated instead. (I'd do it myself, but
the last time I tried, my article appeared in neither newsgroup rather than
both.)
--
Bradd W. Szonye
bradds@concentric.net
http://www.concentric.net/~Bradds





Author: kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763)
Date: 1996/08/06
In article <Dvo5A9.3E0@onyx.indstate.edu> mikes@abc.se (Mikael
Ståldal) writes:

|> >|> >What's wrong with:
|> >|> > if ( s.compare( 0 , 3 , "foo" ) == 0 ) ...

|> >|> It is not good that you have to manually specify the size of the other
|> >|> string, it is error prone, tedious and non-elegant.

|> > if ( s.find( "foo" ) == 0 ) ...

|> >Solves this, at the risk of being less obvious as to what you are doing.

|> But doesn't that detect if s is "bar foo" too? That's not what I want.

The condition in the if is true if s.find returns 0.  string::find
returns the position at which it finds the string, or NPOS if it does
not find it.  If it returns 0, that means that it found "foo" at the
start of s.  If s == "bar foo", s.find will return 4, not 0.
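
A small sketch of the point, assuming the final spelling std::string::npos for the "not found" value. The helper name found_at_start is made up for illustration.

```cpp
#include <string>

// Hypothetical helper: true only when `word` occurs at position 0 of `s`.
// A match further along gives a non-zero position, and no match at all
// gives std::string::npos, so both of those cases are rejected.
bool found_at_start(const std::string& s, const std::string& word)
{
    return s.find(word) == 0;
}
```

For s == "bar foo" the call s.find("foo") yields 4, so the test fails as intended.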

|> Someone suggested rfind(), would that work?

No.  string::rfind starts at the end of the string, working back.

|> >If instead of "foo", you use a string object, then the second parameter
|> >in the compare version can be its length.

|> That's still rather clumsy IMHO.

|> >If you are really concerned about testing the first word (and not that
|> >the first three letters are "foo"), you might also try something like:

|> > if ( s.compare( 0 , s.find_first_of( " \t\n" ) , "foo" ) == 0 ) ...

|> Also clumsy.

It depends on what you want to do.  I've never had a case where I needed
to compare the first n characters, regardless of the character which
follows.  I suggested the above as an alternative because it seems more
in line with what is usually needed.

Of course, if you are extensively separating out tokens which are
separated by white space, you might want to convert the string to an
istringstream, and use that.
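
A rough sketch of the istringstream approach, pulling out only the first token. The helper name first_word is an assumption for illustration.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns the first whitespace-delimited token of
// `line`, or an empty string when the line contains no token.
std::string first_word(const std::string& line)
{
    std::istringstream in(line);
    std::string word;
    in >> word;   // operator>> skips leading whitespace, then reads one token
    return word;
}
```

The prefix question then becomes an ordinary full comparison, e.g. first_word(s) == "foo". Further tokens can be read from the same stream in a loop.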

|> It should be possible to do something like

|> if (s.compare("foo"))

|> only. This could be accomplished by changing the standard to not
|> compare the length of the strings.

So what do I do in the more frequent case where I actually want to
compare the complete string?  Calling a function that doesn't actually
compare the entire string "compare" is very misleading.  If I understand
you correctly, what you want is a function "isPrefix", or something like
that (e.g.: s.is_prefix( "foo" ), which returns true if and only if
s.size() >= 3 and the first 3 characters of s are "foo").  This seems
awfully specialized to put in a standard to me.
--
James Kanze         Tel.: (+33) 88 14 49 00        email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils, études et réalisations en logiciel orienté objet --
                -- A la recherche d'une activité dans une region francophone





Author: fjh@mundook.cs.mu.OZ.AU (Fergus Henderson)
Date: 1996/08/06
mikes@abc.se (Mikael Ståldal) writes:
>[James Kanze writes]
>> if ( s.find( "foo" ) == 0 ) ...
>
>>Solves this, at the risk of being less obvious as to what you are doing.
>
>But doesn't that detect if s is "bar foo" too? That's not what I want.
>Someone suggested rfind(), would that work?

No, string::find() returns the position at which the string was found,
so testing for == 0 ensures that only matches at the start of the string
will count.

Using string::rfind() instead of string::find() is just a
small optimization.  `s.rfind("foo", 0) == 0' will start searching at
position 0, and since that is the first position, and it is searching
in reverse, if it's not found there, it will stop immediately.
`s.find("foo") == 0' will start searching at position 0, and if
it's not found there, try positions 1, 2, ...; if found at positions
1, 2, ..., the return value won't be 0, so the search is wasted.
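
The rfind form can be wrapped up as follows. This is a sketch; the function name is invented for illustration.

```cpp
#include <string>

// Hypothetical helper built on the rfind idiom described above.
bool starts_with_rfind(const std::string& s, const std::string& prefix)
{
    // rfind(str, pos) reports the last match that begins at or before
    // pos. With pos == 0 the only candidate is index 0, so nothing
    // beyond the start of s is ever searched.
    return s.rfind(prefix, 0) == 0;
}
```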

>It should be possible to do something like
>
>if (s.compare("foo"))
>
>only. This could be accomplished by changing the standard to not
>compare the length of the strings.

That would be a very bad idea, IMHO; normally you want to check that
the lengths match.

There is some argument for

 if (s.is_prefix_of("foo")) // succeeds for s == "f", s == "fo",
     // and s == "foo".

and

 if (s.has_prefix("foo")) // succeeds for s == "foo",
     // s == "foobar", etc.

but the argument is not a strong one, since there are easy
alternatives, and in any case it's basically too late to contemplate
making those sort of changes at this point in the standardization
process.
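
As a sketch of how easy the alternatives are, both directions can be written as free functions. The names mirror the hypothetical members above; neither exists in the string class.

```cpp
#include <string>

// True when s begins with prefix: s == "foo", s == "foobar", ...
bool has_prefix(const std::string& s, const std::string& prefix)
{
    return s.compare(0, prefix.size(), prefix) == 0;
}

// True when s is itself a leading part of whole: "f", "fo", "foo"
// (and also the empty string, which is a prefix of everything).
bool is_prefix_of(const std::string& s, const std::string& whole)
{
    return has_prefix(whole, s);
}
```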

I don't see much argument for a symmetrical

 if (s.compare_ignoring_length("foo"))
     // succeeds for s == "f", s == "fo",
     // s == "foo", s == "foobar", etc.

since I find it hard to imagine when this would be useful, and again
there are easy enough work-arounds, e.g.

 bool compare_ignoring_length(const string &s1, const string &s2) {
  return s1.rfind(s2, 0) == 0 || s2.rfind(s1, 0) == 0;
 }

 if (compare_ignoring_length(s, "foo"))
     // succeeds for s == "f", s == "fo",
     // s == "foo", s == "foobar", etc.

--
Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>   |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3         |     -- the last words of T. S. Garp.







Author: fjh@mundook.cs.mu.OZ.AU (Fergus Henderson)
Date: 1996/08/07
kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763) writes:

>mikes@abc.se (Mikael Ståldal) writes:
>
>|> > if ( s.find( "foo" ) == 0 ) ...
>
>|> Someone suggested rfind(), would that work?
>
>No.  string::rfind starts at the end of the string, working back.

The suggestion was for `s.rfind("foo", 0) == 0'.
`s.rfind("foo")' would start at the end of the string,
but `s.rfind("foo", 0)' starts at the start of the string.

>If I understand
>you correctly, what you want is a function "isPrefix", or something like
>that (e.g.: s.is_prefix( "foo" ), which returns true if and only if
>s.size() >= 3 and the first 3 characters of s are "foo").  This seems
>awfully specialized to put in a standard to me.

I happen to have on hand 98000 lines of code that was written in a
language that does have such a string prefix subroutine in its standard
library.  A quick check with grep finds 8 uses of it.  That seems like
enough to make it worth standardization, don't you think?

(However, at this stage of the standardization process, it's a bit late
for even minor extensions like this.)

--
Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>   |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3         |     -- the last words of T. S. Garp.







Author: Rich Paul <linguist@cyberspy.com>
Date: 1996/08/07
Mikael Ståldal wrote:
> >       if ( s.compare( 0 , s.find_first_of( " \t\n" ) , "foo" ) == 0 ) ...
>
> Also clumsy.
>

What would be less clumsy would be

if ( Regex("^foo[ \t\n]").match(s) )
{
}

Perhaps we should start pushing for a standard Regex class.  I wrote a
Regex interpreter class, and a Regex compiler class, which together
allow this syntax.  You can switch to a "wildmat" format, or any other
format for that matter, by filling in the type of compiler you want
to use.
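
For comparison, a sketch of the same test using std::regex, assuming a compiler that provides the C++11 <regex> header (which postdates this discussion). It is not the Regex class described above, and the helper name is invented.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when s starts with "foo" followed by a
// blank, a tab or a newline.
bool first_word_is_foo(const std::string& s)
{
    // "^" anchors the match at the start of the string; the character
    // class requires one whitespace character directly after "foo".
    static const std::regex pattern("^foo[ \t\n]");
    return std::regex_search(s, pattern);
}
```

As written, the pattern rejects a string that is exactly "foo", because it insists on a following whitespace character.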

--
#include <legalbs/standarddisclaimer>
Rich Paul                |  If you like what I say, tell my
C++, OOD, OOA, OOP,      |  employer, but if you don't,
OOPs, I forgot one ...   |  don't blame them.  ;->



Author: kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763)
Date: 1996/08/08
In article <4uagmd$s21@mulga.cs.mu.OZ.AU> fjh@mundook.cs.mu.OZ.AU
(Fergus Henderson) writes:

|> I happen to have on hand 98000 lines of code that was written in a
|> language that does have such a string prefix subroutine in its standard
|> library.  A quick check with grep finds 8 uses of it.  That seems like
|> enough to make it worth standardization, don't you think?

It is a question concerning generality: I have an application of about
250 KLOC, and almost every third line uses an ASN.1 type.  Does this
make ASN.1 types worthy of standardization?  On the other hand, there is
not a single use of double (or float) in the entire application.  Maybe
we should simplify the language by removing them:-)?

Whether something belongs in the standard should depend on the number of
domains for which it would be useful.  As a second criterion, I would
suggest the ability to do the job otherwise, with similar performance.

I think that a string prefix function fails on both counts, but I really
don't know enough about all possible domains to be sure concerning the
first count.  The fact that it is easily expressed (albeit not
naturally) is also reassuring.  Perhaps what is really needed is some
way of declaring a function as syntactically a member, i.e., it uses
member function calling syntax, but has no additional access
privileges.  Although I personally think that derivation is sufficient
for this, as well.
--
James Kanze         Tel.: (+33) 88 14 49 00        email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils, études et réalisations en logiciel orienté objet --
                -- A la recherche d'une activité dans une region francophone





Author: mikes@abc.se (Mikael Ståldal)
Date: 1996/07/31
>|> If you have something stored in a class string object,

>|> string s("foo bar");

>|> and want to check if the beginning of s is equal to another string
>|> ("foo"), how are you supposed to do that? Does

>|> if (s.compare("foo") == 0) cout << "the first word of s is 'foo'";

>|> work? From what I have understood from the draft standard, it never
>|> returns 0 unless the two strings are of the same length. Am I wrong?

>What's wrong with:
> if ( s.compare( 0 , 3 , "foo" ) == 0 ) ...

It is not good that you have to manually specify the size of the other
string, it is error prone, tedious and non-elegant.





Author: mikes@abc.se (Mikael Ståldal)
Date: 1996/07/31
>|> BTW, why hasn't the string class facilities for case insensitive
>|> comparison?

>I'm not sure that they weren't.  Could the comparison function be locale
>dependent?  If not, making it so would handle the case insensitivity in
>an elegant fashion.

Yes, that would be useful.

>C locale is for C, which is case sensitive.  And of course, anything
>defined as case insensitive must be locale dependent. In French, for
>example, all accented characters must compare equal to the unaccented
>variant,

The problem is that many compiler vendors are lazy about implementing
different locales. It would be useful if the standard required at
least one case-insensitive locale; it would only have to support
English (maybe the "Pascal" locale?). I'm working on implementing
case-insensitive protocols like SMTP (argh! why are those protocols
not case sensitive???).





Author: fjh@mundook.cs.mu.OZ.AU (Fergus Henderson)
Date: 1996/08/01
mikes@abc.se (Mikael Ståldal) writes:

>>|> If you have something stored in a class string object,
>
>>|> string s("foo bar");
>
>>|> and want to check if the beginning of s is equal to another string
>>|> ("foo"), how are you supposed to do that?
[...]
>>What's wrong with:
>> if ( s.compare( 0 , 3 , "foo" ) == 0 ) ...
>
>It is not good that you have to manually specify the size of the other
>string, it is error prone, tedious and non-elegant.

OK, how about

 if (s.rfind("foo", 0) == 0) ...

instead?

--
Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>   |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3         |     -- the last words of T. S. Garp.





Author: kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763)
Date: 1996/08/01
In article <DvEy7u.7MB@onyx.indstate.edu> mikes@abc.se (Mikael
Ståldal) writes:

|> >|> If you have something stored in a class string object,

|> >|> string s("foo bar");

|> >|> and want to check if the beginning of s is equal to another string
|> >|> ("foo"), how are you supposed to do that? Does

|> >|> if (s.compare("foo") == 0) cout << "the first word of s is 'foo'";

|> >|> work? From what I have understood from the draft standard, it never
|> >|> returns 0 unless the two strings are of the same length. Am I wrong?

|> >What's wrong with:
|> > if ( s.compare( 0 , 3 , "foo" ) == 0 ) ...

|> It is not good that you have to manually specify the size of the other
|> string, it is error prone, tedious and non-elegant.

 if ( s.find( "foo" ) == 0 ) ...

Solves this, at the risk of being less obvious as to what you are doing.
If instead of "foo", you use a string object, then the second parameter
in the compare version can be its length.  It could also be 'strlen(
"foo" )', of course, or even 'sizeof( "foo" ) - 1'.  (In these cases, of
course, you should declare "foo" as a static object beforehand, rather
than repeating the literal string twice.)
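
A sketch of the 'sizeof( "foo" ) - 1' variant with the literal declared once as a static object. The names kFoo, kFooLen and starts_with_foo are invented for illustration.

```cpp
#include <cstddef>
#include <string>

// The literal is written once. sizeof counts the terminating '\0',
// hence the - 1 to get the number of characters.
static const char kFoo[] = "foo";
static const std::size_t kFooLen = sizeof(kFoo) - 1;

bool starts_with_foo(const std::string& s)
{
    // compare(pos, n, const char*) compares at most n characters of s
    // against the whole C string.
    return s.compare(0, kFooLen, kFoo) == 0;
}
```

Unlike strlen, the sizeof form is a compile-time constant, but it only works on the array itself, not on a pointer to it.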

If you are really concerned about testing the first word (and not that
the first three letters are "foo"), you might also try something like:

 if ( s.compare( 0 , s.find_first_of( " \t\n" ) , "foo" ) == 0 ) ...
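
The first-word test can be sketched as a helper like this (the name first_word_is is an assumption). Note the comparison against 0, since compare returns 0 on equality.

```cpp
#include <string>

// Hypothetical helper: true when the text of s up to the first blank,
// tab or newline is exactly `word`.
bool first_word_is(const std::string& s, const std::string& word)
{
    // find_first_of returns npos when s contains no whitespace;
    // compare() clamps the count to the rest of the string, so in
    // that case the whole of s is compared.
    return s.compare(0, s.find_first_of(" \t\n"), word) == 0;
}
```

Leading whitespace is not skipped here: for s == " foo" the first word is taken to be empty.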
--
James Kanze         Tel.: (+33) 88 14 49 00        email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils, études et réalisations en logiciel orienté objet --
                -- A la recherche d'une activité dans une region francophone





Author: herbs@cntc.com (Herb Sutter)
Date: 1996/08/01
>>|> BTW, why hasn't the string class facilities for case insensitive
>>|> comparison?
>
>>I'm not sure that they weren't.  Could the comparison function be locale
>>dependent?  If not, making it so would handle the case insensitivity in
>>an elegant fashion.
>
>Yes, that would be useful.

Would it?  I often want to compare the same two strings in different
ways: sometimes case is important, sometimes it's not.  Making it
locale-dependent doesn't help me much, since that only changes global
program behaviour.  Likewise, supplying a case-insensitive
string_char_traits::compare() would fix a particular string object's
case-sensitivity, whereas I want to compare the same objects
differently at different times.

Is there any standard function for case-sensitive and -insensitive
comparisons for the same string objects?
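
Absent a standard function, one workaround is a free function beside the ordinary operator==, so the same two string objects can be compared either way at different times. The sketch below is an assumption-laden illustration: the name equal_ignoring_case is invented, and it folds case with std::tolower, which follows the current C locale and handles single-byte characters only.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: case-insensitive equality for single-byte text.
bool equal_ignoring_case(const std::string& a, const std::string& b)
{
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        // Cast to unsigned char first: passing a negative char value
        // to tolower is undefined behaviour.
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    }
    return true;
}
```

Case-sensitive comparison of the same objects remains a == b.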


Herb Sutter (herbs@cntc.com)

Current Network Technologies Corp.
3100 Ridgeway, Suite 42, Mississauga ON Canada L5L 5M5
Tel 416-805-9088  Fax 905-608-2611





Author: kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763)
Date: 1996/08/01
In article <DvEy0y.7Ft@onyx.indstate.edu> mikes@abc.se (Mikael
Ståldal) writes:

|> >|> BTW, why hasn't the string class facilities for case insensitive
|> >|> comparison?

|> >I'm not sure that they weren't.  Could the comparison function be locale
|> >dependent?  If not, making it so would handle the case insensitivity in
|> >an elegant fashion.

|> Yes, that would be useful.

|> >C locale is for C, which is case sensitive.  And of course, anything
|> >defined as case insensitive must be locale dependent. In French, for
|> >example, all accented characters must compare equal to the unaccented
|> >variant,

|> The problem is that many compiler vendors are lazy implementing
|> different locales.

I've not found this to be the case with Sun.  I have no experience with
other platforms, but given the amount of PC software that is localized,
I would imagine that PC compilers all support locales pretty well, too.

(I am speaking about the present situation here, of course.  It did take
them a long time to get around to supporting localization.)

|> It would be useful if the standard enforces at
|> least one case-insensitive locale, it does only have to support
|> english (maybe the "Pascal" locale?).

Sun has provisions for "installing" new locales, although I don't know
how difficult it really is.  A well designed locale mechanism should be
able to handle this without too much difficulty.  I'm not sure, but the
Sun mechanism may involve writing and installing some DLL's, which isn't
as easy as it could be.

|> I'm working with implementing
|> case-insensitive protocols like SMTP (argh! why are those protocols
|> not case sensitive???).

There are many reasons, I suspect.  It certainly facilitates things when
you are using telnet for testing:-).  In this regard, I sort of like the
idea of an SMTP locale, although this is not really what locales were
intended for.

If you check with Plauger's book on the C standard library, his
implementation allows installing a new locale from within the program.
This is probably the best solution for SMTP, and should be supported by
all C++ implementations by creating a private specialization of the
locale standard class.  Most implementations will probably not support
this at present, though, or at least not well.

When compilers finally do start conforming to the standard (which,
of course, won't happen until there is a standard), you should be able
to create an SMTP locale for the parsing parts and imbue the socket
streams with it without changing the global locale used for displaying
user messages, etc.  In fact, if I understand correctly, you should be
able to modify the imbue'd locale on the fly, say using SMTP for the
header, "C" as the default for the body, and switching to other locales
if specified by mime.  (In fact, I'm not totally sure it is that easy.
I seem to remember that only parts of the header are case insensitive.)
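
A minimal sketch of imbuing a single stream, assuming the interface as it was eventually standardized. No SMTP locale exists in the standard library, so std::locale::classic() stands in for it here, and a string stream stands in for the socket stream; the helper name is invented.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: parses an integer using a stream imbued with
// the classic "C" locale. The global locale is neither consulted nor
// changed by this stream.
bool parse_int_classic(const std::string& text, int& value)
{
    std::istringstream in(text);
    in.imbue(std::locale::classic());   // affects this stream only
    return !(in >> value).fail();
}
```

Calling imbue again on the same stream switches its locale on the fly.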

(Not a standards issue, but if your problem is parsing Internet standard
text format headers, I would try and get a regular expression machine
from somewhere.  In which case, the case sensitivity question becomes
moot: just specify "^[Tt][Oo][ \t]*:", etc.)
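
The pattern above can be tried with std::regex, assuming a C++11 <regex> implementation (not available when this was written). The helper name is_to_header is invented for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: recognizes a "To:" header line regardless of
// the case of the two letters, allowing blanks or tabs before the colon.
bool is_to_header(const std::string& line)
{
    static const std::regex pattern("^[Tt][Oo][ \t]*:");
    return std::regex_search(line, pattern);
}
```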
--
James Kanze         Tel.: (+33) 88 14 49 00        email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils, études et réalisations en logiciel orienté objet --
                -- A la recherche d'une activité dans une region francophone





Author: kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763)
Date: 1996/08/02
In article <3200d574.221055561@news.interlog.com> herbs@cntc.com (Herb
Sutter) writes:

|> >>|> BTW, why hasn't the string class facilities for case insensitive
|> >>|> comparison?
|> >
|> >>I'm not sure that they weren't.  Could the comparison function be locale
|> >>dependent?  If not, making it so would handle the case insensitivity in
|> >>an elegant fashion.
|> >
|> >Yes, that would be useful.

|> Would it?  I often want to compare the same two strings in different
|> ways: sometimes case is important, sometimes it's not.  Making it
|> locale-dependent doesn't help me much, since that only changes global
|> program behaviour.

But the very notion of case is locale dependent.  It is impossible to
have a case insensitive comparison except for a specific locale.

And of course, you can change the locale as often as you wish.  With
streams (but not strings), the stream itself can be imbued with a
locale; different streams will parse/display input/output differently,
independently of the global locale.

|> Likewise, supplying a case-insensitive
|> string_char_traits::compare() would fix a particular string object's
|> case-sensitivity, whereas I want to compare the same objects
|> differently at different times.

|> Is there any standard function for case-sensitive and -insensitive
|> comparisons for the same string objects?

Yes and no.  You can always use the C comparison functions on the
results of c_str.  In this case: strcmp and memcmp compare the internal
representation, strcoll compares them as strings of characters (which
may or may not be case insensitive, according to locale, but will
normally be case insensitive for locales using the Roman alphabet).
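
A sketch of the two C-level comparisons applied to the results of c_str. The wrapper names are invented. Whether strcoll treats case as significant depends entirely on the collation rules of the locale in effect; in the default "C" locale it orders strings the same way strcmp does.

```cpp
#include <cstring>
#include <string>

// Byte-wise ordering of the internal representation.
int compare_bytes(const std::string& a, const std::string& b)
{
    return std::strcmp(a.c_str(), b.c_str());
}

// Ordering according to the LC_COLLATE category of the current C locale.
int compare_collated(const std::string& a, const std::string& b)
{
    return std::strcoll(a.c_str(), b.c_str());
}
```

Both functions stop at the first '\0', so strings with embedded null characters are only compared up to that point.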

If I understand the intent correctly, the abstraction underlying the
string class is not an array of machine level bytes (which would be,
say, vector< char >), but a sequence of characters or a string of text.
In this case, only a locale sensitive comparison really makes sense.
(This begs the question of what to do if one of the strings is in
Czech and the other in French.  I think that such cases would
generally require the use of wstring.)
--
James Kanze         Tel.: (+33) 88 14 49 00        email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils, études et réalisations en logiciel orienté objet --
                -- A la recherche d'une activité dans une region francophone





Author: sean@delta.com (Sean L. Palmer)
Date: 1996/08/05
> |> BTW, why hasn't the string class facilities for case insensitive
> |> comparison?
>
> I'm not sure that they weren't.  Could the comparison function be locale
> dependent?  If not, making it so would handle the case insensitivity in
> an elegant fashion.  C locale is for C, which is case sensitive.  And of
> course, anything defined as case insensitive must be locale dependent.
> In French, for example, all accented characters must compare equal to
> the unaccented variant, and in German, the "scharfes-ess" ('ß', if you
> can read ISO 8859-1) compares equal to the two character sequence "SS".
>
> While waiting, you can get the same effect, although less elegantly,
> with: "strcoll( s1.c_str() , s2.c_str() )".

I think they intend to do it using a different underlying character type
(one that has a compare function which is case-insensitive) for the
basic_string<> template.

I have written an istring class using this technique.

The general technique it uses is to uppercase each character as it
enters the string, by any means, but I'm sure a version could be made
that doesn't modify the characters (although for my purposes it was more
efficient to do the locale conversion just once)
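
A sketch of the non-modifying variant: instead of uppercasing characters as they enter the string, the comparison functions of a custom traits class fold case on the fly. This is not the istring class described above, only an illustration written against the traits interface as finally standardized (std::char_traits). The names ci_char_traits and fold are invented, and the folding relies on std::toupper in the current C locale.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical traits class: stores characters unchanged but compares
// them without regard to case.
struct ci_char_traits : std::char_traits<char> {
    static char fold(char c)
    {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return fold(a) == fold(b); }
    static bool lt(char a, char b) { return fold(a) < fold(b); }
    static int compare(const char* a, const char* b, std::size_t n)
    {
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(a[i], b[i])) return -1;
            if (lt(b[i], a[i])) return 1;
        }
        return 0;
    }
    static const char* find(const char* s, std::size_t n, char c)
    {
        for (std::size_t i = 0; i < n; ++i)
            if (eq(s[i], c)) return s + i;
        return 0;
    }
};

typedef std::basic_string<char, ci_char_traits> istring;
```

The compatibility problem mentioned below shows up here too: istring and std::string are unrelated types, so mixing them needs explicit conversion, for example through c_str().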

I think that there should be a more elegant way to accomplish this than
the method I had to use. I ran into problems with compatibility between
string types and character/case-insensitive-character types.

I think the committee ought to give a few glances at this problem and
see if it can be made any easier. (for what that's worth)






Author: mikes@abc.se (Mikael Ståldal)
Date: 1996/08/05
>|> >What's wrong with:
>|> > if ( s.compare( 0 , 3 , "foo" ) == 0 ) ...

>|> It is not good that you have to manually specify the size of the other
>|> string, it is error prone, tedious and non-elegant.

> if ( s.find( "foo" ) == 0 ) ...

>Solves this, at the risk of being less obvious as to what you are doing.

But doesn't that detect if s is "bar foo" too? That's not what I want.
Someone suggested rfind(), would that work?

>If instead of "foo", you use a string object, then the second parameter
>in the compare version can be its length.

That's still rather clumsy IMHO.

>If you are really concerned about testing the first word (and not that
>the first three letters are "foo"), you might also try something like:

> if ( s.compare( 0 , s.find_first_of( " \t\n" ) , "foo" ) == 0 ) ...

Also clumsy.

It should be possible to do something like

if (s.compare("foo") == 0)

only. This could be accomplished by changing the standard to not
compare the length of the strings.
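
For reference, a minimal sketch of the prefix test discussed in this
thread, wrapped so that the length does not have to be spelled out at each
call. The helper names starts_with and starts_with_rfind are hypothetical
and are not part of the draft standard's string class.

```cpp
#include <string>

// Hypothetical helper: true when `s` begins with `prefix`.
// compare(pos, n, str) clamps n to the characters actually available in
// `s`, so a prefix longer than `s` simply compares unequal.
bool starts_with(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Variant using rfind: with a start position of 0 the search can only
// match at index 0, so nothing beyond the prefix is scanned.
bool starts_with_rfind(const std::string& s, const std::string& prefix) {
    return s.rfind(prefix, 0) == 0;
}
```

Note that s.find("foo") == 0 is false for "bar foo", because find returns 4
there; the drawback of find is only that it scans the whole string before
reporting a miss.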





Author: John Hancock <jhancock+@cs.cmu.edu>
Date: 1996/08/05
Raw View
From this thread, does this mean a string class is being added
to the C++ standard? If so, is the string class found off the
STL page in "mstring.h" going to be the standard or will there be
something else (and is there a free implementation of it anywhere)?

Thanks,

John




--
--------------------------------------------------------------------------
     John A. Hancock, Robotics Institute, Carnegie Mellon University
        jhancock@ri.cmu.edu, http://www.ius.cs.cmu.edu/~jhancock/
"Life is short, but long enough to get what's coming to you." - John Alton





Author: mikes@abc.se (Mikael Ståldal)
Date: 1996/07/24
Raw View
If you have something stored in a class string object,

string s("foo bar");

and want to check if the beginning of s is equal to another string
("foo"), how are you supposed to do that? Does

if (s.compare("foo") == 0) cout << "the first word of s is 'foo'";

work? From what I have understood from the draft standard, it never
returns 0 unless the two strings are of the same length. Am I wrong?

Do you have to do stringstream, substr or some other fancy stuff to
accomplish this simple task? Then the string class has a design flaw.

BTW, why doesn't the string class have facilities for case-insensitive
comparison?









Author: kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763)
Date: 1996/07/25
Raw View
In article <Dv1uDF.CyG@onyx.indstate.edu> mikes@abc.se (Mikael Ståldal)
writes:

|> BTW, why doesn't the string class have facilities for case-insensitive
|> comparison?

I'm not sure that it doesn't have them.  Could the comparison function be
locale dependent?  If not, making it so would handle the case insensitivity
in an elegant fashion.  C locale is for C, which is case sensitive.  And of
course, anything defined as case insensitive must be locale dependent.
In French, for example, all accented characters must compare equal to
the unaccented variant, and in German, the "scharfes-ess" ('ß', if you
can read ISO 8859-1) compares equal to the two character sequence "SS".

While waiting, you can get the same effect, although less elegantly,
with: "strcoll( s1.c_str() , s2.c_str() )".
--
James Kanze         Tel.: (+33) 88 14 49 00        email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils, études et réalisations en logiciel orienté objet --
                -- A la recherche d'une activité dans une region francophone





Author: kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763)
Date: 1996/07/25
Raw View
In article <Dv1uDF.CyG@onyx.indstate.edu> mikes@abc.se (Mikael Ståldal)
writes:

|> If you have something stored in a class string object,

|> string s("foo bar");

|> and want to check if the beginning of s is equal to another string
|> ("foo"), how are you supposed to do that? Does

|> if (s.compare("foo") == 0) cout << "the first word of s is 'foo'";

|> work? From what I have understood from the draft standard, it never
|> returns 0 unless the two strings are of the same length. Am I wrong?

What's wrong with:
 if ( s.compare( 0 , 3 , "foo" ) == 0 ) ...
--
James Kanze         Tel.: (+33) 88 14 49 00        email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils, études et réalisations en logiciel orienté objet --
                -- A la recherche d'une activité dans une region francophone





Author: alhy@courant.SLAC.Stanford.EDU (J. Scott Berg)
Date: Wed, 22 Mar 1995 19:52:52 GMT
Raw View
In article <Pine.ULT.3.91.950322165341.28586A-100000@rowan>,
Darren Gunter  <drg@coventry.ac.uk> wrote:
> I have a string class with the following functions
>
> string(char); - constructor
> string &operator=(const string &); - assignment
> string &operator=(int); - changes string length
>
> if I have a string called temp and use
>
> temp = 'h';
>
> the function which changes the string length is called instead. Can
> anyone help me???

The problem is that char is an integral type, and the implicit
promotion to int has higher priority than the user-defined conversion
through the string constructor (ARM 13.2).
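
A minimal sketch that reproduces the effect, assuming a class shaped like
the one in the question. The class name my_string and the members data and
last_called are hypothetical; last_called exists only to record which
overload ran.

```cpp
#include <cstddef>
#include <string>

// Hypothetical reconstruction of the class from the question.
class my_string {
public:
    my_string(char c) : data(1, c), last_called("constructor") {}
    my_string& operator=(const my_string& other) {
        data = other.data;
        last_called = "operator=(const my_string&)";
        return *this;
    }
    my_string& operator=(int n) {  // changes the string length
        data.resize(static_cast<std::size_t>(n));
        last_called = "operator=(int)";
        return *this;
    }
    std::string data;
    const char* last_called;
};

// Returns the name of the overload selected for `temp = 'h'`.
std::string which_overload() {
    my_string temp('x');
    // char -> int is an integral promotion; char -> my_string needs a
    // user-defined conversion, so operator=(int) is the better match.
    temp = 'h';
    return temp.last_called;
}
```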


--
J. Scott Berg       Real mail: Varian Physics; Stanford  CA  94305-4060
email: ALHY@slac.stanford.edu
phone: (415) 926-4732 (w)  (415) 854-2713 (h)




Author: matt@physics2.berkeley.edu (Matt Austern)
Date: 23 Mar 1995 01:37:47 GMT
Raw View
In article <D5uxw5.Hq7@unixhub.SLAC.Stanford.EDU> alhy@courant.SLAC.Stanford.EDU (J. Scott Berg) writes:

> > string(char); - constructor
> > string &operator=(const string &); - assignment
> > string &operator=(int); - changes string length
> >
> > if I have a string called temp and use
> >
> > temp = 'h';
> >
> > the function which changes the string length is called instead. Can
> > anyone help me???
>
> The problem is that char is an integral type, and the implicit
> promotion to int has higher priority than the user-defined conversion
> through the string constructor (ARM 13.2).

The real problem is that the original poster's design abuses operator
overloading: instead of operator=(int), the resize operation should
simply be something like string::resize(int).  After all: is this
really an assignment operation in any meaningful sense?

Operator overloading is useful, on occasion, but it ought to be used
sparingly.  This sort of problem is what you get when you overuse
user-defined conversions, user-defined binary operators, and function
overloading.  Most function calls should simply be explicit function
calls.
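
A minimal sketch of that suggestion, assuming the resize operation becomes
a named member function. The class name my_string and the member data are
hypothetical. With operator=(int) gone, the only viable assignment goes
through the string(char) constructor.

```cpp
#include <cstddef>
#include <string>

// Hypothetical class: resizing is an explicit, named operation.
class my_string {
public:
    my_string(char c) : data(1, c) {}
    void resize(int n) { data.resize(static_cast<std::size_t>(n)); }
    std::string data;
};

// Assigning a char now builds a temporary my_string via my_string(char)
// and copy-assigns it, because no operator=(int) competes for the call.
std::string assign_char() {
    my_string temp('x');
    temp = 'h';
    return temp.data;
}
```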

Followups redirected to comp.lang.c++, since this is a discussion of
C++ programming technique rather than a discussion of the C++
standard.

--

                               --matt




Author: fmarin@dino.conicit.ve (Felix Marin)
Date: 23 Mar 1995 18:01:43 -0400
Raw View
Barry Margolin (barmar@nic.near.net) wrote:
: In article <Pine.ULT.3.91.950322165341.28586A-100000@rowan> Darren Gunter <drg@coventry.ac.uk> writes:
: >I have a string class with the following functions

: >string(char); - constructor
: >string &operator=(const string &); - assignment
: >string &operator=(int); - changes string length

: >if I have a string called temp and use

: >temp = 'h';

: >the function which changes the string length is called instead. Can
: >anyone help me???

: According to Section 13.2, pp.318-319 of the ARM, when doing argument
: matching in overloaded functions, sequences of conversions that contain
: only integral promotions (such as promoting char to int) are better than
: those involving user-defined conversions (e.g. calling string(char) to
: convert 'h' to a string).

: Your example is similar to the one on p.325 of the ARM.

: (My page numbers are from the May, 1992 reprinting.)

: --
: Barry Margolin
: BBN Planet Corporation, Cambridge, MA
: barmar@bbnplanet.com


An alternative solution is to avoid the constructor string(char) and
replace it by another one, i.e. string(const char *). In this case, there
is no problem with the statement s="h" instead of s='h'.




Author: klitos@picard.datastream.co.uk (Klitos Kyriacou)
Date: 23 Mar 1995 14:33:27 GMT
Raw View
In article <Pine.ULT.3.91.950322165341.28586A-100000@rowan>, Darren Gunter <drg@coventry.ac.uk> writes:
>I have a string class with the following functions
>
>string(char); - constructor
>string &operator=(const string &); - assignment
>string &operator=(int); - changes string length
>
>if I have a string called temp and use
>
>temp = 'h';
>
>the function which changes the string length is called instead. Can
>anyone help me???
Your constructor initialises a string that does not yet exist.
In the absence of operator =(int), the statement temp = 'h'
would first construct a temporary, unnamed, string object and
then use the assignment operator to assign the temporary string
to the string named 'temp'.  So this is an indirect, two-step
process (you can't use the constructor to 're-construct' the
existing object 'temp').  Because it is an indirect process,
C++ will use a more direct process if one is available.  The
definition of operator =(int) provides such a process: it is
regarded as more direct to convert a char to an int and then call
operator =(int).  (Perhaps someone else can tell us the rules
that C++ uses to decide how to process the statement temp = 'h',
which to me seems ambiguous.)  You can still force the compiler
to do what you originally wanted by modifying your statement to
be: temp = string('h');.
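
A minimal sketch of that workaround, assuming a class shaped like the one
in the question. The class name my_string and the member data are
hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical reconstruction of the class from the question.
class my_string {
public:
    my_string(char c) : data(1, c) {}
    my_string& operator=(const my_string& other) {
        data = other.data;
        return *this;
    }
    my_string& operator=(int n) {  // changes the string length
        data.resize(static_cast<std::size_t>(n));
        return *this;
    }
    std::string data;
};

// The explicit conversion makes the right-hand side a my_string, so
// operator=(const my_string&) is an exact match and operator=(int) is
// no longer viable.
std::string assign_explicit() {
    my_string temp('x');
    temp = my_string('h');
    return temp.data;
}
```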

Regards,
Klitos.


________________________________________________________________________
Klitos Kyriacou
daytime: kkyriacou@datastream.co.uk  (Datastream Intl Ltd, London, U.K.)
evening: ubac3pi@dcs.bbk.ac.uk  (Birkbeck College, University of London)




Author: Darren Gunter <drg@coventry.ac.uk>
Date: Wed, 22 Mar 1995 16:56:47 +0000
Raw View
I have a string class with the following functions

string(char); - constructor
string &operator=(const string &); - assignment
string &operator=(int); - changes string length

if I have a string called temp and use

temp = 'h';

the function which changes the string length is called instead. Can
anyone help me???




Author: barmar@nic.near.net (Barry Margolin)
Date: 22 Mar 1995 15:36:29 -0500
Raw View
In article <Pine.ULT.3.91.950322165341.28586A-100000@rowan> Darren Gunter <drg@coventry.ac.uk> writes:
>I have a string class with the following functions

>string(char); - constructor
>string &operator=(const string &); - assignment
>string &operator=(int); - changes string length

>if I have a string called temp and use

>temp = 'h';

>the function which changes the string length is called instead. Can
>anyone help me???

According to Section 13.2, pp.318-319 of the ARM, when doing argument
matching in overloaded functions, sequences of conversions that contain
only integral promotions (such as promoting char to int) are better than
those involving user-defined conversions (e.g. calling string(char) to
convert 'h' to a string).

Your example is similar to the one on p.325 of the ARM.

(My page numbers are from the May, 1992 reprinting.)

--
Barry Margolin
BBN Planet Corporation, Cambridge, MA
barmar@bbnplanet.com