Thread

Topic: Class String of Standard

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/06/12

J. Kanze wrote:
...
> A call to c_str or data DOES invalidate all iterators and references
> into the string.

I found a statement that c_str() and data() invalidate references
returned by operator[]; since at() is defined in terms of operator[],
those references are also invalidated. I couldn't find a similar
statement about the iterators; where is it?

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: AllanW@my-dejanews.com
Date: 1998/06/14

> |>  > In article <6lhfm1$q74$1@nnrp1.dejanews.com>,
> |>  >   AllanW@my-dejanews.com wrote:
> |>  > > If c_string() is part of the ANSI/ISO
> |>  > > standard, then I expect that most library vendors will always keep a
> |>  > > trailing '\0' character in their internal buffers at all times.

In article <m3wwao3zw5.fsf@gabi-soft.fr>,
  kanze@gabi-soft.fr (J. Kanze) wrote:
> This is my own implementation; it does ensure that the allocated buffer
> is always at least one larger than the capacity, so that it can append
> the '\0' without reallocation, but that is simply because I can see no
> interest, on a platform with virtual memory, in trying to save this
> extra byte.
This is a minor optimization of the technique I originally spoke of.
Leaving the extra character uninitialized until c_str() is called avoids
a very few machine instructions during array allocation, in return for
those very same machine instructions whenever c_str is called (which is
presumably fairly rare).  If the c_str function is called less often than
the internal array is allocated, this is a real (but minor) win.

But did you perform any performance testing?  Or did you at least do enough
code analysis to prove that the c_str function was really called rarely, not
only in your own code, but in the code of anyone else who uses the library?

If the ratio (calls to c_str)/(number of allocations) is greater than one,
your code might actually be de-optimized to the same degree that you thought
it was optimized.  And you wouldn't know it, because (let's face it) there
is very little difference either way.

In article <6lcfss$q3v$1@duke.telepac.pt>,
  "Paulo Alexandre Madureira de Abreu" <anti_spam@mail.telepac.pt> asked
  two questions:
> 1- the function at provides checked access and returns out_of_range error,
> what exactly is this error (i.e.: what kind of char is returned?)
Already answered; but I think this demonstrates that Mr. Abreu doesn't know
a whole lot about library internals.  My assumption is that Mr. Abreu isn't
trying to implement his own library, he just wants to understand what
happens in his current library. I applaud the effort -- you can never know
too much about internals of any software you use (unless you use it to write
unsupportable code).  Ask a fairly beginner question, get a fairly beginner
answer.

> 2- does this implementation of class string use the NULL character? If not, could you
> explain to me how the c_string() code should be done?
Also already answered, perhaps more extensively than we should have.  Note
that originally I never said that the '\0' is stored at all times on all
platforms; I merely said that it would "probably" be done this way on "most"
platforms.  The point is, you can use the class as if this was true, and not
have to pay any penalties.

IMHO, you may have been picking nits when you took exception to my response.
It may not be 100% true for library implementors, but it's true enough to
answer the original question.

Such shortcuts are often needed in the real world.  The alternative would
require us, when teaching C++ as a first language, to explain operator<< in
great detail while demonstrating the common "hello world" program.  But
'great detail' would include many things that beginning students are neither
prepared nor required to know:

    * The concept of operator overloading
    * The two different meanings of the operator, depending on argument types
    * The binary number system (to explain bit shifting)
    * Quoted strings return a pointer to the first character
    * What a pointer is, and why it's ever used
    * Null-terminated char arrays, and why they're different from class string
    * Class string
    * #include statement and header files
    * The preprocessor
    * Classes istream, ostream, and iostream, and the buffer types
    * The relationship of streams to files and to "standard input/output"
    * The rules of precedence
    * The expression as a statement
    * The concept of "side effects"
    * Etc.

All of these things need to be learned eventually, but giving them to
beginning students too quickly would make perfectly intelligent people think
that they are incapable of learning to program computers.

(I have seen this type of thing in real life, first-hand, although what I saw
was even worse than the example above, because the taught concepts didn't
apply even remotely to the subject.  It was in the late 1970's, in a class on
programming in BASIC, which started with a description of magnetic core --
and yes, that was already obsolete -- and a description of the EBCDIC
character system -- which wasn't obsolete, but also wasn't used on the
ASCII HP timeshare system we had access to.  Never did explain what was then
called "USASCII", except to say it was another code similar to EBCDIC.  The
instructor then went on to describe the parts of the CPU, even making quick
mention of IBM mainframe's PPU, for Peripheral Processing Unit...  No
surprise that 2/3 of the class dropped out within a few weeks.)

-----== Posted via Deja News, The Leader in Internet Discussion ==-----
http://www.dejanews.com/   Now offering spam-free web-based newsreading


Author: kanze@gabi-soft.fr (J. Kanze)
Date: 1998/06/14

James Kuyper <kuyper@wizard.net> writes:

|>  J. Kanze wrote:
|>  ...
|>  > A call to c_str or data DOES invalidate all iterators and references
|>  > into the string.
|>
|>  I found a statement that c_str() and data() invalidate references
|>  returned by operator[]; since at() is defined in terms of operator[],
|>  those references are also invalidated. I couldn't find a similar
|>  statement about the iterators; where is it?

21.3/5:

    References, pointers, and iterators referring to the elements of a
    basic_string sequence may be invalidated by the following uses of
    that basic_string object:

    [...]

    -- Calling data() and c_str() member functions.

I believe that this text was inserted (or extensively modified) in the
FDIS, so if you are looking at CD2, it may not be the same.  (There were
significant problems in the definition of lifetime of iterators and
references in the CD2.)
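
A minimal sketch of what that wording means for user code, assuming the 1998 text quoted above: keep a position as an index rather than as an iterator across a call to c_str(), and re-acquire the iterator afterwards. The function name char_after_c_str is hypothetical, and the std::strlen call only stands in for some C function that receives the pointer.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: reads the character at `pos` (which must be less than
// s.size()) after a call to c_str(), without reusing any iterator that was
// obtained before that call.
char char_after_c_str(const std::string& s, std::size_t pos) {
    typedef std::string::difference_type diff_t;

    std::string::const_iterator it = s.begin() + static_cast<diff_t>(pos);
    diff_t index = it - s.begin();      // remember the position as a plain index

    const char* p = s.c_str();          // under 21.3/5 this may invalidate `it`
    (void)std::strlen(p);               // stand-in for passing p to a C function

    it = s.begin() + index;             // re-acquire instead of reusing the old one
    return *it;
}
```

Whether a particular library really moves the characters inside c_str() is an implementation detail; the index-based form is valid either way.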

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
              -- Beratung in objektorientierter Datenverarbeitung

Author: kanze@gabi-soft.fr (J. Kanze)
Date: 1998/06/15

AllanW@my-dejanews.com writes:

|>  > |>  > In article <6lhfm1$q74$1@nnrp1.dejanews.com>,
|>  > |>  >   AllanW@my-dejanews.com wrote:
|>  > |>  > > If c_string() is part of the ANSI/ISO
|>  > |>  > > standard, then I expect that most library vendors will always
|>  > |>  > > keep a trailing '\0' character in their internal buffers at all
|>  > |>  > > times.
|>
|>  In article <m3wwao3zw5.fsf@gabi-soft.fr>,
|>    kanze@gabi-soft.fr (J. Kanze) wrote:
|>  > This is my own implementation; it does ensure that the allocated buffer
|>  > is always at least one larger than the capacity, so that it can append
|>  > the '\0' without reallocation, but that is simply because I can see no
|>  > interest, on a platform with virtual memory, in trying to save this
|>  > extra byte.

|>  This is a minor optimization of the technique I originally spoke of.
|>  Leaving the extra character uninitialized until c_str() is called avoids
|>  a very few machine instructions during array allocation, in return for
|>  those very same machine instructions whenever c_str is called (which is
|>  presumably fairly rare).  If the c_str function is called less often than
|>  the internal array is allocated, this is a real (but minor) win.

The *optimization* was not made for performance reasons, but for
software engineering reasons -- it optimized my time in writing the
library not to have to think about the extra '\0' except in the function
c_str.

|>  But did you perform any performance testing?  Or did you at least do enough
|>  code analysis to prove that the c_str function was really called rarely,
|>  not only in your own code, but in the code of anyone else who uses the
|>  library?
|>
|>  If the ratio (calls to c_str)/(number of allocations) is greater than one,
|>  your code might actually be de-optimized to the same degree that you
|>  thought it was optimized.  And you wouldn't know it, because (let's face
|>  it) there is very little difference either way.

As you say, whatever difference there might be probably isn't
measurable.  (On the other hand, I do know that in my own code, c_str is
*very* rare.)

If I were concerned about the performance of my string class, I probably
would have adopted an organization something like that of the SGI rope
class, with non-contiguous data.  But obviously, this would depend on
exactly which operations I would like to optimize.

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
              -- Beratung in objektorientierter Datenverarbeitung

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/06/15

J. Kanze wrote:
>
> James Kuyper <kuyper@wizard.net> writes:
>
> |>  J. Kanze wrote:
> |>  ...
> |>  > A call to c_str or data DOES invalidate all iterators and references
> |>  > into the string.
> |>
> |>  I found a statement that c_str() and data() invalidate references
> |>  returned by operator[]; since at() is defined in terms of operator[],
> |>  those references are also invalidated. I couldn't find a similar
> |>  statement about the iterators; where is it?
>
> 21.3/5:
>
>     References, pointers, and iterators referring to the elements of a
>     basic_string sequence may be invalidated by the following uses of
>     that basic_string object:
>
>     [...]
>
>     -- Calling data() and c_str() member functions.
>
> I believe that this text was inserted (or extensively modified) in the
> FDIS, so if you are looking at CD2, it may not be the same.  (There were
> significant problems in the definition of lifetime of iterators and
> references in the CD2.)

There's no paragraph 5 in section 21.3 of my copy of CD2. I'll be glad
when the standard is finally approved. I can't justify paying for a copy
of anything less than the final approved standard - I'm not sure I can
justify paying for that.

Author: jkanze@otelo.ibmmail.com
Date: 1998/06/09

In article <6lhfm1$q74$1@nnrp1.dejanews.com>,
  AllanW@my-dejanews.com wrote:
> > 2- does this implementation of class string use the NULL character? If not,
> > could you explain to me how the c_string() code should be done?
>
> The draft standard I have access to doesn't list the c_string() member
> function, so I'm just guessing here.

I would presume that this is just a spelling error for c_str().

> If c_string() is part of the ANSI/ISO
> standard, then I expect that most library vendors will always keep a trailing
> '\0' character in their internal buffers at all times.

The two implementations I'm familiar with don't.  In fact, one doesn't
even maintain the string in sequential memory until c_str() or data() are
called.

Note that the string can contain a '\0', which does NOT represent the
end of the string as far as the string class is concerned.  It probably
will represent the end of the string for users of the results of c_str().
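
A small sketch of that difference, using only standard calls. The three function names are hypothetical and exist only to make the two notions of length easy to compare.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Builds a 7-character string whose fourth character is '\0'.
// The length is passed explicitly so the constructor does not stop at the '\0'.
std::string make_embedded_null() {
    return std::string("abc\0def", 7);
}

// Length as the string class sees it: every character counts.
std::size_t class_length(const std::string& s) {
    return s.size();
}

// Length as a C function sees the result of c_str(): it stops at the first '\0'.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

For the string built above, class_length reports 7 while c_length reports 3, which is the mismatch described in the paragraph.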

> The other methods are
> just too much trouble.  Even if you add the '\0' on-demand, there's the
> problem that the (presumably const?) c_string() function might end up
> altering the internal char array.

Any attempt to modify the string through the pointer returned by c_str()
or data() is undefined behavior.  The implementation can thus ignore this
case.
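
When a C function really does need to write into the characters, one defensive pattern is to hand it a private copy instead of the pointer from c_str(). The sketch below assumes a hypothetical legacy function, legacy_upcase_first, that edits its argument in place; call_legacy is likewise a made-up name.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for a legacy C function that modifies its argument.
void legacy_upcase_first(char* text) {
    text[0] = static_cast<char>(
        std::toupper(static_cast<unsigned char>(text[0])));
}

// Calls the legacy function on a copy, so the string's own storage is never
// written through the pointer returned by c_str().
std::string call_legacy(const std::string& s) {
    // Copy the characters plus the terminating '\0' into a buffer we own.
    std::vector<char> buf(s.c_str(), s.c_str() + s.size() + 1);
    legacy_upcase_first(&buf[0]);    // modifies the copy only
    return std::string(&buf[0]);     // reads up to the first '\0'
}
```

The buffer always holds at least the terminating '\0', so taking &buf[0] is safe even for an empty string.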

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
        +49 (0)69 66 45 33 10    mailto: jkanze@otelo.ibmmail.com
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
              -- Beratung in objektorientierter Datenverarbeitung


Author: AllanW@my-dejanews.com
Date: 1998/06/10

In article <6lj1j8$34d$1@nnrp1.dejanews.com>,
  James Kanze <jkanze@otelo.ibmmail.com> wrote:
>
> In article <6lhfm1$q74$1@nnrp1.dejanews.com>,
>   AllanW@my-dejanews.com wrote:
> > If c_string() is part of the ANSI/ISO
> > standard, then I expect that most library vendors will always keep a
> > trailing '\0' character in their internal buffers at all times.
>
> The two implementations I'm familiar with don't.  In fact, one doesn't
> even maintain the string in sequential memory until c_str() or data() are
> called.
>
> > The other methods are
> > just too much trouble.  Even if you add the '\0' on-demand, there's the
> > problem that the (presumably const?) c_string() function might end up
> > altering the internal char array.
>
> Any attempt to modify the string through the pointer returned by c_str()
> or data() is undefined behavior.  The implementation can thus ignore this
> case.

I wasn't referring to the possibility of the user code modifying the
string.  I meant that if the string class adds '\0' to the end of the
internal character data (incrementing the "allocated" size but not the
"in use" size), then it might have to relocate that array of data to do
so.  Depending on how iterators were implemented, this might invalidate
all of them.

Let's back up a bit.  There are three ways that I can think of to handle
strings that allow the c_str() function to work properly:

    * Store the terminating '\0' character at all times, or at least make
      sure that the internal array of characters has room for a terminating
      '\0'.  This character is not part of the string, meaning that it
      wouldn't be included in the length of the strings, iterators would
      never point to it (or it would be considered past-the-end), etc.
      This is the method I suggested that "most vendors would probably"
      adopt.

      The advantage of this approach is that the c_str function can be
      implemented simply by returning the address of the internal array of
      characters.

      The disadvantage of this approach is that every single string has
      one extra character, overhead that would not be otherwise needed.

    * Store the terminating null character on-demand.  That is, don't
      bother to store a trailing '\0' until it's needed for some (hopefully
      rare) case, such as when c_str has been called.  At that time, check
      and see if the string already has a trailing '\0' byte.  If so,
      simply return the address of the internal array of characters and
      we're done.  If not, then we must add a trailing '\0' character to
      the end of the internal array.

      The advantage of this approach is that if the calling program never
      uses function c_str, then we never have the extra character overhead.
      And if the calling function often uses c_str, then soon most strings
      will develop their trailing '\0' byte, thereby keeping the c_str
      function very quick.

      The disadvantage of this approach is that c_str may be called when
      the internal array of characters is already full.  In that case, we
      must re-allocate the internal array, just as if the user was adding
      a new character to the end of the string.  Depending on how iterators
      are implemented, this could invalidate all the iterators for the
      string.

    * Never store the trailing '\0' character as part of the string's
      internal data.  (This seems to be how both of the two implementations
      that you are familiar with work).  Instead, calling c_str() allocates
      some memory, copies the characters into this allocated memory, adds
      the trailing '\0' byte, and returns it to the caller.

      But where is the memory for this copy?

      --> A static (or global) buffer, reserved for use only by the
          c_str() function.

          This wouldn't work well, because there would be a limit on
          the string length, and because you couldn't use the c_str()
          function on two strings at once (to pass them both to an
          old C-style function, for instance).

      --> A pool of static (or global) buffers, all reserved for use
          only by the c_str() function.

          This works slightly better, because we can have more than
          one c_str() at the same time.  But how many do we need?
          5? 20? 1000? No matter what number you pick, sooner or
          later someone will break it -- for instance, by calling
          c_str() twice from within a recursive function.  Also,
          these static buffers are starting to use up a *LOT* of
          memory, even though they might never be used at all!

      --> A proxy object, created by c_str, which contains the array
          of characters including the '\0' character, and delete[]'s
          it when the proxy object is destroyed.

          The best choice yet, because there's no limit on how many
          are active at once, it doesn't use any memory until it's
          needed, and it cleans up after itself.

          However, every single call to c_str() involves copying the
          whole string into the proxy element.  c_str() will be a
          very slow function indeed.

      --> A separate allocated C-style string, stored within the string
          object itself.  Any modification to the string contents will
          delete the C-style string, if it exists.  Calling the c_str()
          function will create it if necessary, and then return the
          address.

          Possibly as good as this concept can go.  Once again, there's
          no limit on how many are active at once, it doesn't use any
          memory until it's needed, and all memory is eventually
          released.  Also, calling c_str() twice returns the same
          results (at the same address!) without performing the slow
          copy again.

          Unfortunately, the memory allocated for the copy takes up
          space until the string is changed or deleted.  If used a lot,
          this can cause most of the strings used to consume twice as
          much memory as would otherwise be needed.
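
The last variant in the list above (a separate C-style copy kept inside the string object) can be sketched roughly as follows. This is a toy class under stated assumptions, not any vendor's implementation: the name CachedCStr, its members, and the use of std::vector<char> as the unterminated storage are all hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

class CachedCStr {
public:
    explicit CachedCStr(const char* s)
        : chars_(s, s + std::strlen(s)), cache_(0) {}
    ~CachedCStr() { delete[] cache_; }

    void append(char c) {
        chars_.push_back(c);
        delete[] cache_;            // any modification deletes the C-style copy
        cache_ = 0;
    }

    const char* c_str() const {
        if (cache_ == 0) {          // build the copy only when it is needed
            cache_ = new char[chars_.size() + 1];
            std::copy(chars_.begin(), chars_.end(), cache_);
            cache_[chars_.size()] = '\0';
        }
        return cache_;              // a second call returns the same address
    }

    std::size_t size() const { return chars_.size(); }

private:
    CachedCStr(const CachedCStr&);              // copying disabled to keep the sketch short
    CachedCStr& operator=(const CachedCStr&);

    std::vector<char> chars_;       // the characters, with no trailing '\0'
    mutable char* cache_;           // mutable because the const c_str() fills it
};
```

The sketch shows both the benefit (two calls to c_str() share one copy) and the cost named above: while the cache exists, the characters are stored twice.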

My prediction (about which method "most vendors would probably use") was
not based on any poll of C++ library vendors, but only on my own
analysis and insight.  First, I looked at the other two solutions, and
concluded that they both had problems.  Next, I decided that if I was
implementing the string class myself, this is how I would have done it.
Third, (and here's the only truly dangerous bit), I allowed myself the
vanity of thinking that if I would do it this way, then many library
vendors would too. (I've never been naive enough to think that ALL
library vendors would do it my way, just some of them -- or, in this
case, "most" of them.)  In the past, such vanity has gotten me into
trouble, but not very often.  Library vendors are, on the whole,
reasonable.  And chances are, if I think that something is reasonable,
then at least a few others will think so too.  (We may all be mistaken,
of course, but that's another story.)

 * * *

So, Mr. Kanze, for the two implementations that you're familiar
with, you said that one of them doesn't even maintain the string
in sequential memory until c_str() or data() are called.  Let's
call that one the NotSeq implementation, and the other one the
IsSeq implementation.

Does the NotSeq implementation use a proxy object?  Or does it have
a C-style string pointer as part of the string object?  If neither
of these is correct, then how does it handle c_str()?

Does the IsSeq implementation add '\0' to the internal array on
demand?  If not, then how does it handle c_str()?

Do you use c_str() much in any of your programs?  If so, have
you found one of these implementations to be superior to the
other, in memory consumption and/or run-time execution speed?
Which one?  Do you know why?


Author: kanze@gabi-soft.fr (J. Kanze)
Date: 1998/06/11

AllanW@my-dejanews.com writes:

|>
|>  In article <6lj1j8$34d$1@nnrp1.dejanews.com>,
|>    James Kanze <jkanze@otelo.ibmmail.com> wrote:
|>  >
|>  > In article <6lhfm1$q74$1@nnrp1.dejanews.com>,
|>  >   AllanW@my-dejanews.com wrote:
|>  > > If c_string() is part of the ANSI/ISO
|>  > > standard, then I expect that most library vendors will always keep a
|>  > > trailing '\0' character in their internal buffers at all times.
|>  >
|>  > The two implementations I'm familiar with don't.  In fact, one doesn't
|>  > even maintain the string in sequential memory until c_str() or data() are
|>  > called.
|>  >
|>  > > The other methods are
|>  > > just too much trouble.  Even if you add the '\0' on-demand, there's the
|>  > > problem that the (presumably const?) c_string() function might end up
|>  > > altering the internal char array.
|>  >
|>  > Any attempt to modify the string through the pointer returned by c_str()
|>  > or data() is undefined behavior.  The implementation can thus ignore this
|>  > case.
|>
|>  I wasn't referring to the possibility of the user code modifying the
|>  string.  I meant that if the string class adds '\0' to the end of the
|>  internal character data (incrementing the "allocated" size but not the
|>  "in use" size), then it might have to relocate that array of data to do
|>  so.  Depending on how iterators were implemented, this might invalidate
|>  all of them.

A call to c_str or data DOES invalidate all iterators and references
into the string.

    [...]
|>      * Never store the trailing '\0' character as part of the string's
|>        internal data.  (This seems to be how both of the two implementations
|>        that you are familiar with work).  Instead, calling c_str() allocates
|>        some memory, copies the characters into this allocated memory, adds
|>        the trailing '\0' byte, and returns it to the caller.
|>
|>        But where is the memory for this copy?

    [...]
|>        --> A separate allocated C-style string, stored within the string
|>            object itself.  Any modification to the string contents will
|>            delete the C-style string, if it exists.  Calling the c_str()
|>            function will create it if necessary, and then return the
|>            address.
|>
|>            Possibly as good as this concept can go.  Once again, there's
|>            no limit on how many are active at once, it doesn't use any
|>            memory until it's needed, and all memory is eventually
|>            released.  Also, calling c_str() twice returns the same
|>            results (at the same address!) without performing the slow
|>            copy again.
|>
|>            Unfortunately, the memory allocated for the copy takes up
|>            space until the string is changed or deleted.  If used a lot,
|>            this can cause most of the strings used to consume twice as
|>            much memory as would otherwise be needed.

This is the method used in the SGI rope class (which is pretty close to
the standard string).

In practice, at least in my own code, calls to c_str (or its equivalent
in my own string class) are fairly rare.  And of course, I'm running on
a machine with virtual memory (at least 200 Mega), so the extra memory
usage is irrelevant.

|>  My prediction (about which method "most vendors would probably use") was
|>  not based on any poll of C++ library vendors, but only on my own
|>  analysis and insight.  First, I looked at the other two solutions, and
|>  concluded that they both had problems.  Next, I decided that if I was
|>  implementing the string class myself, this is how I would have done it.
|>  Third, (and here's the only truly dangerous bit), I allowed myself the
|>  vanity of thinking that if I would do it this way, then many library
|>  vendors would too. (I've never been naive enough to think that ALL
|>  library vendors would do it my way, just some of them -- or, in this
|>  case, "most" of them.)  In the past, such vanity has gotten me into
|>  trouble, but not very often.  Library vendors are, on the whole,
|>  reasonable.  And chances are, if I think that something is reasonable,
|>  then at least a few others will think so too.  (We may all be mistaken,
|>  of course, but that's another story.)
|>
|>   * * *
|>
|>  So, Mr. Kanze, for the two implementations that you're familiar
|>  with, you said that one of them doesn't even maintain the string
|>  in sequential memory until c_str() or data() are called.  Let's
|>  call that one the NotSeq implementation, and the other one the
|>  IsSeq implementation.
|>
|>  Does the NotSeq implementation use a proxy object?  Or does it have
|>  a C-style string pointer as part of the string object?  If neither
|>  of these is correct, then how does it handle c_str()?

This is the SGI rope class, see above.

|>  Does the IsSeq implementation add '\0' to the internal array on
|>  demand?  If not, then how does it handle c_str()?

This is my own implementation; it does ensure that the allocated buffer
is always at least one larger than the capacity, so that it can append
the '\0' without reallocation, but that is simply because I can see no
interest, on a platform with virtual memory, in trying to save this
extra byte.
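
One reading of the scheme described here, as a rough sketch: the buffer always has one slot beyond the capacity, and the '\0' is written only inside c_str(). The class name ReservedSlotString, its members, and the doubling growth policy are hypothetical and are not the poster's actual code.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

class ReservedSlotString {
public:
    explicit ReservedSlotString(const char* s)
        : size_(std::strlen(s)),
          capacity_(size_),
          buf_(new char[capacity_ + 1]) {   // one slot more than the capacity
        std::copy(s, s + size_, buf_);      // note: no '\0' is written here
    }
    ~ReservedSlotString() { delete[] buf_; }

    void push_back(char c) {
        if (size_ == capacity_) grow();
        buf_[size_++] = c;                  // still no '\0'
    }

    const char* c_str() const {
        buf_[size_] = '\0';                 // always fits, so no reallocation
        return buf_;
    }

    std::size_t size() const { return size_; }

private:
    ReservedSlotString(const ReservedSlotString&);             // copying disabled
    ReservedSlotString& operator=(const ReservedSlotString&);  // for brevity

    void grow() {
        std::size_t new_capacity = capacity_ == 0 ? 8 : capacity_ * 2;
        char* bigger = new char[new_capacity + 1];   // keep the spare slot
        std::copy(buf_, buf_ + size_, bigger);
        delete[] buf_;
        buf_ = bigger;
        capacity_ = new_capacity;
    }

    std::size_t size_;
    std::size_t capacity_;
    char* buf_;
};
```

Because only c_str() touches the spare slot, none of the other member functions has to know about the terminator.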

|>  Do you use c_str() much in any of your programs?  If so, have
|>  you found one of these implementations to be superior to the
|>  other, in memory consumption and/or run-time execution speed?
|>  Which one?  Do you know why?

In the past, I used the equivalent to c_str rather intensively, as I had
considerable code which was written to scan C style strings.  In more
recent code, much of this has been rewritten to use string iterators;
about the only remaining calls to c_str are for the filename argument to
fstream (or system level calls).
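
A sketch of both habits mentioned in this paragraph, with hypothetical function names: count_words walks the string with iterators instead of scanning a char* obtained from c_str(), and can_open shows c_str() used only at the stream boundary, where the 1998 library expects a const char* file name.

```cpp
#include <cctype>
#include <cstddef>
#include <fstream>
#include <string>

// Counts whitespace-separated words using string iterators.
std::size_t count_words(const std::string& text) {
    std::size_t words = 0;
    bool in_word = false;
    for (std::string::const_iterator it = text.begin(); it != text.end(); ++it) {
        bool space = std::isspace(static_cast<unsigned char>(*it)) != 0;
        if (!space && !in_word) ++words;   // first character of a new word
        in_word = !space;
    }
    return words;
}

// Reports whether a file can be opened for reading; the file name is passed
// to the stream as a C-style string.
bool can_open(const std::string& filename) {
    std::ifstream in(filename.c_str());
    return in.is_open();
}
```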

I've never done any actual measures -- my applications are NOT string
intensive, so I really don't care.  On the other hand, the SGI rope
class was written by one of the real specialists in this sort of thing,
precisely because contiguous allocation was too expensive for intensive
usage.

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
              -- Beratung in objektorientierter Datenverarbeitung

Author: "Paulo Alexandre Madureira de Abreu" <anti_spam@mail.telepac.pt>
Date: 1998/06/06

Class String of Standard

I have two doubts concerning the standard class String.

1- the function at provides checked access and returns out_of_range error,
what exactly is this error (i.e.: what kind of char is returned?)

2- does this implementation of class string use the NULL character? If not, could you
explain to me how the c_string() code should be done?

Author: "Paulo Alexandre Madureira de Abreu" <anti_spam@mail.telepac.pt>
Date: 1998/06/08

Class String of Standard

I have two doubts concerning the standard class String.

1- the function at provides checked access and returns out_of_range error,
what exactly is this error (i.e.: what kind of char is returned?)

2- does this implementation of class string use the NULL character? If not, could you
explain to me how the c_string() code should be done?

Thanks in advance,

Paulo Abreu

------------------------
my e-mail: pamaxx@mail.telepac.pt
note: take out all the x from my e-mail

Author: AllanW@my-dejanews.com
Date: 1998/06/08

In article <6lcfss$q3v$1@duke.telepac.pt>,
  "Paulo Alexandre Madureira de Abreu" <anti_spam@mail.telepac.pt> wrote:

> Class String of Standard

> 1- the function at provides checked access and returns out_of_range error,
> what exactly is this error (i.e.: what kind of char is returned?)

Member function "at" does not ever RETURN out_of_range error.  If the index is
not valid, it can THROW an out_of_range error.  This is part of a large topic
called "exception handling."

Example:
    #include <iostream>
    #include <stdexcept>
    #include <string>
    using namespace std;

    int main() {
        string s("One two three");
        //        0....5...10..
        try {
            cout << s.at(4);    // Writes 't'
            cout << s.at(9);    // Writes 'h'
            cout << s.at(10);   // Writes 'r'
            cout << s.at(6);    // Writes 'o'
            cout << s.at(5);    // Writes 'w'
            cout << s.at(15);   // THROWS EXCEPTION out_of_range
            cout << s.at(1);    // *** NEVER GETS HERE ***
        } catch (const out_of_range&) {
            // Come here if any of the at() functions threw out_of_range
            cout << " out_of_range"; // Explain what happened
        }
        cout << endl;
    }

Look up "throw" and "catch" in your C++ language manual for more information.
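
As a further small sketch of the same point, the exception can be caught close to the call and turned into a fallback value. The helper name char_or is hypothetical; only at() and std::out_of_range come from the standard library.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns s.at(index), or `fallback` when at() throws.
char char_or(const std::string& s, std::size_t index, char fallback) {
    try {
        return s.at(index);        // throws std::out_of_range if index >= s.size()
    } catch (const std::out_of_range&) {
        return fallback;           // at() returned nothing; control jumped here
    }
}
```

With the string from the example above, char_or(s, 4, '?') yields 't' and char_or(s, 15, '?') yields '?'.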

> 2- does this implementation of class string use the NULL character? If not, could you
> explain to me how the c_string() code should be done?

The draft standard I have access to doesn't list the c_string() member
function, so I'm just guessing here.  If c_string() is part of the ANSI/ISO
standard, then I expect that most library vendors will always keep a trailing
'\0' character in their internal buffers at all times.  The other methods are
just too much trouble.  Even if you add the '\0' on-demand, there's the
problem that the (presumably const?) c_string() function might end up
altering the internal char array.  Therefore, that array and the int that
stores its allocated length would have to be mutable, and it would be quite
difficult to keep them thread-safe, and so on.
