Topic: String literals


Author: d96-mst@nada.kth.se (Mikael Steldal)
Date: 1997/05/20
In article <5l13m8$hsm@pasilla.bbnplanet.com>,
Barry Margolin <barmar@bbnplanet.com> wrote:

>>  An ordinary string literal has type "array of n const char"
>>  and static storage duration.

Does that mean that

char* foo = "foo";

is an error? Good!
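
A minimal sketch of what the quoted wording implies, assuming a C++11 or later compiler (`static_assert` and `<type_traits>` are not part of the 1997 draft). As far as I know, the published C++98 standard kept the literal-to-`char*` conversion as a deprecated feature and C++11 removed it, so binding to `const char*` is the portable form. The helper name `literal_length` is hypothetical.

```cpp
#include <cstddef>
#include <type_traits>

// A string literal is an lvalue of type "array of n const char",
// so decltype yields a reference to that array type (n counts the '\0').
static_assert(std::is_same<decltype("foo"), const char (&)[4]>::value,
              "\"foo\" has type const char[4]");

// Hypothetical helper: number of characters before the terminator.
std::size_t literal_length(const char* s) {
    std::size_t n = 0;
    while (s[n] != '\0') ++n;
    return n;
}

// Portable form: pointer to const.
const char* foo = "foo";
// char* bad = "foo";   // deprecated in C++98, ill-formed since C++11
```

Many compilers still accept the commented-out line with a warning, so the diagnostic you see depends on the implementation and its flags.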
---
[ comp.std.c++ is moderated.  To submit articles: try just posting with      ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu         ]
[ FAQ:      http://reality.sgi.com/employees/austern_mti/std-c++/faq.html    ]
[ Policy:   http://reality.sgi.com/employees/austern_mti/std-c++/policy.html ]
[ Comments? mailto:std-c++-request@ncar.ucar.edu                             ]





Author: fjh@mundook.cs.mu.OZ.AU (Fergus Henderson)
Date: 1997/05/15
Steve Clamage <stephen.clamage@eng.sun.com> writes:

>Fergus Henderson wrote:
>>
>> According to the Nov 96 WP (which I think is the same as CD2)
>> you _can_ get equal.
>>
>>         int x, y;
>>         assert(&x != &y);       // according to Nov 96 WP, this might fail
>
>According to the way I read 5.10 "Equality operators", pointers
>to two complete objects of the same type that are not part of the
>same array MUST compare unequal.

Please quote chapter and verse.  Your copy of 5.10 must be different
to the one I'm reading.  In my copy, 5.10 has only two paragraphs.
The first paragraph says that equality operators behave the same
as the relational operators (for which the behaviour in cases like
this is unspecified).  The second paragraph deals only with
pointers to members.

--
Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>   |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3         |     -- the last words of T. S. Garp.





Author: James Kanze <james-albert.kanze@vx.cit.alcatel.fr>
Date: 1997/05/16
Oleg Zabluda <zabluda@math.psu.edu> writes:

|>  If I were to formulate the appropriate portion of the DWP, I would say
|>  that string literals are _NOT_ char arrays. They have all the properties
|>  of a static const char array with 3 (AFAIK) exceptions.
|>
|>  1. They may have overlapped memory locations.
|>  2. There is a (deprecated) conversion "string literal" -> char*
|>  3. You can use them to initialize char arrays.

Actually, I'd be tempted to say that they aren't char arrays, but
string literals:-).  It just happens that char arrays and string
literals have the same type, and a certain number of similarities in
their semantics.

But this is a more general discussion.  (I think I originally heard it
as "is a union a class".)  When defining something (a string literal)
which is similar to something else (a char array), you can either say
that the string literal is a special form of char array, with the
following exceptions, or you can say that it is something different,
which shares the following characteristics.  In general, I prefer the
second approach (but the draft standard says that unions are classes),
although in the case of string literals, I think that the similarities
with char arrays are so overwhelming that the first approach is truly
justified in this case.

--
James Kanze      home:     kanze@gabi-soft.fr        +33 (0)1 39 55 85 62
                 office:   kanze@vx.cit.alcatel.fr   +33 (0)1 69 63 14 54
GABI Software, Sarl., 22 rue Jacques-Lemercier, F-78000 Versailles France
     -- Conseils en informatique industrielle --





Author: Julian Pardoe <pardoej@lonnds.ml.com>
Date: 1997/05/16
Fergus Henderson wrote:
> According to the Nov 96 WP (which I think is the same as CD2)
> you _can_ get equal.
>
>         int x, y;
>         assert(&x != &y);       // according to Nov 96 WP, this might fail

In this case we have a serious problem!  Consider

    Foo *x = new Foo (1);
    Foo *y = new Foo (2);

There must be zillions of lines of code that assume that x != y.
It's the whole basis for using an object's address as a surrogate
for its identity.  (It's also part of the basis for C disallowing
zero-sized objects.)
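
A sketch of the idiom described above, assuming the guarantee under discussion (two live, distinct objects have distinct addresses). `Foo` here is a stand-in type, and `distinct_identities` is a hypothetical name.

```cpp
#include <set>

struct Foo {
    explicit Foo(int v) : value(v) {}
    int value;
};

// Uses each object's address as a surrogate for its identity.
// Returns true when the two live objects are seen as distinct.
bool distinct_identities() {
    Foo* x = new Foo(1);
    Foo* y = new Foo(2);

    std::set<const void*> seen;   // std::set orders its keys with std::less
    seen.insert(x);
    seen.insert(y);
    bool distinct = (x != y) && (seen.size() == 2);

    delete x;
    delete y;
    return distinct;
}
```

If an implementation were allowed to make `x == y` here, the set would collapse the two entries into one, which is the kind of breakage the post is worried about.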

And what about the null pointer?

    Foo *x = new Foo (1);
    Foo *y = NULL;

May x != y fail in this case too, or is the null pointer a special
case?  (Well, I guess it is, since it is guaranteed to be unequal
to any other pointer, and this rule may override the other.
Or maybe it is "different" (in some extra-standard way), but
there's no reliable way to test this in a valid program.
The implications are bizarre!)

-- jP --





Author: Steve Clamage <stephen.clamage@eng.sun.com>
Date: 1997/05/16
Fergus Henderson wrote:
>
> Steve Clamage <stephen.clamage@eng.sun.com> writes:
>
> >Fergus Henderson wrote:
> >>
> >> According to the Nov 96 WP (which I think is the same as CD2)
> >> you _can_ get equal.
> >>
> >>         int x, y;
> >>         assert(&x != &y);       // according to Nov 96 WP, this might fail
> >
> >According to the way I read 5.10 "Equality operators", pointers
> >to two complete objects of the same type that are not part of the
> >same array MUST compare unequal.
>
> Please quote chapter and verse.  Your copy of 5.10 must be different
> to the one I'm reading.  In my copy, 5.10 has only two paragraphs.
> The first paragraph says that equality operators behave the same
> as the relational operators (for which the behaviour in cases like
> this is unspecified).  The second paragraph deals only with
> pointers to members.

Oops. You are correct. The draft does NOT say that unrelated
complete objects must have addresses that compare unequal.

I thought the C standard made some guarantees about different
objects having different addresses, but it too does not.

--
Steve Clamage, stephen.clamage@eng.sun.com





Author: James Kanze <james-albert.kanze@vx.cit.alcatel.fr>
Date: 1997/05/17
Oleg Zabluda <zabluda@math.psu.edu> writes:

|>  James Kanze <james-albert.kanze@vx.cit.alcatel.fr> wrote:
|>  : Oleg Zabluda <zabluda@math.psu.edu> writes:
|>
|>  : |>  Ok, let me try to play a language lawyer for a second. I think there
|>  : |>  is an internal contradiction in the C++ DWP related to string literals.
|>  : |>
|>  : |>  [lex.string] says:
|>  : |>
|>  : |>    An ordinary string literal has type "array of n const char"
|>  : |>    and static storage duration.
|>  : |>
|>  : |>  And later:
|>  : |>
|>  : |>    Whether  all  string  literals  are  distinct  (that is, are
|>  : |>    stored in nonoverlapping objects)  is  implementation-defined.
|>  : |>    The effect of attempting to modify a string literal is undefined.
|>  : |>
|>  : |>
|>  : |>  I don't see how the first one can be reconciled with the second one.
|>  : |>  C++ arrays are objects, right? So they can't have overlapped
|>  : |>  memory locations.
|>
|>  : The quoted text says that the type of a string literal is "array of
|>  : const char".  It doesn't say anything about whether two identical string
|>  : literals are the same object or not, or whether one is a sub-object of
|>  : the other.  I don't see the problem.
|>
|>  OK, maybe with identical strings you can argue that it's actually the
|>  same object. But if strings are different, they are different objects.

Not necessarily.  I'm not even strictly sure that they are objects,
although the fact that I can have an lvalue expression which refers to
them suggests very strongly that they are (as does the fact that they
have storage duration).

|>  Fergus Henderson showed that there is no requirement that different
|>  objects occupy non-overlapping memory locations. And even the
|>  definition of 'complete objects' is very unclear.

Correct.  Strictly speaking, all the standard describes is a set of
conforming programs, and the possible observable behaviors of those
programs.  How the implementation goes about making this happen is its
business.

|>  Still, static const char arrays are not allowed to occupy overlapping
|>  memory locations.

Can you show me where in the standard it says this?  What the draft does
do is define the semantics of a variable declared "static char const[]",
and the (slightly different) semantics of string literals.  In practice,
the semantics of the declared variable are such that an implementation
cannot, in the general case, let them overlap.  (In fact, of course, it
can, if it can prove that letting them overlap will have no effect on
the observable behavior of the program.)  The semantics of string
literals are expressly defined in such a way as to allow this overlap;
a program which counts on their being different is non-conforming, by
definition.

|>  String literals are allowed to. In my view, this
|>  means that string literals are not const char arrays with static
|>  storage duration. It's something else, although very similar. There
|>  is also this weird (for a const char array) standard implicit type
|>  conversion "Hello" -> char*.

1. String literals are not named variables.  The semantic
characteristics which prevent const char arrays from overlapping come
from the fact that they are named variables, not from the fact that they
are const char arrays.

2. I don't think that the "standard implicit type conversion" is in the
standard.  It may be in many compilers in order to support older code
(much like most compilers allow binding a temporary to a non-const
reference).

|>  : |>  It become even more unclear if you consider string literals,
|>  : |>  partially overlapping in the memory:
|>  : |>  "Hello, World" and "World". Or even crazier:
|>  : |>
|>  : |>  "Hello, World", "World", and "World\0x00, Hello".
|>
|>  : What isn't unclear?  Could you provide an example?
|>
|>  This is what I had in mind:
|>
|>      "Hello, World" <----------------- str1
|>        ^    "World\0, Hello" <---------str2
|>        |       ^        ^
|>        |       |        |
|>        p       q        r
|>
|>  Here string literals str1 and str2 are presumed to _always_ be put
|>  into the overlapping memory locations by the compiler, as indicated.
|>  DWP allows the compiler to give this guarantee. Now,
|>  p and q always point into the same object ("Hello, World").
|>  q and r always point into the same object too ("World\0, Hello").
|>  Yet, p and r _never_ point into the same object. I see it as
|>  breaking the transitivity rule:

So "pointing into the same object" is not transitive.  But there is no
C++ operator for this, and it isn't a mathematical concept I'm familiar
with either.  Obviously, if you define arbitrary relationship operators,
they may or may not be transitive:-).

I see your point, however.  This particular relationship IS transitive
for all other pointers in C++, so it is somewhat surprising that it is
not transitive for string literals.  I would prefer to say that it is
not defined for string literals (much like > for arbitrary pointers).

|>  p < q == true, q < r == true,  p < r can be false.

The last operation is undefined.  Transitivity fails not on the operator
<, but on whether the operator is defined or not.

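For code that needs an ordering over pointers that may refer to unrelated objects, the standard library's `std::less` is, in the published standard as I understand it, specified to yield a total order even where the built-in `<` is unspecified. A small sketch under that assumption; `ordered_within_array` and `exactly_one_precedes` are hypothetical names.

```cpp
#include <functional>

// Built-in < is well-defined for pointers into the same array.
bool ordered_within_array() {
    static const char text[] = "Hello, World";
    const char* p = text;        // 'H'
    const char* q = text + 7;    // 'W'
    const char* r = text + 12;   // terminating '\0'
    return p < q && q < r && p < r;   // transitive here
}

// For pointers into different objects, std::less still gives a
// consistent answer: for distinct addresses exactly one direction holds.
bool exactly_one_precedes(const char* a, const char* b) {
    std::less<const char*> before;
    return before(a, b) != before(b, a);
}
```

Which of the two directions holds for unrelated objects is still up to the implementation; only the consistency is promised.
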
--
James Kanze      home:     kanze@gabi-soft.fr        +33 (0)1 39 55 85 62
                 office:   kanze@vx.cit.alcatel.fr   +33 (0)1 69 63 14 54
GABI Software, Sarl., 22 rue Jacques-Lemercier, F-78000 Versailles France
     -- Conseils en informatique industrielle --





Author: fjh@mundook.cs.mu.OZ.AU (Fergus Henderson)
Date: 1997/05/17
Steve Clamage <stephen.clamage@eng.sun.com> writes:

>> >Fergus Henderson wrote:
>> >>
>> >> According to the Nov 96 WP (which I think is the same as CD2)
>> >> you _can_ get equal.
>> >>
>> >>         int x, y;
>> >>         assert(&x != &y);       // according to Nov 96 WP, this might fail
>> >
>> >According to the way I read 5.10 "Equality operators", pointers
>> >to two complete objects of the same type that are not part of the
>> >same array MUST compare unequal.
...
>Oops. You are correct. The draft does NOT say that unrelated
>complete objects must have addresses that compare unequal.
>
>I thought the C standard made some guarantees about different
>objects having different addresses, but it too does not.

Nope, the C standard _does_ make that guarantee.  (Sorry, don't have it
on hand to quote right now, but I checked it the other day, so I'm
quite sure of it.)

The failure of the Nov 96 WP to make the same guarantee for C++
is an incompatibility with C.

Basically this is without question a mistake in the C++ DWP
(dating back to the ARM).  If an implementation were to really
behave in this way, a huge amount of code would break.

The obvious fix is to just take the corresponding sentence in the
C standard and cut and paste it into the C++ draft.

--
Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>   |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3         |     -- the last words of T. S. Garp.





Author: James Kanze <james-albert.kanze@vx.cit.alcatel.fr>
Date: 1997/05/17
jpotter@falcon.lhup.edu (John Potter) writes:

|>  I assume you are referring to the following from 5.9/2
|>
|>    --If two pointers p and q of the same type point to different
|>      objects that are not members of the same object or elements of the
|>      same array or to different functions, or if only one of them is
|>      null, the results of p<q, p>q, p<=q, and p>=q are unspecified.
|>
|>  And 5.10/1 extends this to == and != while 5.10/2 goes to great length
|>  to assure that pointers to member are comparable to null and never
|>  compare equal unless they actually point to the same member of the
|>  same type (function) or complete object (data) or are both null.
|>
|>  It does seem to be a mistake and even inconsistent.  There are no
|>  guidelines for reasonable "unspecified" behavior here.  An
|>  implementation could define all six operators to return false or true
|>  when the pointers are into different objects.  There could be some
|>  reason of which I am ignorant for this and, if not, it is not likely a
|>  simple editorial change to fix it.  I guess that we will have to
|>  depend upon the sanity of a reasonable implementation not the
|>  standard.

There is a very definite reason for it: on a segmented architecture, an
implementation can arrange for all pointers into the same object to have
the same segment, and only compare the offsets.

--
James Kanze      home:     kanze@gabi-soft.fr        +33 (0)1 39 55 85 62
                 office:   kanze@vx.cit.alcatel.fr   +33 (0)1 69 63 14 54
GABI Software, Sarl., 22 rue Jacques-Lemercier, F-78000 Versailles France
     -- Conseils en informatique industrielle --





Author: jpotter@falcon.lhup.edu (John Potter)
Date: 1997/05/18
On 17 May 97 17:29:53 GMT, James Kanze wrote:

: jpotter@falcon.lhup.edu (John Potter) writes:

: |>  I assume you are referring to the following from 5.9/2
: |>
: |>    --If two pointers p and q of the same type point to different
: |>      objects that are not members of the same object or elements of the
: |>      same array or to different functions, or if only one of them is
: |>      null, the results of p<q, p>q, p<=q, and p>=q are unspecified.
: |>
: |>  And 5.10/1 extends this to == and != while 5.10/2 goes to great length
: |>  to assure that pointers to member are comparable to null and never
: |>  compare equal unless they actually point to the same member of the
: |>  same type (function) or complete object (data) or are both null.
: |>
: |>  It does seem to be a mistake and even inconsistent.  There are no
: |>  guidelines for reasonable "unspecified" behavior here.  An
: |>  implementation could define all six operators to return false or true
: |>  when the pointers are into different objects.  There could be some
: |>  reason of which I am ignorant for this and, if not, it is not likely a
: |>  simple editorial change to fix it.  I guess that we will have to
: |>  depend upon the sanity of a reasonable implementation not the
: |>  standard.

: There is a very definite reason for it: on a segmented architecture, an
: implementation can arrange for all pointers into the same object to have
: the same segment, and only compare the offsets.

Of course, we have the Intel 16-bit model.  With BC++, if I arrange for
two pointers into two different objects to have the same offset, I get
 <  false
 <= true
 >= true
 >  false
 == false
 != true
which is a mathematical absurdity, but a very sane implementation.
The 5.9 operators give meaningless results and the 5.10 operators give
meaningful results.  The point is that with the current 5.10/1 the
results of == and != could be reversed giving mathematical consistency
and an insane implementation.  In fact, they could both return true
without violating 5.10/1.  BC++ is usable because they have a sane
implementation not because the draft requires it.  I think that this
is serious.

The other side of the story is that I can arrange (via the
implementation) to obtain two different pointers to the same object
and get for instance
 <  true
 <= true
 >= false
 >  false
 == false
 != true
which is mathematically consistent and a sane implementation since the
values of the two pointers are not the same.  However, when
dereferencing the two pointers does not yield the same object, I find
it surprising to someone who understands the underlying memory model.
I find this amusing not serious.
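
The two result tables above can be reproduced with a simplified model of a 16-bit segmented pointer. This is only a sketch: `FarPtr`, `lt`, `le`, `eq` and `physical` are hypothetical names, and nothing here claims to be how BC++ actually implements its comparisons. It assumes relational operators look at the offset alone while equality compares the whole segment:offset pair.

```cpp
#include <cstdint>

// Hypothetical model of a 16-bit real-mode far pointer.
struct FarPtr {
    std::uint16_t seg;
    std::uint16_t off;
};

// Relational comparisons look only at the offset, which is enough
// when both pointers are into the same object (same segment).
bool lt(FarPtr a, FarPtr b) { return a.off <  b.off; }
bool le(FarPtr a, FarPtr b) { return a.off <= b.off; }

// Equality compares the full segment:offset representation.
bool eq(FarPtr a, FarPtr b) { return a.seg == b.seg && a.off == b.off; }

// Physical address in real mode: segment * 16 + offset.
std::uint32_t physical(FarPtr p) {
    return static_cast<std::uint32_t>(p.seg) * 16u + p.off;
}
```

With `{0x1000, 0x0020}` and `{0x2000, 0x0020}` (different objects, same offset), `lt` is false both ways, `le` is true both ways, and `eq` is false, matching the first table. With `{0x1000, 0x0000}` and `{0x0FFF, 0x0010}`, both map to the same physical address, yet `lt` is true and `eq` is false, matching the second.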

John





Author: Oleg Zabluda <zabluda@math.psu.edu>
Date: 1997/05/10
Ok, let me try to play a language lawyer for a second. I think there
is an internal contradiction in the C++ DWP related to string literals.

[lex.string] says:

  An ordinary string literal has type "array of n const char"
  and static storage duration.

And later:

  Whether  all  string  literals  are  distinct  (that is, are
  stored in nonoverlapping objects)  is  implementation-defined.
  The effect of attempting to modify a string literal is undefined.


I don't see how the first one can be reconciled with the second one.
C++ arrays are objects, right? So they can't have overlapped
memory locations. Even if you make a special exception for
arrays that come from string literals, the operators ==, <, <=
and so on, for 'const char *' cease to be transitive. Pointers of the
same type, pointing to the same memory location are guaranteed to
compare equal (at least after conversion to void*), and the result
of applying operators <, <=, ... to the pointers pointing inside
the same container is guaranteed to produce meaningful results
(3.9.2 par 3,4; 5.9.2 par 2), yet the result of a comparison of
pointers pointing into different arrays is undefined (5.9.2 par 2).
This breaks transitivity for all of the comparison operators.

It becomes even more unclear if you consider string literals,
partially overlapping in the memory:
"Hello, World" and "World". Or even crazier:

"Hello, World", "World", and "World\0x00, Hello".


Another thing. I think that the warning that "The effect of
attempting to modify a string literal is undefined" is not needed,
since an attempt to modify any constant with static storage duration
is already either prohibited or undefined. There are three ways to
modify an object that I know of. One is assignment (prohibited),
another is casting away const (undefined), and the third one is
in-place construction (undefined: 3.9 par. 9).
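
A sketch of the distinction, assuming only standard C++: initializing a `char` array from a literal copies the characters into a new, modifiable object, whereas writing to the literal itself is either rejected by the compiler or undefined. `capitalized_copy` is a hypothetical name.

```cpp
#include <string>

std::string capitalized_copy() {
    char buf[] = "xy";          // buf is a separate char[3], copied from the literal
    buf[0] = 'X';               // fine: modifies the copy
    // const char* lit = "xy";
    // lit[0] = 'X';                        // error: assignment to const
    // const_cast<char*>(lit)[0] = 'X';     // compiles, but undefined behaviour
    return std::string(buf);
}
```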

Oleg.
--
Life is a sexually transmitted, 100% lethal disease.





Author: fjh@mundook.cs.mu.OZ.AU (Fergus Henderson)
Date: 1997/05/11
Oleg Zabluda <zabluda@math.psu.edu> writes:

>Ok, let me try to play a language lawyer for a second. I think there
>is an internal contradiction in the C++ DWP related to string literals.
>
>[lex.string] says:
>
>  An ordinary string literal has type "array of n const char"
>  and static storage duration.
>
>And later:
>
>  Whether  all  string  literals  are  distinct  (that is, are
>  stored in nonoverlapping objects)  is  implementation-defined.
>  The effect of attempting to modify a string literal is undefined.
>
>I don't see how the first one can be reconciled with the second one.
>C++ arrays are objects, right?

Right.  But not necessarily distinct objects.

>So they can't have overlapped memory locations.

How does that follow?

The C++ draft does not prohibit objects from having overlapping
memory locations, as far as I can tell.

>Even if you make a special exception for
>arrays that come from string literals, the operators ==, <, <=
>and so on, for 'const char *' cease to be transitive.

Those operators are already not transitive for any pointer type, in the
general case, because the results of comparisons between objects that
are not sub-objects of the same complete object are undefined.  (If you
stick to the well-defined cases, then the operators are transitive.)

>the result of a comparison of
>pointers pointing into different arrays is undefined (5.9.2 par 2).

Yep.

For `<' and `<=', that is intended, but for `==', that looks to me
like a serious mistake in the draft.

>Another thing. I think that the warning that "The effect of
>attempting to modify a string literal is undefined" is not needed.

True, but a small amount of redundancy is not a bad thing.

--
Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>   |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3         |     -- the last words of T. S. Garp.





Author: Oleg Zabluda <zabluda@math.psu.edu>
Date: 1997/05/11
Fergus Henderson <fjh@mundook.cs.mu.OZ.AU> wrote:
: Oleg Zabluda <zabluda@math.psu.edu> writes:

: >Ok, let me try to play a language lawyer for a second. I think there
: >is an internal contradiction in the C++ DWP related to string literals.
: >
: >[lex.string] says:
: >
: >  An ordinary string literal has type "array of n const char"
: >  and static storage duration.
: >
: >And later:
: >
: >  Whether  all  string  literals  are  distinct  (that is, are
: >  stored in nonoverlapping objects)  is  implementation-defined.
: >  The effect of attempting to modify a string literal is undefined.
: >
: >I don't see how the first one can be reconciled with the second one.
: >C++ arrays are objects, right?

: Right.  But not necessarily distinct objects.

: >So they can't have overlapped memory locations.

: How does that follow?

: The C++ draft does not prohibit objects from having overlapping
: memory locations, as far as I can tell.

Yeah. I tried to find something in the standard that says
otherwise, and failed. I am still not sure whether it can somehow be
deduced indirectly, at least for complete objects (1.7 par 2).

If it is allowed, say for const objects with static storage duration
and trivial constructor and destructor, I think DWP must explicitly
say so. DWP also requires different objects to have different
addresses. Are there any exceptions for const objects with
static storage duration in the appropriate section (which one is
it, BTW?).

In short, string literals don't look like any other arrays I know.
For example, which of the following arrays can overlap?

const char a[] = "xy";
const char b[] = "xy";

const char* const c = "xy";
const char* const d = "xy";

Obviously a and b can't (??), while the string literals used to
initialize c and d can. This means that either string literals
are not really arrays, or there are two different 'arrays of char'
with two different sets of rules. Here is another example of that:

char* e = a;    // illegal, a is a const char[3]
char* f = "xy"; // legal, although deprecated. "xy" is also
                // a const char[3]
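
The contrast between the two pairs of declarations can be checked directly. A sketch, assuming the guarantee that distinct named objects have distinct addresses (which the thread argues the draft ought to state); `named_arrays_distinct` and `literal_contents_match` are hypothetical names.

```cpp
#include <cstring>

const char a[] = "xy";
const char b[] = "xy";

const char* const c = "xy";
const char* const d = "xy";

// a and b are distinct named objects, so their addresses differ.
bool named_arrays_distinct() {
    return &a[0] != &b[0];
}

// c and d may or may not point at the same storage; only the
// contents are guaranteed to match.
bool literal_contents_match() {
    return std::strcmp(c, d) == 0;
}
```

Whether `c == d` holds is left to the implementation, so portable code should not branch on it.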


: >Even if you make a special exception for
: >arrays that come from string literals, the operators ==, <, <=
: >and so on, for 'const char *' cease to be transitive.

: Those operators are already not transitive for any pointer type, in the
: general case, because the results of comparisons between objects that
: are not sub-objects of the same complete object are undefined.  (If you
: stick to the well-defined cases, then the operators are transitive.)

Can you give one more example when a < b is guaranteed to be true,
b < c is guaranteed to be true, but a < c is undefined?

Oleg.
--
Life is a sexually transmitted, 100% lethal disease.
---
[ comp.std.c++ is moderated.  To submit articles: Try just posting with your
                newsreader.  If that fails, use mailto:std-c++@ncar.ucar.edu
  comp.std.c++ FAQ: http://reality.sgi.com/austern/std-c++/faq.html
  Moderation policy: http://reality.sgi.com/austern/std-c++/policy.html
  Comments? mailto:std-c++-request@ncar.ucar.edu
]





Author: jpotter@falcon.lhup.edu (John Potter)
Date: 1997/05/11
On 11 May 97 11:28:35 GMT, Fergus Henderson wrote:

: Oleg Zabluda <zabluda@math.psu.edu> writes:


: >the result of a comparison of
: >pointers pointing into different arrays is undefined (5.9.2 par 2).

: Yep.

: For `<' and `<=', that is intended, but for `==', that looks to me
: like a serious mistake in the draft.

Are we looking at the same draft?  I can't find "undefined" in 5.9 or
5.10.  It is unspecified.  I sometimes get less and sometimes greater
for comparisons not in same object for the same code on different
implementations.  But never equal, nor a core dump.

John





Author: jpotter@falcon.lhup.edu (John E. Potter)
Date: 1997/05/11
Oleg Zabluda (zabluda@math.psu.edu) wrote:
: Ok, let me try to play a language lawyer for a second. I think there
: is an internal contradiction in the C++ DWP related to string literals.

: [lex.string] says:

:   An ordinary string literal has type "array of n const char"
:   and static storage duration.

: And later:

:   Whether  all  string  literals  are  distinct  (that is, are
:   stored in nonoverlapping objects)  is  implementation-defined.

Here is a simple test on this part.

#include <iostream>
int main () {
 char const* p = "Hello";
 char const* q = "Hello";
 // Equal addresses mean the compiler stored both literals in one object.
 std::cout << "Literals are " << ( p == q ? "" : "not ") << "pooled.\n";
 return 0;
 }

I tried four compilers and got two of each.  They all conform.  I once
got burned by assuming that they are pooled as in most assemblers.
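
Since pooling is not guaranteed either way, code that wants to know whether two strings hold the same text should compare characters rather than addresses. A minimal sketch; `same_text` is a hypothetical name.

```cpp
#include <cstring>

// Compares the characters, not the addresses, so the result does
// not depend on whether the compiler pools identical literals.
bool same_text(const char* lhs, const char* rhs) {
    return std::strcmp(lhs, rhs) == 0;
}
```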

I had trouble following the rest of your references.  Is 3.9.2 par 3
3.9 bullet 2 par 3, 3.9.2 bullet 3, or something else?  Are you using
CD2?

John





Author: "Paul D. DeRocco" <strip_these_words_pderocco@ix.netcom.com>
Date: 1997/05/11
Oleg Zabluda wrote:

> [lex.string] says:
>
>   An ordinary string literal has type "array of n const char"
>   and static storage duration.
>
> And later:
>
>   Whether  all  string  literals  are  distinct  (that is, are
>   stored in nonoverlapping objects)  is  implementation-defined.
>   The effect of attempting to modify a string literal is undefined.
>
> I don't see how the first one can be reconciled with the second one.
> C++ arrays are objects, right? So they can't have overlapped
> memory locations. Even if you make a special exception for
> arrays that come from string literals, the operators ==, <, <=
> and so on, for 'const char *' cease to be transitive. Pointers of the
> same type, pointing to the same memory location are guaranteed to
> compare equal (at least after conversion to void*), and the result
> of applying operators <, <=, ... to the pointers pointing inside
> the same container is guaranteed to produce meaningful results
> (3.9.2 par 3,4; 5.9.2 par 2), yet the result of a comparison of
> pointers pointing into different arrays is undefined (5.9.2 par 2).
> This breaks transitivity for all of the comparison operators.
>
> It becomes even more unclear if you consider string literals,
> partially overlapping in the memory:
> "Hello, World" and "World". Or even crazier:
>
> "Hello, World", "World", and "World\0x00, Hello".

Even after all that, I still don't see where the problem is. Is it that
if "World" is really part of "Hello, World", then the two pointers can
be compared with < or >, while you would expect (because they are
supposedly separate arrays) that the result should be "undefined"? An
undefined result doesn't mean there _is_ no result, it just means that
the standard doesn't say what it should be. Or is it that you expect
"abc" == "abc" to be false because they're two separate arrays? What the
above clause is saying is simply that they might not be two
arrays--they're allowed to be the same array. It all works fine in
practice, as far as I can see.
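
A minimal sketch of the distinction being drawn here, assuming nothing beyond the standard C library. The helper names same_text and same_storage are invented for illustration. The first compares characters and gives the same answer everywhere; the second compares addresses, so for two literals with the same spelling its answer depends on whether the implementation pools them.

```cpp
#include <cstring>

// Hypothetical helper: compares the characters the pointers refer to.
// The answer does not depend on how the compiler lays out literals.
bool same_text(const char* lhs, const char* rhs) {
    return std::strcmp(lhs, rhs) == 0;
}

// Hypothetical helper: compares only the addresses. Called with two
// separately written "abc" literals, the answer may be true or false
// depending on whether the implementation stores them once or twice.
bool same_storage(const char* lhs, const char* rhs) {
    return lhs == rhs;
}
```

Code that cares about the contents of a string should call something like same_text; a result from same_storage is only a fact about one particular compiler and its options.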

> Another thing. I think that the warning that "The effect of
> attempting to modify a string literal is undefined" is not needed.
> Since an attempt to modify any constant with static storage duration
> is already either prohibited or undefined. There are three ways to
> modify an object, I know of. One is assignment (prohibited),
> another is cast away const (undefined), and the third one is
> in-place construction (undefined: 3.9 par. 9).

I agree.

--

Ciao,
Paul

(Please remove the "strip_these_words_" prefix from the return
address, which has been altered to foil junk mail senders.)
---





Author: Barry Margolin <barmar@bbnplanet.com>
Date: 1997/05/11
Raw View
In article <5ku2c9$ruu@r02n01.cac.psu.edu>,
Oleg Zabluda  <zabluda@math.psu.edu> wrote:
>[lex.string] says:

>  An ordinary string literal has type "array of n const char"
>  and static storage duration.

>And later:

>  Whether  all  string  literals  are  distinct  (that is, are
>  stored in nonoverlapping objects)  is  implementation-defined.
>  The effect of attempting to modify a string literal is undefined.
>
>
>I don't see how the first one can be reconciled with the second one.

I don't see the contradiction.  The first describes the type and storage
class of string literals, and the second describes the relationship between
multiple string literals.

>C++ arrays are objects, right? So they can't have overlapped
>memory locations. Even if you make a special exception for
>arrays that come from string literals, the operators ==, <, <=

The second statement makes precisely that exception.

>and so on, for 'const char *' cease to be transitive. Pointers of the
>same type, pointing to the same memory location are guaranteed to
>compare equal (at least after conversion to void*), and the result
>of applying operators <, <=, ... to the pointers pointing inside
>the same container is guaranteed to produce meaningful results
>(3.9.2 par 3,4; 5.9.2 par 2), yet the result of a comparison of
>pointers pointing into different arrays is undefined (5.9.2 par 2).
>This breaks transitivity for all of the comparison operators.

I don't see how the above statement invalidates any of that.  The
comparison operators still behave as expected when pointing inside the same
container.  Since you don't know whether two string literals overlap, a
portable program should not use a < or > comparison between pointers into
different string literals.
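
If an ordering between such pointers is needed anyway, one option (a sketch, not something proposed in this thread) is std::less from <functional>, which the library specifies as yielding a total order on pointers even where the built-in < does not. The helper name ordered_before is invented for illustration.

```cpp
#include <functional>

// Hypothetical helper: orders two pointers even when they point into
// unrelated arrays. std::less for pointer types is specified to give a
// strict total order, which the built-in < does not promise.
bool ordered_before(const char* lhs, const char* rhs) {
    return std::less<const char*>()(lhs, rhs);
}
```

The order obtained this way is consistent within one run of one program, but which of two unrelated literals comes first still says nothing portable about how they are laid out.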

>It becomes even more unclear if you consider string literals,
>partially overlapping in the memory:
>"Hello, World" and "World". Or even crazier:
>
>"Hello, World", "World", and "World\0x00, Hello".

I'm not sure what the problem is here.  Suppose you have

char *a = "Hello, World";
char *b = "World";
char *c = "World";

The quote from the DWP says that the values of (b == a+7) and (b == c) are
implementation-defined.  The value of (a < b) is undefined, since the
string literals may be different objects.
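
The two equality tests can be probed directly. The following is a sketch only: the struct and function names are invented, and const char* is used instead of char* so that it compiles with compilers that reject the non-const conversion. Whatever it reports describes one compiler and one set of options, nothing more.

```cpp
#include <cstring>

// Hypothetical record of what one particular build did with the literals.
struct LiteralLayout {
    bool tail_shared;        // b == a + 7: "World" stored as the tail of "Hello, World"
    bool duplicates_shared;  // b == c: the two "World" literals stored once
    bool tail_text_matches;  // content comparison, independent of layout
};

LiteralLayout probe_literals() {
    const char* a = "Hello, World";
    const char* b = "World";
    const char* c = "World";

    LiteralLayout layout;
    layout.tail_shared = (b == a + 7);
    layout.duplicates_shared = (b == c);
    // Comparing characters instead of addresses is always well defined.
    layout.tail_text_matches = (std::strcmp(a + 7, b) == 0);
    return layout;
}
```

Only tail_text_matches has a value that can be predicted from the source text; the other two fields are exactly the results the draft leaves open.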

>Another thing. I think that the warning that "The effect of
>attempting to modify a string literal is undefined" is not needed.

True, it's redundant.  Presumably left over from C, which didn't define
literals as being declared const char, but specified separately that you're
still not allowed to modify them.
--
Barry Margolin
BBN Corporation, Cambridge, MA
barmar@bbnplanet.com
(BBN customers, call (800) 632-7638 option 1 for support)
---





Author: Oleg Zabluda <zabluda@math.psu.edu>
Date: 1997/05/12
Raw View
John E. Potter <jpotter@falcon.lhup.edu> wrote:
: I had trouble following the rest of your references.  Is 3.9.2 par 3
: 3.9 bullet 2 par 3, 3.9.2 bullet 3, or something else?  Are you using
: CD2?

Yes, I am using CD2. 3.9.2 par 3 means section 3.9.2, paragraph with
number 3 to the left of it.

Oleg.
--
Life is a sexually transmitted, 100% lethal disease.
---





Author: James Kanze <james-albert.kanze@vx.cit.alcatel.fr>
Date: 1997/05/14
Raw View
Oleg Zabluda <zabluda@math.psu.edu> writes:

|>  Ok, let me try to play a language lawyer for a second. I think there
|>  is an internal contradiction in the C++ DWP related to string literals.
|>
|>  [lex.string] says:
|>
|>    An ordinary string literal has type "array of n const char"
|>    and static storage duration.
|>
|>  And later:
|>
|>    Whether  all  string  literals  are  distinct  (that is, are
|>    stored in nonoverlapping objects)  is  implementation-defined.
|>    The effect of attempting to modify a string literal is undefined.
|>
|>
|>  I don't see how the first one can be reconciled with the second one.
|>  C++ arrays are objects, right? So they can't have overlapped
|>  memory locations.

The quoted text says that the type of a string literal is "array of
const char".  It doesn't say anything about whether two identical string
literals are the same object or not, or whether one is a sub-object of
the other.  I don't see the problem.

|>  Even if you make a special exception for
|>  arrays that come from string literals, the operators ==, <, <=
|>  and so on, for 'const char *' cease to be transitive. Pointers of the
|>  same type, pointing to the same memory location are guaranteed to
|>  compare equal (at least after conversion to void*), and the result
|>  of applying operators <, <=, ... to the pointers pointing inside
|>  the same container is guaranteed to produce meaningful results
|>  (3.9.2 par 3,4; 5.9.2 par 2), yet the result of a comparison of
|>  pointers pointing into different arrays is undefined (5.9.2 par 2).
|>  This breaks transitivity for all of the comparison operators.

I don't see where, except that in general, the operators are only
transitive where they are defined.  Since all string literals *may* be
separate objects, comparison for inequality between pointers based on
separate string literals is undefined behavior.

|>  It becomes even more unclear if you consider string literals,
|>  partially overlapping in the memory:
|>  "Hello, World" and "World". Or even crazier:
|>
|>  "Hello, World", "World", and "World\0x00, Hello".

What isn't clear?  Could you provide an example?

|>  Another thing. I think that the warning that "The effect of
|>  attempting to modify a string literal is undefined" is not needed.

Agreed.  It is a left-over from the time when the type of a string
literal was char[], and not char const[].

--
James Kanze      home:     kanze@gabi-soft.fr        +33 (0)1 39 55 85 62
                 office:   kanze@vx.cit.alcatel.fr   +33 (0)1 69 63 14 54
GABI Software, Sarl., 22 rue Jacques-Lemercier, F-78000 Versailles France
     -- Conseils en informatique industrielle --
---





Author: fjh@mundook.cs.mu.OZ.AU (Fergus Henderson)
Date: 1997/05/14
Raw View
Barry Margolin <barmar@bbnplanet.com> writes:

>>[lex.string] says:
>
>>  Whether  all  string  literals  are  distinct  (that is, are
>>  stored in nonoverlapping objects)  is  implementation-defined.
>>  The effect of attempting to modify a string literal is undefined.
>>
>Suppose you have
>
>char *a = "Hello, World";
>char *b = "World";
>char *c = "World";
>
>The quote from the DWP says that the values of (b == a+7) and (b == c) are
>implementation-defined.  The value of (a < b) is undefined, since the
>string literals may be different objects.

Nope, the quote from the DWP says that it is implementation-defined
whether all string literals are distinct (stored in nonoverlapping objects).
If they are, then all three expressions have unspecified results,
according to the Nov 96 WP [*].  If they are not, the implementation
doesn't have to specify exactly which objects are overlapped, so the
results are still unspecified.  So either way, the results are
unspecified -- not implementation-defined or undefined.

[*] But the fact that the result of `==' and `!=' on distinct objects is
unspecified is an incompatibility with C and is surely a mistake in the
Nov 96 WP.

--
Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>   |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3         |     -- the last words of T. S. Garp.
---





Author: Oleg Zabluda <zabluda@math.psu.edu>
Date: 1997/05/14
Raw View
Fergus Henderson <fjh@mundook.cs.mu.OZ.AU> wrote:
: Oleg Zabluda <zabluda@math.psu.edu> writes:

: >In short, string literals don't look like any other arrays I know.

: Yes, that's true.  But that's a deliberate feature of C and C++, not
: a problem in the DWP.

: Basically, objects that you declare yourself have object identity
: (unique addresses) but string literals just have value semantics
: and aren't guaranteed to have a unique identity (address).

: [snip]

: >This means that either string literals are not really arrays,

: Nope, string literals are really arrays.

: >or there are two different 'arrays of char'
: >with two different sets of rules.

: Nope.  There is only one type 'array of char'.

OK, this is my first small experiment in acting as a language lawyer,
and I am not sure yet what the actual rules of the game are. You are a
well-known language lawyer, so I am very grateful for your comments.
But the rules you play by seem extremely weird by my (math) standards.

You are saying something to the effect "A rectangle is a square with
non-equal sides". You say that static arrays are guaranteed not to have
overlapping memory locations. Then you say that string literals are static
arrays. Then you say that they are allowed to have overlapping memory
locations. I think that's crazy.

If I were to formulate the appropriate portion of the DWP, I would say
that string literals are _NOT_ char arrays. They have all the properties
of a static const char array with 3 (AFAIK) exceptions.

1. They may have overlapped memory locations.
2. There is a (deprecated) conversion "string literal" -> char*
3. You can use them to initialize char arrays.

Oleg.
--
Life is a sexually transmitted, 100% lethal disease.
---





Author: jpotter@falcon.lhup.edu (John Potter)
Date: 1997/05/14
Raw View
On 13 May 1997 17:45:24 PDT, Fergus Henderson wrote:

: jpotter@falcon.lhup.edu (John Potter) writes:

: >On 11 May 97 11:28:35 GMT, Fergus Henderson wrote:

 [ result of pointer comparison in different arrays is unspecified ]

: >: For `<' and `<=', that is intended, but for `==', that looks to me
: >: like a serious mistake in the draft.

: >I sometimes get less and sometimes greater
: >for comparisons not in same object for the same code on different
: >implementations.  But never equal.

: According to the Nov 96 WP (which I think is the same as CD2)
: you _can_ get equal.

:  int x, y;
:  assert(&x != &y); // according to Nov 96 WP, this might fail

I assume you are referring to the following from 5.9/2:

  --If two pointers p and q of the same type point to different
    objects that are not members of the same object or elements of the
    same array or to different functions, or if only one of them is
    null, the results of p<q, p>q, p<=q, and p>=q are unspecified.

And 5.10/1 extends this to == and !=, while 5.10/2 goes to great lengths
to assure that pointers to members are comparable to null and never
compare equal unless they actually point to the same member of the
same type (function) or complete object (data) or are both null.

It does seem to be a mistake and even inconsistent.  There are no
guidelines for reasonable "unspecified" behavior here.  An
implementation could define all six operators to return false or true
when the pointers are into different objects.  There could be some
reason of which I am ignorant for this and, if not, it is not likely a
simple editorial change to fix it.  I guess that we will have to
depend upon the sanity of a reasonable implementation not the
standard.

Been wrong before,
John
---





Author: Oleg Zabluda <zabluda@math.psu.edu>
Date: 1997/05/14
Raw View
James Kanze <james-albert.kanze@vx.cit.alcatel.fr> wrote:
: Oleg Zabluda <zabluda@math.psu.edu> writes:

: |>  Ok, let me try to play a language lawyer for a second. I think there
: |>  is an internal contradiction in the C++ DWP related to string literals.
: |>
: |>  [lex.string] says:
: |>
: |>    An ordinary string literal has type "array of n const char"
: |>    and static storage duration.
: |>
: |>  And later:
: |>
: |>    Whether  all  string  literals  are  distinct  (that is, are
: |>    stored in nonoverlapping objects)  is  implementation-defined.
: |>    The effect of attempting to modify a string literal is undefined.
: |>
: |>
: |>  I don't see how the first one can be reconciled with the second one.
: |>  C++ arrays are objects, right? So they can't have overlapped
: |>  memory locations.

: The quoted text says that the type of a string literal is "array of
: const char".  It doesn't say anything about whether two identical string
: literals are the same object or not, or whether one is a sub-object of
: the other.  I don't see the problem.

OK, maybe with identical strings you can argue that it's actually the
same object. But if strings are different, they are different objects.

Fergus Henderson showed that there is no requirement that different
objects occupy non-overlapping memory locations. And even the
definition of 'complete objects' is very unclear.

Still, static const char arrays are not allowed to occupy overlapping
memory locations. String literals are allowed to. In my view, this
means that string literals are not const char arrays with static
storage duration. It's something else, although very similar. There
is also this weird (for a const char array) standard implicit type
conversion "Hello" -> char*.

: |>  Even if you make a special exception for
: |>  arrays that come from string literals, the operators ==, <, <=
: |>  and so on, for 'const char *' cease to be transitive. Pointers of the
: |>  same type, pointing to the same memory location are guaranteed to
: |>  compare equal (at least after conversion to void*), and the result
: |>  of applying operators <, <=, ... to the pointers pointing inside
: |>  the same container is guaranteed to produce meaningful results
: |>  (3.9.2 par 3,4; 5.9.2 par 2), yet the result of a comparison of
: |>  pointers pointing into different arrays is undefined (5.9.2 par 2).
: |>  This breaks transitivity for all of the comparison operators.

: I don't see where, except that in general, the operators are only
: transitive where they are defined.  Since all string literals *may* be
: separate objects, comparison for inequality between pointers based on
: separate string literals is undefined behavior.

: |>  It becomes even more unclear if you consider string literals,
: |>  partially overlapping in the memory:
: |>  "Hello, World" and "World". Or even crazier:
: |>
: |>  "Hello, World", "World", and "World\0x00, Hello".

: What isn't clear?  Could you provide an example?

This is what I had in mind:

    "Hello, World" <----------------- str1
      ^    "World\0, Hello" <---------str2
      |       ^        ^
      |       |        |
      p       q        r

Here string literals str1 and str2 are presumed to _always_ be put
into the overlapping memory locations by the compiler, as indicated.
The DWP allows the compiler to give this guarantee. Now,
p and q always point into the same object ("Hello, World").
q and r always point into the same object too ("World\0, Hello").
Yet, p and r _never_ point into the same object. I see it as
breaking the transitivity rule:

p < q == true, q < r == true,  p < r can be false.

Oleg.
--
Life is a sexually transmitted, 100% lethal disease.
---





Author: Steve Clamage <stephen.clamage@eng.sun.com>
Date: 1997/05/14
Raw View
Fergus Henderson wrote:
>
> According to the Nov 96 WP (which I think is the same as CD2)
> you _can_ get equal.
>
>         int x, y;
>         assert(&x != &y);       // according to Nov 96 WP, this might fail

According to the way I read 5.10 "Equality operators", pointers
to two complete objects of the same type that are not part of the
same array MUST compare unequal.

BTW, the November draft and the December CD2 are not the same.
An error was made in assembling the chapters of the November
draft, and one of the included chapters was an uncorrected
earlier version. The December draft corrected that error.
(I don't remember which chapter was wrong, but it wasn't
chapter 5.)
--
Steve Clamage, stephen.clamage@eng.sun.com
---





Author: James Kanze <james-albert.kanze@vx.cit.alcatel.fr>
Date: 1997/05/14
Raw View
Oleg Zabluda <zabluda@math.psu.edu> writes:

|>  Fergus Henderson <fjh@mundook.cs.mu.OZ.AU> wrote:
|>  : Oleg Zabluda <zabluda@math.psu.edu> writes:
|>
|>  : >Ok, let me try to play a language lawyer for a second. I think there
|>  : >is an internal contradiction in the C++ DWP related to string literals.
|>  : >
|>  : >[lex.string] says:
|>  : >
|>  : >  An ordinary string literal has type "array of n const char"
|>  : >  and static storage duration.
|>  : >
|>  : >And later:
|>  : >
|>  : >  Whether  all  string  literals  are  distinct  (that is, are
|>  : >  stored in nonoverlapping objects)  is  implementation-defined.
|>  : >  The effect of attempting to modify a string literal is undefined.
|>  : >
|>  : >I don't see how the first one can be reconciled with the second one.
|>  : >C++ arrays are objects, right?
|>
|>  : Right.  But not necessarily distinct objects.
|>
|>  : >So they can't have overlapped memory locations.
|>
|>  : How does that follow?
|>
|>  : The C++ draft does not prohibit objects from having overlapping
|>  : memory locations, as far as I can tell.
|>
|>  Yeh. Now that I've tried to find something in the standard that
|>  says otherwise, I failed. I am still not sure if it can't be somehow
|>  deduced indirectly, at least for complete objects (1.7 par 2).

I think it is "understood".  The normal interpretation of "different
object" means that modifying one will not result in a modification of
another, so obviously, modifiable objects cannot overlap, even if the
standard doesn't say so explicitly.  Since this is implicit, I would
expect it to apply to non-modifiable objects as well unless the standard
explicitly said otherwise (as it does in the case of string literals).

|>  If it is allowed, say for const objects with static storage duration
|>  and trivial constructor and destructor, I think DWP must explicitly
|>  say so. DWP also requires different objects to have different
|>  addresses. Are there any exceptions for const objects with
|>  static storage duration in the appropriate section (which one is
|>  it, BTW?).

If there were a problem, this would be it.  Consider:

 assert( &("world"[2]) != &("Hello, world"[9]) ) ;

If the two string literals are different objects, then so are their
sub-objects, and so they must have different addresses.  However, the
draft doesn't say that string literals are different objects, or even
that they are objects, period.  All it says is that they have TYPE
char[].

(Actually, it's not all that clear.  A string literal used in an
expression is an lvalue, and an lvalue normally designates an object.
Still, I think that the intent is more than clear.)

|>  In short, string literals don't look like any other arrays I know.

That's because they aren't arrays:-).  They only have the type "array of
char".  (Seriously: I think that they are arrays, and they are objects,
and the draft gives them explicit rights to violate some of the
guarantees that are otherwise given.)

|>  For example, which of the following arrays can overlap?
|>
|>  const char a[] = "xy";
|>  const char b[] = "xy";
|>
|>  const char* const c = "xy";
|>  const char* const d = "xy";
|>
|>  Obviously a and b can't (??), while the string literals used to
|>  initialize c and d can. This means that either string literals
|>  are not really arrays, or there are two different 'arrays of char'
|>  with two different sets of rules.

Well, since the draft explicitly states a different set of rules for the
two cases, I don't see the problem.

Note that the problem isn't really one of "overlapping" per se; since
the values aren't modifiable, no program can tell.  The problem is one
of addresses, and what addresses must compare differently.  Thus, for
example, in the above, it is guaranteed that "a != b", and for all legal
values of i and j, "&a[ i ] != &b[ j ]".  There is no such guarantee
concerning the values of c and d.
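
A sketch of that difference, reusing the declarations of a, b, c and d from the quoted example. The two function names are invented for illustration. The first checks the address guarantee for the named arrays element by element; the second merely reports what this implementation did with the two literals.

```cpp
#include <cstddef>

// Two named arrays: each is its own object, initialised with a copy of "xy".
const char a[] = "xy";
const char b[] = "xy";

// Two pointers to literals: the literals themselves may share storage.
const char* const c = "xy";
const char* const d = "xy";

// True when no element of a has the same address as any element of b.
// For two separately declared arrays this must hold.
bool arrays_are_distinct() {
    for (std::size_t i = 0; i < sizeof a; ++i) {
        for (std::size_t j = 0; j < sizeof b; ++j) {
            if (&a[i] == &b[j]) {
                return false;
            }
        }
    }
    return true;
}

// May be true or false; the answer is a property of the implementation.
bool literals_share_storage() {
    return c == d;
}
```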

|>  Here is another example of that:
|>
|>  char* e = a;    // illegal, a is a const char[3]
|>  char* f = "xy"; // legal, although deprecated. "xy" is also
|>                  // a const char[3]

This is historical.  Let's face it, string literals aren't exactly a
rare and exotic feature, and any change in them is likely to break
programs seriously.  Originally, there was no "const", and string
literals had type char[].  And of course, existing (C) code was just
full of things like the second line.

The C standards committee introduced const mainly as a way of specifying
that something could be put into write protected memory (ROM on embedded
systems, or a shared code segment under Unix, or...).  The C standards
committee also decided that string literals could go into write
protected memory (and weren't required to have distinct addresses).
Logically, there was never any doubt that they should thus have had type
"char const[]".  However, they weren't willing to break the 99% of all C
programs which contained things like the above, so string literals
became a special case: a non-const lvalue that wasn't modifiable.

By the time C++ became mainstream, const existed, and early in C++,
const correctness became a theme.  Since assigning a string literal to a
"char*" implies that it can be modified, it shouldn't occur in modern
programs.  Even so, there was long discussion concerning this in the
standards committee, and it was with some hesitation that the change was
made.
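
For code that really does need to modify the characters, a const-correct alternative to "char* f = "xy";" is to copy the literal into storage the program owns. The following is a minimal sketch using std::string; the function name capitalised_copy is invented for illustration.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: takes the literal through a pointer to const,
// copies it, and modifies only the copy. The literal itself is never
// written to.
std::string capitalised_copy(const char* text) {
    std::string copy(text);
    if (!copy.empty()) {
        // The cast to unsigned char keeps std::toupper well defined for
        // characters with negative values.
        copy[0] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(copy[0])));
    }
    return copy;
}
```

A plain array such as "char buf[] = "xy";" serves the same purpose when the size is fixed, since the array is initialised with a copy of the literal's characters.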

FWIW: even making string literals non-modifiable was a major step.  In
K&R (pre-ANSI) C, string literals were guaranteed to be distinct AND
modifiable.  In fact, despite this guarantee in K&R I, some compilers
did share them, and exploiting the fact that they were distinct and
modifiable was likely to cause a program to fail on some platforms.  It
was also thought that programs which did modify them were probably not
that frequent, and certainly not very maintainable anyway (since you
couldn't trust the contents of the string literal to be equal to what
the listing said).  Nevertheless, Sun (and certainly others) still
keep string literals distinct and put them in writable memory, to avoid
breaking legacy code.  (Proving the customer an idiot has never been a
successful commercial policy, even in the cases where it's true.)

|>  : >Even if you make a special exception for
|>  : >arrays that come from string literals, the operators ==, <, <=
|>  : >and so on, for 'const char *' cease to be transitive.
|>
|>  : Those operators are already not transitive for any pointer type, in the
|>  : general case, because the results of comparisons between objects that
|>  : are not sub-objects of the same complete object are undefined.  (If you
|>  : stick to the well-defined cases, then the operators are transitive.)
|>
|>  Can you give one more example when a < b is guaranteed to be true,
|>  b < c is guaranteed to be true, but a < c is undefined?

No.  But this is equally true for pointers into string literals.  The
result of a < b is only defined if a and b point into the same object.
Since distinct string literals may be distinct objects, the result is
not defined if a and b point into different string literals.

--
James Kanze      home:     kanze@gabi-soft.fr        +33 (0)1 39 55 85 62
                 office:   kanze@vx.cit.alcatel.fr   +33 (0)1 69 63 14 54
GABI Software, Sarl., 22 rue Jacques-Lemercier, F-78000 Versailles France
     -- Conseils en informatique industrielle --
---