Thread

Topic: character array initialization

Author: "mauro" <m.russo@uniplan.it>
Date: Tue, 11 Jan 2011 09:59:34 CST


Dear all,

I recently noticed that all the C++ compilers I use [some commercial, some
free] reject the following character array declaration:

1)
char MyArray[2] = { "a"[0], "b"[0] };

whereas they accept all of the following:

2)
char MyArray[2] = { ("a")[0], "b"[0] };

3)
char MyArray[2] = { 'a', "b"[0] };

4)
char MyArray[2] = { ("a")[0], 'b' };

I discussed the question with the support teams of the corresponding IDEs.
One of them eventually sent me the following message:

>>>>
char MyArray[2] = { "a"[0], 'b' };

This is not list-initialisation.  It appears to be list-initialisation until
it is closer inspected.  It is essentially writing char MyArray[2] = {
"a" };.  If the superfluous braces are removed it is then char MyArray[2] =
"a";.

The compiler correctly identifies the invalid character [ and gives a
suitable error message so there is not a problem.
<<<<

So I downloaded some drafts of the standard that are relevant to my question,
in particular n3225.pdf [N3225=10-0215], from
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/#mailing2010-11
and n1905.pdf [N1905=05-0165] (I forget which page I got it from).

Paragraph "8.5.2 Character arrays" (p. 209) of n3225.pdf reads

"A char array (whether plain char, signed char, or unsigned char), char16_t
array, char32_t array,
or wchar_t array can be initialized by a narrow character literal, char16_t
string literal, char32_t
string literal, or wide string literal, respectively, or by an
appropriately-typed string literal
enclosed in braces. ..."

and paragraph "8.5.2 Character arrays" (p. 171) of n1905.pdf reads

"A char array (whether plain char, signed char, or unsigned char) can be
initialized by a string-literal (optionally enclosed in braces); a wchar_t
array can be initialized by a wide string-literal (optionally enclosed in
braces); ..."
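
To make the quoted wording concrete, here is a minimal sketch of the
initializer forms it covers, next to an ordinary element list. The array and
function names are hypothetical, and the subscripted literal is parenthesized
because that is the form reported above as accepted by every compiler tried.

```cpp
#include <cassert>
#include <cstring>

// Plain string-literal initializer: 'a' followed by the terminating null.
char plain[2] = "a";

// The same string literal enclosed in braces (the form 8.5.2 mentions).
char braced[2] = { "a" };

// An ordinary element list: every element is a char expression.
// ("a")[0] is parenthesized, as in declaration 2) above.
char listed[2] = { ("a")[0], 'b' };

// Checks that the braced and unbraced literal forms produce the same array,
// and that the element list really stores two separate characters.
void check_forms() {
    assert(std::memcmp(plain, braced, sizeof plain) == 0);
    assert(listed[0] == 'a' && listed[1] == 'b');
}
```

The first two declarations initialize the array from one string literal; the
third initializes it element by element, which is what declaration 1) intends.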

First, I note that the word "optionally" has disappeared from the newer
document, n3225.pdf.

Anyway, whether or not the braces are still optional, I believe that
declaration 1) should be accepted, because it is not ambiguous.

Maybe the question comes down to how a string literal is interpreted:
as a comma-separated list of chars (I mean a textual substitution, just
like a macro) or as a single entity. I have no experience of how
the standard treats this question.
In any case, if declaration 1) is rejected, then I believe that declarations
2) and 3) should also be rejected, because of "b"[0].

Before asking for your opinion, let me explain why I needed this particular
declaration.
Many IDEs define some macros automatically. In my case I wanted to use the
__DATE__ macro, which expands to the compilation date as a string literal,
but I wanted to reorder its characters,
and so I tried something like

char MyArray[MyDateLength] = { __DATE__[MyFirstDateChar],
__DATE__[MySecondDateChar], ... }
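
A minimal sketch of that idea, written with the parenthesized subscript that
the compilers above accepted. It assumes the standard "Mmm dd yyyy" layout of
__DATE__; the target order "dd Mmm yyyy" and the names ReorderedDate and
check_reordered_date are hypothetical choices for illustration.

```cpp
#include <cassert>

// __DATE__ expands to a string literal of the form "Mmm dd yyyy":
// offsets 0..2 hold the month, 4..5 the day, 7..10 the year.
// Each subscript is applied to a parenthesized literal.
const char ReorderedDate[12] = {
    (__DATE__)[4], (__DATE__)[5], ' ',                            // day
    (__DATE__)[0], (__DATE__)[1], (__DATE__)[2], ' ',             // month
    (__DATE__)[7], (__DATE__)[8], (__DATE__)[9], (__DATE__)[10],  // year
    '\0'
};

// Compares the reordered array against the unmodified __DATE__ string.
void check_reordered_date() {
    const char original[] = __DATE__;
    assert(sizeof original == 12);            // 11 characters plus the null
    assert(ReorderedDate[0] == original[4]);  // first day character
    assert(ReorderedDate[3] == original[0]);  // first month character
    assert(ReorderedDate[7] == original[7]);  // first year digit
}
```

Because the array ends with an explicit '\0', it can still be passed to
functions that expect a null-terminated string.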

Regards,
Mauro Russo.



--
[ comp.std.c++ is moderated.  To submit articles, try posting with your ]
[ newsreader.  If that fails, use mailto:std-cpp-submit@vandevoorde.com ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "mauro" <m.russo@uniplan.it>
Date: Tue, 11 Jan 2011 15:19:04 CST

Another support team sent me this text:

>>>
This is a bug in "[THEIR IDE] extension" of the standard.
The standard doesn't allow you to initiate with "xx"[1]
The workaround is to use the parentheses .
<<<

Does anyone think the standard should explicitly state
that my example declaration is conforming?

char MyArray[2] = { "a"[0], "b"[0] };
and also
[example 5)] char MyArray[2] = { "a"[0], 'b' };

Regards,
Mauro.



Author: petergo@microsoft.UUCP (Peter GOLDE)
Date: 15 Nov 90 18:40:56 GMT

In article <513@taumet.com> steve@taumet.com (Stephen Clamage) writes:
>petergo@microsoft.UUCP (Peter GOLDE) writes:
>
>>In section 8.4.2, the C++ standard disallows the legal ANSI C
>>initialization:
>>char cv[4] = "asdf";
>
>I agree that this restriction will break some existing code.  But how
>often is it essential to use a fixed-size array of characters without
>the terminating null?  To cater to those special cases, IMHO very rare,
>C introduces weird semantics for array initialization.  In addition, it
>seems to me that
> char cv[4] = "asdf";
>is more likely to be an error than a deliberate attempt to save one byte.

A good argument.  I might even accept it, were I designing a language
myself.  But many of the best C language experts in the country
(the ANSI C committee) considered this argument, and rejected it.
The issue should be closed now.

BTW, just because string initialization of the type mentioned
is "very rare" in YOUR code, don't think that it's very rare
in other people's code.  0 termination is not the best string
representation for many applications.

---
Peter Golde        petergo%microsoft@uunet.uu.net

Author: jimad@microsoft.UUCP (Jim ADCOCK)
Date: 16 Nov 90 20:19:51 GMT

In article <59110@microsoft.UUCP> petergo@microsoft.UUCP (Peter GOLDE) writes:
|In article <513@taumet.com> steve@taumet.com (Stephen Clamage) writes:
|>petergo@microsoft.UUCP (Peter GOLDE) writes:
|>
|>>In section 8.4.2, the C++ standard disallows the legal ANSI C
|>>initialization:
|>>char cv[4] = "asdf";
|>
|>I agree that this restriction will break some existing code.  But how
|>often is it essential to use a fixed-size array of characters without
|>the terminating null?  To cater to those special cases, IMHO very rare,
|>C introduces weird semantics for array initialization.  In addition, it
|>seems to me that
|> char cv[4] = "asdf";
|>is more likely to be an error than a deliberate attempt to save one byte.
|
|A good argument.  I might even accept it, were I designing a language
|myself.  But many of the best C language experts in the country
|(the ANSI C committee) considered this argument, and rejected it.
|The issue should be closed now.
|
|BTW, just because string initialization of the type mentioned
|is "very rare" in YOUR code, don't think that it's very rare
|in other peoples code.  0 termination is not the best string
|representation for many applications.

In this and in other areas where C has muddled type rules,
perhaps backwards compatibility and other "special" needs can be
met via the extern "C" construct?  In particular strings could be
interpreted according to the rules of the language requested?

extern "Pascal" { char cv[] = "asdf"; }

for example?

I'd put my vote in for C++ trying to clean up type rules somewhat,
rather than trying to maintain the last iota of backwards compatibility
[with K&R C, ANSI C, C++ 1.2, C++ 2.0, ....????]  The drive to maintain
backwards compatibility has made some of the type inference rules so
obtuse that compiler writers can't figure them out, let alone programmers
actually use them intelligently.

In my opinion C++ is not C, and the needs of object oriented programmers
are sufficiently different than the needs of C programmers that fine
grained details of ANSI-C ought to be considered again, rather than
accepted unthinkingly as "THE" answer.  Let's try to clean things up a
little bit around the edges.

Author: steve@taumet.com (Stephen Clamage)
Date: 18 Nov 90 01:27:32 GMT

petergo@microsoft.UUCP (Peter GOLDE) writes:

>BTW, just because string initialization of the type mentioned
>is "very rare" in YOUR code, don't think that it's very rare
>in other peoples code.  0 termination is not the best string
>representation for many applications.

A good point.  Nonetheless, 0 termination is the standard in C.
There is no support anywhere else in the language for strings
which are not 0-terminated.  You have to write special code to
handle them wherever such are used.
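
As a rough illustration of the "special code" meant here, the sketch below
handles a four-character field that has no terminating null. The helper names
field_to_string and field_equals are hypothetical, and std::string is used
only for convenience.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Four characters, no terminating null (brace form, valid in C++).
const char cv[4] = { 'a', 's', 'd', 'f' };

// std::strlen(cv) would read past the end of the array, so the length has
// to travel alongside the pointer.
std::string field_to_string(const char* data, std::size_t length) {
    return std::string(data, length);
}

// Compares a fixed-size field against an ordinary null-terminated string.
bool field_equals(const char* field, std::size_t length, const char* text) {
    return std::strlen(text) == length &&
           std::memcmp(field, text, length) == 0;
}

// Exercises both helpers on the unterminated array.
void check_field() {
    assert(field_to_string(cv, sizeof cv) == "asdf");
    assert(field_equals(cv, sizeof cv, "asdf"));
}
```

Every call site has to supply sizeof cv itself, which is the extra burden
being described.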
--

Steve Clamage, TauMetric Corp, steve@taumet.com

Author: cimshop!davidm@uunet.UU.NET (David S. Masterson)
Date: 18 Nov 90 01:56:12 GMT

>>>>> On 16 Nov 90 20:19:51 GMT, jimad@microsoft.UUCP (Jim ADCOCK) said:

Jim> In my opinion C++ is not C, and the needs of object oriented programmers
Jim> are sufficiently different than the needs of C programmers that fine
Jim> grained details of ANSI-C ought to be considered again, rather than
Jim> accepted unthinkingly as "THE" answer.  Let's try to clean things up a
Jim> little bit around the edges.

I think I'd agree with this, but let's make it a question instead.  Is there
anyone out there doing something with C++ where they need absolute ANSI-C
compatibility at compile time?  What are the benefits of taking so draconian
an approach to a language that's still growing?  To me, the ability to link
ANSI-C functions with C++ code is more than sufficient and, in some cases,
helps improve the code by separating programming styles.  It's nice for the
programmer learning curve for there to be a significant overlap in ANSI-C and
C++, but there is already divergence, so why not attempt to improve things?
--
====================================================================
David Masterson     Consilium, Inc.
(415) 691-6311     640 Clyde Ct.
uunet!cimshop!davidm    Mtn. View, CA  94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"

Author: jamiller@hpcupt1.cup.hp.com (Jim Miller)
Date: 20 Nov 90 19:13:21 GMT

>programmer learning curve for there to be a significant overlap in ANSI-C and
>C++, but there is already divergence, so why not attempt to improve things?
>--
>David Masterson     Consilium, Inc.


Because C code is being brought into C++ wholesale in some cases.
An earlier thread *seemed* to complain about ANY divergence from ANSI C or
maybe even K&R I C (at least that's what I felt some comments said).

IMO, anything that would give people (managers or programmers) reasons NOT
to adopt C++ should be avoided.  These reasons, mind you, need not
be valid, just there.  One big reason is/would be the dreaded
"incompatibility": the mere existence of such things (never mind that they
might not happen) keeps people away.

The reason I supported the  char x[4]="abcd";  incompatibility is that
it is *usually* an error.  If it were not, I'd be against it.  Even so I
would be willing to support not making the change if it seems that anyone
is really bothered about it (by "really" I mean code is affected).  I would
also be forced to back off by anyone pushing the "no incompatibility"
banner in the interest of acceptance.

   jim - do I have to be consistant in my arguments? - miller
   jamiller@hpmpeb7.cup.hp.com
   (a.k.a James A. Miller; Jim the JAM; stupid; @!?$$!; ... )
   Anything I say will be used against me ...
   But my company doesn't know or approve or condone anything of mine here.

Author: daves@ex.heurikon.com (Dave Scidmore)
Date: 21 Nov 90 23:19:07 GMT

In article <59110@microsoft.UUCP> petergo@microsoft.UUCP (Peter GOLDE) writes:
|In article <513@taumet.com> steve@taumet.com (Stephen Clamage) writes:
|>petergo@microsoft.UUCP (Peter GOLDE) writes:
|>
|>>In section 8.4.2, the C++ standard disallows the legal ANSI C
|>>initialization:
|>>char cv[4] = "asdf";
|>
|>I agree that this restriction will break some existing code.  But how
|>often is it essential to use a fixed-size array of characters without
|>the terminating null?  To cater to those special cases, IMHO very rare,
|>C introduces weird semantics for array initialization.  In addition, it
|>seems to me that
|> char cv[4] = "asdf";
|>is more likely to be an error than a deliberate attempt to save one byte.
|
|A good argument.  I might even accept it, were I designing a language
|myself.  But many of the best C language experts in the country
|(the ANSI C committee) considered this argument, and rejected it.
|The issue should be closed now.

This argument is based on the false assumption that the ANSI C committee
was trying to adopt the ideal set of 'C' language features as a standard.
It was not; in fact the ANSI 'C' specification specifically states that the
ANSI 'C' committee's charter was to CODIFY EXISTING PRACTICE.  It would be
far more correct to say "many of the best C language experts in the country
(the ANSI C committee) considered this argument and found it did not correspond
to existing practice".

The ANSI committee did not set out to enhance or correct language features.
The goals for the development of C++ are far different.  Since the
existing-practice criterion did not apply to its development, the goal was
primarily to enhance the language and correct perceived problems with the
original 'C' language.

Because of stronger type checking and other changes in the language,
it is nearly impossible to port 'C' code to 'C++' anyway; this is just another
area of incompatibility to deal with.
--
Dave Scidmore, Heurikon Corp.
dave.scidmore@heurikon.com

Author: petergo@microsoft.UUCP (Peter GOLDE)
Date: 10 Nov 90 00:32:31 GMT

In section 8.4.2, the C++ standard disallows the legal ANSI C
initialization:

char cv[4] = "asdf";

Since C++ is supposed to be as close as possible to ANSI C
without compromising its features, I don't see what purpose
this restriction serves.  This is a useful feature which
ANSI C very deliberately put into the standard; having
C++ not allow it is a plain nuisance.

Now I have to recode my strings using brace notation: what
a pain!
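
For reference, a minimal sketch of what that recoding looks like, assuming a
C++11 compiler for static_assert. The names cv4, cv5 and check_arrays are
hypothetical.

```cpp
#include <cassert>

// char cv[4] = "asdf";   // rejected in C++: the literal needs 5 elements

// Brace notation: exactly four characters, no terminating null.
char cv4[4] = { 'a', 's', 'd', 'f' };

// Leaving the bound out lets the compiler count the terminating null too.
char cv5[] = "asdf";

static_assert(sizeof cv4 == 4, "four characters, no terminator");
static_assert(sizeof cv5 == 5, "four characters plus the terminator");
static_assert(sizeof "asdf" == 5, "the literal itself has five elements");

// Runtime view of the same facts.
void check_arrays() {
    assert(cv4[3] == 'f');   // last element is a real character
    assert(cv5[4] == '\0');  // last element is the terminator
}
```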

Peter Golde  petergo%microsoft@uunet.uu.net

Author: steve@taumet.com (Stephen Clamage)
Date: 10 Nov 90 17:34:16 GMT

petergo@microsoft.UUCP (Peter GOLDE) writes:

>In section 8.4.2, the C++ standard disallows the legal ANSI C
>initialization:
>char cv[4] = "asdf";

First of all, there is no C++ standard yet.  Mr Golde seems to be referring
to E&S, a reference book which at the moment does not correspond entirely
to any commercial C++ compiler.  The ANSI C++ X3J16 standards committee
will eventually produce a C++ standard, and it will look a lot like E&S
without the annotations.  But E&S is not the C++ standard.

>Now I have to recode my strings using brace notation: what a pain!

I agree that this restriction will break some existing code.  But how
often is it essential to use a fixed-size array of characters without
the terminating null?  To cater to those special cases, IMHO very rare,
C introduces weird semantics for array initialization.  In addition, it
seems to me that
 char cv[4] = "asdf";
is more likely to be an error than a deliberate attempt to save one byte.
--

Steve Clamage, TauMetric Corp, steve@taumet.com

Author: jbuck@galileo.berkeley.edu (Joe Buck)
Date: 10 Nov 90 20:56:44 GMT

In article <58962@microsoft.UUCP>, petergo@microsoft.UUCP (Peter GOLDE) writes:
> In section 8.4.2, the C++ standard disallows the legal ANSI C
> initialization:
>
> char cv[4] = "asdf";

I just checked out that section.  Gack!  Why the hell did Stroustrup
do that?  The rationale is that the string "asdf" really has five
characters (trailing null), but the meaning is obvious and ANSI C
allows it.

Ellis and Stroustrup's commentary says (p. 153):

 In this, C++ differs from ANSI C, where the example is allowed
 and is intended to be a convenience to the programmer.

What possible rationale is there for this change?

--
Joe Buck
jbuck@galileo.berkeley.edu  {uunet,ucbvax}!galileo.berkeley.edu!jbuck

Author: jimad@microsoft.UUCP (Jim ADCOCK)
Date: 12 Nov 90 19:47:30 GMT

In article <39503@ucbvax.BERKELEY.EDU> jbuck@galileo.berkeley.edu (Joe Buck) writes:
|In article <58962@microsoft.UUCP>, petergo@microsoft.UUCP (Peter GOLDE) writes:
|> In section 8.4.2, the C++ standard disallows the legal ANSI C
|> initialization:
|>
|> char cv[4] = "asdf";
|
|I just checked out that section.  Gack!  Why the hell did Stroustrup
|do that?  The rationale is that the string "asdf" really has five
|characters (trailing null), but the meaning is obvious and ANSI C
|allows it.
|
|Ellis and Stroustrup's commentary says (p. 153):
|
| In this, C++ differs from ANSI C, where the example is allowed
| and is intended to be a convenience to the programmer.
|
|What possible rationale is there for this change?

A possible rationale would be if C++ were attempting to move towards
stricter type checking of array types than ANSI-C.  In that regard,
however, C++ seems to be promulgating the confusion about whether arrays
are exact types or inexact in their first dimension.  Note in the above
example, the first dimension of an array is required to match exactly.
However, in function parameters, the first dimension is ignored --
a parameter char[10] is considered analogous to a char*.
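
The two behaviours being contrasted can be sketched as follows, assuming a
C++11 compiler for decltype and static_assert. The function names are
hypothetical and the functions are only declared, never called.

```cpp
#include <cassert>
#include <type_traits>

// An array parameter: the bound 10 is not part of the function's type...
void takes_array(char buffer[10]);
// ...so this declaration has exactly the same function type.
void takes_pointer(char* buffer);

static_assert(std::is_same<decltype(takes_array),
                           decltype(takes_pointer)>::value,
              "a char[10] parameter is adjusted to char*");

// For objects, the bound is part of the type and must match exactly.
static_assert(!std::is_same<char[4], char[5]>::value,
              "char[4] and char[5] are distinct types");

// Runtime-callable form of the first check.
bool parameter_bound_is_ignored() {
    return std::is_same<decltype(takes_array),
                        decltype(takes_pointer)>::value;
}
```

So the first dimension is significant when an array object is declared and
initialized, but not when the same declarator appears as a parameter.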

I'd vote for cleaning this up by making arrays exact types everywhere
including their first dimension, retaining [sigh] implicit conversions
of arrays to pointers to their first members -- but only when assigning
an array to a pointer. To retain historical behavior then, people would
have to declare parameters as char*'s and not char[10]'s.

An alternate approach would be to always consider arrays inexact in their
first dimension, except when allocating storage.  This weakens type
safety considerably.

Either way, choose one or the other!  Let's not have rules implying
inexact array types in some scenarios, and rules implying exact array types
in other scenarios.

Author: jbuck@galileo.berkeley.edu (Joe Buck)
Date: 12 Nov 90 19:59:12 GMT

petergo@microsoft.UUCP (Peter GOLDE) writes:
>In section 8.4.2, the C++ standard disallows the legal ANSI C
>initialization:
>char cv[4] = "asdf";
In article <513@taumet.com>, steve@taumet.com (Stephen Clamage) writes:
> First of all, there is no C++ standard yet.  Mr Golde seems to be referring
> to E&S, a reference book which at the moment does not correspond entirely
> to any commercial C++ compiler.  The ANSI C++ X3J16 standards committee
> will eventually produce a C++ standard, and it will look at lot like E&S
> without the annotations.  But E&S is not the C++ standard.

True; however, the standards committee has adopted E&S as the base document
for the standard.  This probably means that any nit in E&S that is not
screamed about by the user community will wind up in the standard.

> To cater to those special cases, IMHO very rare,
> C introduces weird semantics for array initialization.  In addition, it
> seems to me that
>  char cv[4] = "asdf";
> is more likely to be an error than a deliberate attempt to save one byte.

The same argument could have been made against this construct in ANSI C;
however, it is permitted there.  Gratuitous incompatibilities with ANSI
C which yield no significant advantage should be avoided; they only make
it harder for people to convert from C to C++.

--
Joe Buck
jbuck@galileo.berkeley.edu  {uunet,ucbvax}!galileo.berkeley.edu!jbuck

Author: jamiller@hpcupt1.cup.hp.com (Jim Miller)
Date: 12 Nov 90 21:19:49 GMT

>seems to me that
> char cv[4] = "asdf";
>is more likely to be an error than a deliberate attempt to save one byte.
>--
>
>Steve Clamage, TauMetric Corp, steve@taumet.com

I second that emotion.

   jim miller
   jamiller@hpmpeb7.cup.hp.com
   (a.k.a James A. Miller; Jim the JAM; stupid; @!?$$!; ... )
   Anything I say will be used against me ...
   But my company doesn't know or approve or condone anything of mine here.