Topic: Thoughts about a standard string_argument to unify char* and


Author: "Alf P. Steinbach" <alfps@start.no>
Date: Sun, 29 Apr 2007 17:57:05 CST
* Pablo Halpern:
> On Mar 31, 3:05 pm, "Andrei Alexandrescu (See Website For Email)"
> <SeeWebsiteForEm...@erdani.org> wrote:
>> Wow. It would be great if the language could distinguish (by e.g.
>> assigning a different type) a literal from a const char*.
>
> I was thinking about exactly this problem earlier today.  Such a
> feature would most often be associated with string literals, but could
> be useful for any kind of pointer-to-statically-allocated object.  I
> was thinking of a syntax something like this (using basic_string) as
> an example:
>
> basic_string(static const charT* s) {
>     // s is a string literal or a pointer to a const array of char,
>     // so we don't need to copy its contents.
> }
>
> basic_string(const charT* s) {
>    // s is NOT a string literal.  We must copy its contents
> }
>
> The idea is that, within an argument declaration, the word "static" is
> used to modify a pointer or reference so that it will only match (and
> will be the preferred match) if the actual argument is a pointer or
> reference to statically-allocated storage.  An important restriction,
> though, is that a pointer to non-const static storage will NOT match
> an argument of type pointer-to-static-const-object.  In other words,
> the normal cvq promotions will not occur for pointer arguments that
> are declared static.  That restriction does make it a bit weird.
>
> Any opinions/refinements?

Regarding weirdness, consider

   char const * const data[] = { "a", "b", "c" };

   void foo( char static const * static const * strings ) { ... }

   int main() { foo( data ); }

The possible occurrence of multiple "static" keywords in a declarator is
an additional weirdness.  But perhaps it's better (no introduction of
new keyword) than writing

   void foo( char static_const * static_const strings ) { ... }

In another direction, a template based syntax has been proposed for this
kind of thing (except that as far as I can see the problem of character
string literals, which do not have external linkage, was not discussed)
in 2004, by Daniel Frey, <url: http://preview.tinyurl.com/yoo3m2>.

Repeating Daniel's original example, where "ICE" means "Integral
Constant Expression"  --  some generalization needed for strings:

   void f( int i ) // #1
   {
       // Used if f is called with a non-ICE
   }

   template< int I > void f( const int I ) // #2
   {
       // Used if f is called with an ICE != 2
   }

   template<> void f<2>( const int ) // #3
   {
       // Specialized if f is called with an ICE == 2
   }

   int main()
   {
       int i = 2;
       f(i); // calls #1
       f(1); // calls #2
       f(2); // calls #3
   }

And the ensuing discussion (see URL above) brought to view that
essentially the same mechanism had been suggested in 2002 by Aleksey
Gurtovoy, <url: http://tech.groups.yahoo.com/group/boost/message/24271>.

From that perspective, what's needed seems to be not new syntax, but
just new semantics in order to make this mechanism work for integral
values and in order to make it deal conveniently with literal strings
and perhaps  --  ouch  --  floating point values.
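
For comparison, the closest thing that compiles under the existing rules moves the constant into an explicit template argument, so the call changes from f(2) to f<2>(). The following is only a sketch of that workaround, not Daniel's or Aleksey's proposal; the int return values are an assumption added here so that the selected function can be observed.

```cpp
#include <cassert>

// #1: ordinary function, used for run-time values.
inline int f(int /*i*/) { return 1; }

// #2: primary template, used when the value is supplied at compile time.
template <int I>
int f() { return 2; }

// #3: explicit specialization for the compile-time value 2.
template <>
inline int f<2>() { return 3; }

// Usage, mirroring main() in the example above.
inline void usage_example()
{
    int i = 2;
    assert(f(i) == 1);   // run-time value: #1
    assert(f<1>() == 2); // constant other than 2: #2
    assert(f<2>() == 3); // constant equal to 2: #3
}
```

The caller has to choose the spelling, which is exactly the burden the proposed semantics would remove.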

In a third direction, this may not be needed.  First, it's complicating
the language just to make a generally premature optimization less
unsafe.  Second, that safety might be achieved in other ways, such as
registering (pointers to) static data with a static data repository, so
that it can be determined efficiently at run-time whether the data is
static or not  --  if measurements show that the copying is too costly,
with that scheme it may be possible to fix it by registering the data,
e.g. by using macros to declare things.
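
A minimal sketch of such a repository, assuming a std::set of addresses is efficient enough. All names here (static_data_registry, static_data_registrar, REGISTERED_STATIC_STRING, must_copy) are hypothetical and made up for illustration.

```cpp
#include <cassert>
#include <set>

// Hypothetical registry of addresses known to refer to static, immutable data.
class static_data_registry
{
public:
    static void add(const void* p) { pointers().insert(p); }
    static bool contains(const void* p) { return pointers().count(p) != 0; }

private:
    // A function-local static sidesteps initialization-order problems
    // between translation units.
    static std::set<const void*>& pointers()
    {
        static std::set<const void*> instance;
        return instance;
    }
};

// Helper whose constructor performs the registration during static
// initialization.
struct static_data_registrar
{
    explicit static_data_registrar(const void* p) { static_data_registry::add(p); }
};

// Hypothetical macro: declares a static const char array and registers it.
#define REGISTERED_STATIC_STRING(name, text)                      \
    static const char name[] = text;                              \
    static const static_data_registrar name##_registrar(name)

REGISTERED_STATIC_STRING(greeting, "hello");

// A consumer can then decide at run time whether a copy is required.
inline bool must_copy(const char* s)
{
    return !static_data_registry::contains(s);
}
```

Unregistered static data is simply copied, so forgetting the macro costs time but not correctness.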

Disclaimer: I haven't really thought this through, just Googled.

Cheers,

- Alf

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]





Author: Daniel <danne_esset@hotmail.com>
Date: Fri, 30 Mar 2007 09:06:34 CST
Gennaro Prota Wrote:
> On Wed, 28 Mar 2007 16:09:26 GMT, Daniel Svensson wrote:
>> Many functions like ifstream::open() should be operloaded in C++09  to
>> handle std::string as well as plain char pointers.
> Do you find that to be a good idea?
I think it is a good idea to be able to use std::string where you
were previously forced to use either a C string or a call to
string::c_str().

>> To not force multiple
>> definitions (and implementations) of the same function,
> What do you mean by "multiple definitions"? (As to the
> implementation(s), it's all the usual dance of forwarding to a common
> "back-end" function --not a real problem IMHO)
I understand that, but the resulting executables can become bigger than
needed, and with a single function slightly less code would be required.

I made a test DLL exporting 100 simple functions. One version of the DLL
used a string_arg class; the other did it the usual way with
overloading, resulting in 200 functions.  The char* and string_arg
functions share exactly the same implementation code, while each
std::string overload forwards to the corresponding const char* function.
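
The two variants might look roughly like this for a single operation. This is a sketch under assumptions: the string_arg shown is a stripped-down stand-in, and length_v1/length_v2 are hypothetical names, not the functions in the test DLL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical minimal wrapper: accepts either a C string or a std::string.
class string_arg
{
    const char* str_;   // not owned

public:
    string_arg(const char* s) : str_(s) {}
    string_arg(const std::string& s) : str_(s.c_str()) {}
    const char* c_str() const { return str_; }
};

// Variant 1: one function per operation.
std::size_t length_v1(string_arg s)
{
    return std::strlen(s.c_str());
}

// Variant 2: two functions per operation.
std::size_t length_v2(const char* s)
{
    return std::strlen(s);          // same body as length_v1
}

std::size_t length_v2(const std::string& s)
{
    return length_v2(s.c_str());    // forwards to the const char* overload
}
```

Callers write the same thing in both variants; only the number of functions behind the call differs.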

Then I made a program (dlluser) that called each of these functions.
Here are the different sizes of the files produced using VS 8.0 Release.

            | using string_arg | using overloading | Difference
------------+------------------+-------------------+-----------
dll         | 12 288 bytes     | 22 528 bytes      | >10 kB
dlluser     | 15 360 bytes     | 21 504 bytes      | >6 kB

I have realized that this is probably not the best solution for
standard C++ since, to benefit from it, existing functions would have to
change their signatures. Still, new libraries such as filesystem and
regex could make good use of it.

Daniel Svensson






Author: Alberto Ganesh Barbati <AlbertoBarbati@libero.it>
Date: Fri, 30 Mar 2007 11:27:09 CST
Gennaro Prota ha scritto:
> On Wed, 28 Mar 2007 16:09:26 GMT, Daniel Svensson wrote:
>
>> Many functions like ifstream::open() should be operloaded in C++09  to
>> handle std::string as well as plain char pointers.
>
> Do you find that to be a good idea?

Personally I think so. In fact this trick is so simple and convenient
that I would like it proposed for boostification.

>
>> To not force multiple
>> definitions (and implementations) of the same function,
>
> What do you mean by "multiple definitions"? (As to the
> implementation(s), it's all the usual dance of forwarding to a common
> "back-end" function --not a real problem IMHO)
>

This trick has a narrower scope than forwarding and because of that it's
much simpler to use and apply. Writing one function that works for both
char* and strings is just a one-token change and it allows you to forget
putting c_str() all over the place. I find this very useful. Don't you
hate to write:

  ostringstream path;
  // compose path
  ifstream file(path.str().c_str()); // ugly

or worse:

  string name;
  ifstream file((name + ".txt").c_str()); // ugly

Well, if the c_str() could be left out, I would be much happier.

[Note: please let's not focus on ifstreams and pathnames specifically...
I know there are better alternatives like boost::filesystem, but cases
like the one above pop up in a lot of places.]

It's also easily extensible:

template <class T>
const char* get_c_str(const T& str)
{
   return str.c_str();
}

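class string_arg
{
    const char* _str;   // not owned; must outlive the string_arg

public:
    string_arg(const char* str)
        : _str(str)
    {}

    // Without this overload a non-const char* argument would select the
    // template below (exact match) and fail to compile in get_c_str().
    string_arg(char* str)
        : _str(str)
    {}

    // Any other string-like type goes through the get_c_str() accessor.
    template <class T>
    string_arg(const T& str)
        : _str(get_c_str(str))
    {}

    operator const char*() const { return _str; }
    const char* c_str() const    { return _str; }
};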

By specializing the get_c_str() template, user-provided string-like
classes can also be supported. The library need not even be aware of
that, one single signature per function needs to be provided and will
work in any case.
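
As an illustration of that extension point, here is a sketch with a user type that has no c_str() member. The class is restated so that the block compiles on its own; fixed_name and name_length are hypothetical names, and the extra char* constructor is an addition made here so that non-const pointers do not select the template constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Generic accessor: works for any type with a c_str() member.
template <class T>
const char* get_c_str(const T& str)
{
    return str.c_str();
}

class string_arg
{
    const char* _str;   // not owned

public:
    string_arg(const char* str) : _str(str) {}
    string_arg(char* str) : _str(str) {}   // addition: keeps char* away from the template

    template <class T>
    string_arg(const T& str) : _str(get_c_str(str)) {}

    operator const char*() const { return _str; }
    const char* c_str() const    { return _str; }
};

// Hypothetical user-provided string-like class without a c_str() member.
struct fixed_name
{
    char text[16];
};

// The user's specialization; the library code above stays unchanged.
template <>
inline const char* get_c_str<fixed_name>(const fixed_name& name)
{
    return name.text;
}

// Hypothetical library function with a single signature.
inline std::size_t name_length(string_arg name)
{
    return std::strlen(name.c_str());
}
```

A call such as name_length(s + ".txt") is safe because the temporary std::string lives until the end of the full expression, but a string_arg must not be stored beyond the call.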

Just my opinion,

Ganesh






Author: "Andrei Alexandrescu (See Website For Email)" <SeeWebsiteForEmail@erdani.org>
Date: Sat, 31 Mar 2007 13:05:24 CST
Mathias Gaunard wrote:
> On Mar 28, 6:09 pm, Danne_es...@hotmail.com (Daniel Svensson) wrote:
>> Many functions like ifstream::open() should be operloaded in C++09  to
>> handle std::string as well as plain char pointers. To not force multiple
>> definitions (and implementations) of the same function, one may solve
>> this by implementing a class with the sole purpose of letting the user
>> use either string or char* without any special overhead.
>
> A better solution would be to only provide a constructor with a string
> type and design the string type so that constructing one from a
> literal doesn't copy.
> That would mean the string type is immutable.

Wow. It would be great if the language could distinguish (by e.g.
assigning a different type) a literal from a const char*.

As for the OP's question, don't we need a total of four overloads (for
wchar_t and wstring as well)?


Andrei

P.S. The typo "operloaded" suggests the word "operloading" as a
contraction for "operator overloading" :o).






Author: SeeWebsiteForEmail@erdani.org ("Andrei Alexandrescu (See Website For Email)")
Date: Sun, 1 Apr 2007 19:53:26 GMT
James Kanze wrote:
> On Mar 31, 9:05 pm, "Andrei Alexandrescu (See Website For Email)"
> <SeeWebsiteForEm...@erdani.org> wrote:
>> Mathias Gaunard wrote:
>>> On Mar 28, 6:09 pm, Danne_es...@hotmail.com (Daniel Svensson) wrote:
>>>> Many functions like ifstream::open() should be operloaded in C++09  to
>>>> handle std::string as well as plain char pointers. To not force multiple
>>>> definitions (and implementations) of the same function, one may solve
>>>> this by implementing a class with the sole purpose of letting the user
>>>> use either string or char* without any special overhead.
>
>>> A better solution would be to only provide a constructor with a string
>>> type and design the string type so that constructing one from a
>>> literal doesn't copy.
>>> That would mean the string type is immutable.
>
>> Wow. It would be great if the language could distinguish (by e.g.
>> assigning a different type) a literal from a const char*.
>
> It can.  A literal is not a const char*, but a char const[N].
> Which are two distinct types in C++.
>
> Of course, the implicit conversion makes it somewhat difficult
> to exploit the difference (as does the two iterator idiom,
> because if I want iterators, I can't directly use a string
> literal, but have to declare a variable).

And as Mathias mentioned, literals further convert to char* for C
compatibility. But even if that were deprecated, you can't share
addresses of literals for two reasons:

1. You can't tell a literal from a mutable char[N] because "const" is at
the same time a type qualifier and a storage class. Consider:

auto& a = "Hi!"; // a refers to a const char[4]
char b[4];
const char (&c)[4] = b;
// a and c are indistinguishable, yet c's content can be mutated

2. You can't tell a static const char[N] from a stack-allocated const
char[N]. This is because static is a storage class and not a type qualifier.

This all means that you can't add a constructor to string taking a const
char[N] that actually just stores the pointer in knowledge that that
pointer refers to immutable memory of static duration.
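
Both points can be checked with a function template taking a reference to a const char array. This is a small sketch; array_length and same_overload_for_all_three are hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <cstddef>

// Any constructor that wanted to "detect literals" would need a parameter
// of this shape; here it just reports the array length.
template <std::size_t N>
std::size_t array_length(const char (&)[N])
{
    return N;
}

inline bool same_overload_for_all_three()
{
    char mutable_buf[4] = "abc";          // automatic storage, mutable
    const char local_const[4] = "xyz";    // automatic storage, const

    std::size_t a = array_length("Hi!");        // literal, static storage
    std::size_t b = array_length(mutable_buf);  // binds just as well
    std::size_t c = array_length(local_const);  // so does this

    mutable_buf[0] = 'X';   // the "const" view did not make the data immutable

    return a == 4 && b == 4 && c == 4;
}
```

All three calls bind to the same parameter type, so nothing inside array_length can tell which argument is safe to share.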


Andrei






Author: alfps@start.no ("Alf P. Steinbach")
Date: Mon, 2 Apr 2007 02:32:03 GMT
* Andrei Alexandrescu (See Website For Email):
>
> And as Mathias mentioned, literals further convert to char* for C
> compatibility. But even if that were deprecated, you can't share
> addresses of literals for two reasons:
>
> 1. You can't tell a literal from a mutable char[N] because "const" is at
> the same time a type qualifier and a storage class. Consider:
>
> auto& a = "Hi!"; // a refers to a const char[4]
> char b[4];
> const char (&c)[4] = b;
> // a and c are indistinguishable, yet c's content can be mutated
>
> 2. You can't tell a static const char[N] from a stack-allocated const
> char[N]. This is because static is a storage class and not a type
> qualifier.
>
> This all means that you can't add a constructor to string taking a const
> char[N] that actually just stores the pointer in knowledge that that
> pointer refers to immutable memory of static duration.

Well, you can, it's even a FAQ.  Well, all right, it's not a FAQ item in
its own right, just a part of a FAQ item, and that item isn't
specifically about detecting literals.  But, it applies:

   "The (even easier) non-technical approach is to put a big fat ugly
   comment next to the class definition. The comment could say, for
   example, // We'll fire you if you [...] or even just [...]. Some
   programmers balk at this because it is enforced by people rather than
   by technology, but don't knock it on face value: it is quite effective
   in practice."

The C++ standard library isn't as type safe as it could be, it is in
most cases very pragmatically type unsafe, relying on programmers being
sensible rather than the compiler saving them from doing silly things.

A /standard/ practical, efficient immutable string carrier class that
made the pragmatic choice of treating char const[n] as a literal  --
unless otherwise instructed  --  would IMHO be great.
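
A rough sketch of what that pragmatic choice could look like. The names string_carrier, copy_of and is_literal are hypothetical, and a std::string member stands in for whatever storage a real design would use for the copied case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

class string_carrier
{
public:
    // Pragmatic choice: any char const[N] is assumed to be a literal, so
    // only the pointer is stored.  Passing a local array here is the
    // "silly thing" the programmer is trusted not to do.
    template <std::size_t N>
    string_carrier(const char (&literal)[N])
        : literal_(literal), owned_()
    {}

    // "Otherwise instructed": the caller asks explicitly for a copy.
    static string_carrier copy_of(const char* s)
    {
        return string_carrier(std::string(s));
    }

    const char* c_str() const { return literal_ ? literal_ : owned_.c_str(); }
    bool is_literal() const   { return literal_ != nullptr; }

private:
    explicit string_carrier(const std::string& s)
        : literal_(nullptr), owned_(s)
    {}

    const char* literal_;   // non-null only for the assumed-literal case
    std::string owned_;     // holds the copy otherwise
};
```

Because c_str() is computed on demand rather than cached, copying a string_carrier that owns its data does not leave a dangling pointer.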

But once such a class is proposed, SomeOne will probably start arguing
that it should have thread-safe reference counting (for the case where
it's initialized with a non-literal), customizable destruction, real
string content instead of just a sequence of encoding values, etc. ad
nauseam, so much baggage that the proposal would never fly.

I'd like the customizable destruction (otherwise that wheel has to be
reinvented every time you get a string from some C API that requires the
string to be destroyed using a special function), but nothing else.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?






Author: AlbertoBarbati@libero.it (Alberto Ganesh Barbati)
Date: Mon, 2 Apr 2007 15:09:06 GMT
Gennaro Prota ha scritto:
> On Sun,  1 Apr 2007 14:40:30 CST, James Kanze wrote:
>>
>> (And I have a question: I interpreted Gennaro's question to
>> concern only whether the functions in question should be
>> overloaded.  You seem to interpret it as concerning the
>> proposed technique.  So which is it?)

I think the technique in itself has a value and should be considered in
the design of future library additions.

About existing classes, as you pointed out, replacing the const char*
signature is not feasible because of backward compatibility, but adding
a new signature might be feasible. Of course, it should be considered
case by case whether such a signature should be added or not.

>
> Whether it *should* be overloaded. Actually even more: whether a
> constructor, or any other member, taking a basic_string, should exist
> at all. Am I paranoid about coupling?
>

By using this technique, with the additional indirection of the
get_c_str() accessor, there would not be such a strict coupling with
basic_string, as far as I see it. In fact we provide the user with the
capability to use his/her own string-like classes seamlessly.

Ganesh






Author: greghe@pacbell.net (Greg Herlihy)
Date: Mon, 2 Apr 2007 22:41:54 GMT


On 4/1/07 1:39 PM, in article
1175349363.714091.265050@p77g2000hsh.googlegroups.com, "James Kanze"
<james.kanze@gmail.com> wrote:

> On Mar 30, 11:39 pm, gennaro.pr...@yahoo.com (Gennaro Prota) wrote:
>> On Fri, 30 Mar 2007 10:46:36 CST, James Kanze wrote:
>>> On Mar 29, 4:55 pm, gennaro.pr...@yahoo.com (Gennaro Prota) wrote:
>>>> On Wed, 28 Mar 2007 16:09:26 GMT, Daniel Svensson wrote:
>>>>> Many functions like ifstream::open() should be operloaded in C++09  to
>>>>> handle std::string as well as plain char pointers.
>
>>>> Do you find that to be a good idea?
>
>>> The committee apparently does; it's already in the draft.
>
>> What about you? Do you find it to be a good idea?
>
> Given the current starting point, yes.  As I said, you can't do
> away with the char const* functions.  And you certainly want to
> support both std::string and string literals; anything less
> would be perverse.  As I also said, when I implemented my own
> versions of such things, way, way back in pre-standard days, I
> actually used something like what the original poster suggested.
> I don't think that this is an option today.  (I'm not sure that
> it was the best option even then, but it was an option.  And it
> worked pretty well for my style of programming, in the contexts
> I used it.)

What's the rationale for retaining a "const char *" constructor after a
"const std::string&" constructor has been added to a Standard Library class?
In other words, why not replace the current constructor instead of
overloading it?

It seems to me that the only kind of program that would break if the const
char * constructor were eliminated, would have to contain code so contrived
and so contrary to widely-accepted "best practices" that the chances that
any "real-world" C++ program would be adversely affected by eliminating the
current constructor - are close to none.

Greg






Author: pete@versatilecoding.com (Pete Becker)
Date: Tue, 3 Apr 2007 16:25:05 GMT
Greg Herlihy wrote:
>
> What's the rationale for retaining a "const char *" constructor after a
> "const std::string&" constructor has been added to a Standard Library class?
> In other words, why not replace the current constructor instead of
> overloading it?
>

Efficiency.
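
One way to read that: with only a string overload, every call that passes a C string first builds a temporary string, which typically means a copy of the characters. The sketch below makes that visible with a counting stand-in for std::string; counting_string, open_string_only and open_both are hypothetical names, not library functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for std::string that counts conversions from
// const char*.
struct counting_string
{
    static int conversions;
    std::string value;

    counting_string(const char* s) : value(s) { ++conversions; }
    const char* c_str() const { return value.c_str(); }
};

int counting_string::conversions = 0;

// Only the string overload exists: a C string argument is converted first.
std::size_t open_string_only(const counting_string& name)
{
    return std::strlen(name.c_str());
}

// Both overloads exist: a C string argument is used directly.
std::size_t open_both(const char* name)
{
    return std::strlen(name);
}

std::size_t open_both(const counting_string& name)
{
    return open_both(name.c_str());
}
```

Calling open_string_only("data.txt") increments the counter, while open_both("data.txt") leaves it unchanged.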

--

 -- Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com)
Author of "The Standard C++ Library Extensions: a Tutorial and
Reference." (www.petebecker.com/tr1book)
