Thread

Topic: Discussion: type_info

Author: sirwillard@my-deja.com
Date: 1999/12/24 Raw View

In article <83rv2e$iso$1@nnrp1.deja.com>,
  Gene Bushuyev <gbush@my-deja.com> wrote:
> In article <83o4gf$quk$1@nnrp1.deja.com>,
>   sirwillard@my-deja.com wrote:
> >
> > In article <83kjmg$cru$1@nnrp1.deja.com>,
> >   Gene Bushuyev <gbush@my-deja.com> wrote:
> > >
> > > In article <m3bt7qf9fs.fsf@grady.MtDiablo.com>,
> > >   Alan Hadsell <ahadsell@MtDiablo.com> wrote:
> > > [snip]
> > > >
> > > > The compiler can't tell when it's compiling a class whether it
> needs
> > > > to generate its type_info, because it doesn't necessarily know
> whether
> > > > anyone will use typeid() on it.
> > >
> > > It doesn't need to generate the type_info per se. It can generate
a
> > > code to produce type_info on demand.
> >
> > OK, give us some code that will do this.  Remember that it must meet
> > the following requirements:
> >
> > 1.  It must generate unique strings for every type.
> > 2.  It must insure the generated strings are the same for the types on
> > every run and across platforms and across library boundaries in the
> > case of dynamic libraries.
> > 3.  It must retain a small fool print on the overall code.
> >
>
> Ok, here is something from the top of my head. Since the biggest
> problem is with the long template names, they can be easily compressed.
> For example, compiler can create unique static names for all non-
> template types, and for template types use the pointers to the
> type_info of all the constituent parts, and recreate template names on
> demand.
> One can go even further and apply one of the existing compression
> algorithms to the type names. If everage size of a type name can be
> reduced to less than 100 bytes the problem with memory would be
> practically non-existent.

Compression techniques usually require large sets of data to get any
decent compression.  However, we can assume that our "names" will
follow certain patterns that normal data might not, allowing us to
achieve large compression with small data sets.  So, compression sounds
appropriate.  However, even if I go beyond what you said and say that
we'll compress the names to under 50 bytes, with just around 20 of
these we've bloated the code by a full Kilobyte.  A lot of us would
consider this unacceptable for any compiler, but for embedded compilers
this is way beyond unacceptable.  It would be my guess that this is
precisely the reason why the standard allows name and raw_name to
return anything, including a NULL pointer.  I'd be willing to bet that
we might even see compilers with switches that allow us to insure the
NULL pointer generation for these strings in order to optimize our code
size when we don't have any need for these facilities.  A mandatory
string per type is just too much bloat, IMHO, and honestly it serves no
real purpose to use strings.

Sent via Deja.com http://www.deja.com/
Before you buy.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: James Kuyper <kuyper@wizard.net>
Date: 1999/12/24 Raw View

Gene Bushuyev wrote:
....
> Ok, ok. You can call a thing that links compiled translation units a
> linker. Though it's hardly a linker that we used to. In order to
> support exported templates this linker still has to be a compiler.
> My point was just to note that compiler/linker distinction is
> dissapearing due to overlapping of their functions.

In terms of actual functionality some overlap has occurred, but
conceptually the 'export' keyword has actually helped return us to the
status quo before templates were introduced. It allows you to separate
the declaration and implementation of templated classes in essentially
the same way that you could always separate the declaration and
implementation of ordinary classes. This means that you can hide the
details of the implementation from the users of your code, reducing
their temptation to write code that's dependent on those details.
Furthermore, it means that their translation units needn't be
re-compiled, but only re-linked, if you change the exported definition.
Separate compilation is the big advantage; in the case of exported
templates, that advantage has been retained at the price of moving into
the linker some of the functionality that used to belong only in the
compiler.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: sirwillard@my-deja.com
Date: 1999/12/21 Raw View

In article <83kjmg$cru$1@nnrp1.deja.com>,
  Gene Bushuyev <gbush@my-deja.com> wrote:
>
> In article <m3bt7qf9fs.fsf@grady.MtDiablo.com>,
>   Alan Hadsell <ahadsell@MtDiablo.com> wrote:
> [snip]
> >
> > The compiler can't tell when it's compiling a class whether it needs
> > to generate its type_info, because it doesn't necessarily know whether
> > anyone will use typeid() on it.
>
> It doesn't need to generate the type_info per se. It can generate a
> code to produce type_info on demand.

OK, give us some code that will do this.  Remember that it must meet
the following requirements:

1.  It must generate unique strings for every type.
2.  It must insure the generated strings are the same for the types on
every run and across platforms and across library boundaries in the
case of dynamic libraries.
3.  It must retain a small fool print on the overall code.

I've been generous here and left out the requirement that the generated
string be human readable and meaningful to the type, which is the
implied (but not enforced) semantic of type_info::name.  These are the
characteristics everyone in this thread seems to want.  I think it's an
unreasonable desire and return back to my original proposal to just
generate a unique integral value per type with no gaurantees about
generating the same value except during a single run.

> [snip]
> > So only the linker has enough information to decide whether to include
> > a particular type_info object, and it can only do so if it can
> > determine the set of all types that might be returned from all
> > typeid() calls in the program.
>
> Does the standard say anything about the existance of a linker? This
> division is the heritage of the past I hope we are seeing the end of it.

The end of it how?  A linker must exist (even if you simply move it
inside the compiler, there's still a linkage phase) and the OPs point
remains.  Only the linking phase will allow for this optimization.
Unfortunately, with dynamic link libraries things get ugly indeed and
the optimization becomes impossible.  So we're left with your
proposal... only we need proof that such code can be generated.

Sent via Deja.com http://www.deja.com/
Before you buy.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: James Kuyper <kuyper@wizard.net>
Date: 1999/12/22 Raw View

Gene Bushuyev wrote:
...
> Does the standard say anything about the existance of a linker? This
> division is the heritage of the past I hope we are seeing the end of it.

Yes, it does. Many of the rules in the standard refer to translation
units; many others refer instead to entire programs. The standard
defines something called linkage which determines whether an object or
function can be visible in different translation units. The process of
linking translation units together into a complete program is described
in the standard.

The standard does not require that likage occur seperately from
translation, any more than it requires translation phase 1 to be
performed before phase 2. However, it does require a seperate set of
operations, that could be placed in a linker, and it's appropriate to
talk about those operations as being performed by a linker, whether or
not linkage is implemented using a seperate program.

I remember reading about a compiler which implemented something similar
to C++, but which treated all the code that made up the entire program
as a single translation unit. Of course, this made it non-conforming,
because it violates most of the rules that distinguish translation units
from entire programs. However, there's an awful
lot of code out there that wouldn't break if translated that way.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Gene Bushuyev <gbush@my-deja.com>
Date: 1999/12/23 Raw View

In article <83o4gf$quk$1@nnrp1.deja.com>,
  sirwillard@my-deja.com wrote:
>
> In article <83kjmg$cru$1@nnrp1.deja.com>,
>   Gene Bushuyev <gbush@my-deja.com> wrote:
> >
> > In article <m3bt7qf9fs.fsf@grady.MtDiablo.com>,
> >   Alan Hadsell <ahadsell@MtDiablo.com> wrote:
> > [snip]
> > >
> > > The compiler can't tell when it's compiling a class whether it
needs
> > > to generate its type_info, because it doesn't necessarily know
whether
> > > anyone will use typeid() on it.
> >
> > It doesn't need to generate the type_info per se. It can generate a
> > code to produce type_info on demand.
>
> OK, give us some code that will do this.  Remember that it must meet
> the following requirements:
>
> 1.  It must generate unique strings for every type.
> 2.  It must insure the generated strings are the same for the types on
> every run and across platforms and across library boundaries in the
> case of dynamic libraries.
> 3.  It must retain a small fool print on the overall code.
>

Ok, here is something from the top of my head. Since the biggest
problem is with the long template names, they can be easily compressed.
For example, compiler can create unique static names for all non-
template types, and for template types use the pointers to the
type_info of all the constituent parts, and recreate template names on
demand.
One can go even further and apply one of the existing compression
algorithms to the type names. If everage size of a type name can be
reduced to less than 100 bytes the problem with memory would be
practically non-existent.
-------------------------------------
Gene Bushuyev


Sent via Deja.com http://www.deja.com/
Before you buy.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Gene Bushuyev <gbush@my-deja.com>
Date: 1999/12/23 Raw View

In article <385E94F0.7157AEA4@wizard.net>,
  James Kuyper <kuyper@wizard.net> wrote:
> Gene Bushuyev wrote:
> ...
> > Does the standard say anything about the existance of a linker? This
> > division is the heritage of the past I hope we are seeing the end
of it.
>
> Yes, it does. Many of the rules in the standard refer to translation
> units; many others refer instead to entire programs. The standard
> defines something called linkage which determines whether an object or
> function can be visible in different translation units. The process of
> linking translation units together into a complete program is
described
> in the standard.

Ok, ok. You can call a thing that links compiled translation units a
linker. Though it's hardly a linker that we used to. In order to
support exported templates this linker still has to be a compiler.
My point was just to note that compiler/linker distinction is
dissapearing due to overlapping of their functions.

-------------------------------------
Gene Bushuyev

Sent via Deja.com http://www.deja.com/
Before you buy.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Alan Hadsell <ahadsell@MtDiablo.com>
Date: 1999/12/17 Raw View

phalpern@newview.org (Pablo Halpern) writes:

[snip]

> Producing a long, complete and portable name does not need to cause code
> bloat for programs that don't use the feature. If you never call
> typeid() for a class or any of its base classes, the compiler doesn't
> have to generate the typeinfo structure for it at all. If you never call
> type_info::name(), the compiler (or linker) does not need to include the
> names. The only reasons I can think of to call name() is for debugging,
> persistence, or networking. If you don't need to do one of these things,
> you don't get any penalty from having long, complete, portable names.
> (The current specification of type_info::name() doesn't support
> persistence or networking at all and does't provide portable support for
> debugging, so there's just about no reason to call type_info::name()
> today.)

[snip]

The compiler can't tell when it's compiling a class whether it needs
to generate its type_info, because it doesn't necessarily know whether
anyone will use typeid() on it.

The compiler also can't tell which classes to generate type_infos for
when it's compiling a typeid() call, because it doesn't know what the
dynamic type of the object it's invoked on will be.

So only the linker has enough information to decide whether to include
a particular type_info object, and it can only do so if it can
determine the set of all types that might be returned from all
typeid() calls in the program.

Of course, one obvious special case is if there are *no* typeid()
calls at all; in that case no type_info objects need to be included...

...unless, of course, you want to allow for typeid() calls in
dynamically loaded libraries.

--
Alan Hadsell
"Whatever does not kill me makes me stranger".

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Gene Bushuyev <gbush@my-deja.com>
Date: 1999/12/20 Raw View

In article <m3bt7qf9fs.fsf@grady.MtDiablo.com>,
  Alan Hadsell <ahadsell@MtDiablo.com> wrote:
[snip]
>
> The compiler can't tell when it's compiling a class whether it needs
> to generate its type_info, because it doesn't necessarily know whether
> anyone will use typeid() on it.

It doesn't need to generate the type_info per se. It can generate a
code to produce type_info on demand.

[snip]
> So only the linker has enough information to decide whether to include
> a particular type_info object, and it can only do so if it can
> determine the set of all types that might be returned from all
> typeid() calls in the program.

Does the standard say anything about the existance of a linker? This
division is the heritage of the past I hope we are seeing the end of it.

-------------------------------------
Gene Bushuyev

Sent via Deja.com http://www.deja.com/
Before you buy.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Hyman Rosen <hymie@prolifics.com>
Date: 1999/12/14 Raw View

Gene Bushuyev <gbush@my-deja.com> writes:
> template<class T> class A{};
> typeid(A<int>).name() returns "A<int>"
> typeid(A).name() returns "template<class T> class A"

A is not a type, and AFAIK typeid(A) is illegal.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: phalpern@newview.org (Pablo Halpern)
Date: 1999/12/14 Raw View

sirwillard@my-deja.com wrote:

>
>In article <384c8866.77375470@news.internetconnect.net>,
>  phalpern@newview.org (Pablo Halpern) wrote:
>
>> I wish that the long, complete name were required, because they make it
>> easier to build persistence and network-object mechanisms. But long
>> names take a long time to hash, so it would also be nice to have a fast
>> hashing function.
>
>I don't want any such thing.  Building persistence and network-object
>mechanisms without a unique name in type_info is not very difficult.
>Just to make it a little easier is not a good reason to modify
>type_info in a way that will cause tremendous code bloat.  This is
>especially true since most programs don't make use of these mechanisms
>any way.  IOW, you've severely bloated programs with functionality
>never used.

Producing a long, complete and portable name does not need to cause code
bloat for programs that don't use the feature. If you never call
typeid() for a class or any of its base classes, the compiler doesn't
have to generate the typeinfo structure for it at all. If you never call
type_info::name(), the compiler (or linker) does not need to include the
names. The only reasons I can think of to call name() is for debugging,
persistence, or networking. If you don't need to do one of these things,
you don't get any penalty from having long, complete, portable names.
(The current specification of type_info::name() doesn't support
persistence or networking at all and does't provide portable support for
debugging, so there's just about no reason to call type_info::name()
today.)

But as you say somewhere else, this thread is not about persistence, so
I won't push my desire for complete, portable names any further here.

---------------------------------------------------------------
Pablo Halpern                              phalpern@newview.org
Check out my new book, The C++ Standard Library from Scratch at
http://www.halpernwightsoftware.com/stdlib-scratch

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: James Kuyper <kuyper@wizard.net>
Date: 1999/12/14 Raw View

Pablo Halpern wrote:
...
> (The current specification of type_info::name() doesn't support
> persistence or networking at all and does't provide portable support for
> debugging, so there's just about no reason to call type_info::name()
> today.)

I agree that it could be more useful, but you're overstating the
problems.
You can use type_info::name() portably in the construction of an error
message. That part of the message will be more useful on some
implementations than on others, but that doesn't mean it's necessarily
useless.
There's no portable support for persistence, but type_info::name() can
be used on some implementations as a basis for an
implementation-specific persistence mechanism.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: sirwillard@my-deja.com
Date: 1999/12/11 Raw View

In article <82nk8i$hh6$1@nnrp1.deja.com>,
  Gene Bushuyev <gbush@my-deja.com> wrote:
> In article <82gmer$gqm$1@nnrp1.deja.com>,
>   sirwillard@my-deja.com wrote:
> [snip]
> > > They
> > > decided not to for some compelling reason. I suggest the same
> reason would
> > > apply to providing any hashable state.
> >
> > I already gave the most reasonable reason and it most definately
can't
> > be applied as a reason to not provide any hashing state.
> >
> > > Providing any other hashable state
> > > would be no easier than providing the name.
> >
> > Not at all true.
>
> In fact it is true. Moreover, this another key for hashing still
> doesn't have anything to build upon but the class name. See below.

I looked below, and you don't address this.  Providing an integral hash
value is straight forward and simple to implement given the
requirements already required for type_info.  I addressed how in this
very thread.

> [snip]
> >
> > No, let's not use strings for hashing at all if we can, thank you
very
> > much.  I've given two compelling reasons for this:  memory
> requirements
> > and the speed problems with string hashing functions.
>
> There are no big memory requirements. Even if my project contains 1000
> classes and all their names are 1000 chars long, it's just 1MB of
> virtual memory. For such a project it's absolutely unsignificant
amount.
> Speed also doesn't matter that much, because storing/loading
persistent
> objects is not frequent operation. Usually, it is done as a response
to
> user's input.

*sighs*  1 MB of virtual memory is a HUGE amount.  For many
applications this would be unforgivable, especially if the
functionality is never used.  Also don't forget this memory will also
be included in the static size of the executable.  As for saying speed
doesn't matter... that's just plain silly.  Hashing is used because it
provides faster lookups than tree based approaches like we already have
with the standard map and set classes.  Speed is the ONLY reason to
provide hashing.  Large string hash values would reduce the speed
advantage of a hash to none, or possibly even make it worse.  Further,
you mention persistent objects, but I never did.  Hashing is not a
function of persistance (though it can be employed by persistant
mechanisms).

> [snip]
>
> > I'm not familiar with the "hashing standards can of worms" you are
> > referring to.  Maybe we differ in opinion because of this.  I've yet
> to
> > see an algorithm yet that didn't reduce a hash down to an integral
> > type, which is the obvious thing to me.
>
> I don't see what obvious algorithm you have in mind. And reading other
> threads with similar proposals I see that people do not understand how
> persistent objects are used.

This thread is NOT about peristant objects.  It's about the design of
type_info, and the subthread about the lack of compatibility with
hashing.  This is NOT persistant objects.  I fully understand how
persistant objects are used, but I don't feel that type_info needs to
provide functionality for persistant mechanisms.

> What is needed to create persistent objects is some invariant that
> doesn't depend on compiler, the number of libraries and the order of
> their loading. Persistent object must have the same key every time the
> program is executed, even if it was edited and recompiled in between
> the runs. You can't assume any pointers will have the same absolute or
> relative values. You can't assume templates will be instantiating at
> the same order or that you will have the same number of them.
> The only invariant is the class name, but type_info blew this only
> chance of making persistent objects.

The only state that appears to be available is the name, but the name
is not unique, as you point out.  However, a good persistance mechanism
is probably going to use something other than a class name as the
object id.  Unique names are problematic at best, and a burden on the
system at worst.  In any event, type_info is not an appropriate place
to look for persistance mechanisms, IMHO.  I'm NOT suggesting we modify
type_info to provide automatic persistance.  It provides enough
functionality as it stands today to build robust persistance mechanisms
that won't burden classes and/or projects with code bloat when
persistance is not required.

> At present the ONLY portable way
> of building persistent objects is to add a unique static string or
> other type key to every class that is intended to be persistent. This
> is error prone and boring. The task that compiler could easily perform
> if the standard provided strict requirements on the uniqueness of
> type_info::name().

Which it won't do (at least I certainly hope they won't).  The VAST
majority of applications do not make use of persistance, and the cost
of unique names within RTTI mechanisms for them is too high.  I'd even
consider it too high for those that need it.  Something along the lines
of a GUID would be more palatable.

As for having such a thing in type_info... I don't think so.  The
compiler really can't be expected to produce such keys (i.e. keys that
are unique but identical across runs and even across platforms).  I can
think of no algorithm which could be employed which will gaurantee this
sort of thing.  Besides, this is the most minor of issues one will have
to address within a persistance mechanisms.

> As to your proposed integral value as a key. It won't work unless you
> somehow derive this value from class name. Possible, but doesn't make
> much sense.

Why won't it work?  You don't explain or back up this assertion at
all.  In fact, I don't buy the assertion at all.  I've already given a
proposal that would create the integral value as a key that doesn't
rely on the class name at all.  I could easily roll an implementation
tomorrow to support this idea (and the implementation would be portable
by todays standards).  The only problem is mine would have to be built
on top of the current type_info, and so would suffer in performance.
If we modified type_info directly, there'd be no such performance
problems.  The only hair in the soup here is one you've made up.
Persistance facilities directly in type_info was not a goal, nor should
it be, IMHO.

Sent via Deja.com http://www.deja.com/
Before you buy.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Andrei Alexandrescu" <andrewalex@hotmail.com>
Date: 1999/12/12 Raw View

Dave Harris <scorp@btinternet.com> wrote in message
news:memo.19991210152032.62499C@btinternet.com...
[discussion about type_info::name]

I think a reasonable set of requirements for type_info::name would be:

* for a primitive type, type_info::name returns the name of that type.
The position of const, presence of spaces etc. will be as follows:
yadda, yadda, yadda.
* for a class visible from the global namespace, type_info::name
returns the name of the class, appropriately qualified. A prepending
double colon will exist for all classes. For example, given:

class A { ... };
namespace N
{
    class B { class C; };
}

the names of the three classes will be: "::A", "::N::B", "::N::B::C".

The fact that a class is inaccessible due to protection does not
affect the rule above.

* for local classes which are invisible from the global namespace,
type_info::name will return a string that's unique for each local
class.

This set of changes would allow a number of important things:

1. It is compatible with the old rules, which basically guaranteed
nothing about type_info::name. That was easy.

2. It allows hashing and sorting based upon type names.

3. It allows the creator of a class know in advance what
type_info::name will return for that class. This is important because
it allows the class' creator to register that class with some services
that use type_info. It establishes a mutual simple long-distance
protocol.


Andrei




[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Gene Bushuyev <gbush@my-deja.com>
Date: 1999/12/13 Raw View

In article <s55a88v5378@news.supernews.com>,
  "Andrei Alexandrescu" <andrewalex@hotmail.com> wrote:
>
> Dave Harris <scorp@btinternet.com> wrote in message
> news:memo.19991210152032.62499C@btinternet.com...
> [discussion about type_info::name]
>
> I think a reasonable set of requirements for type_info::name would be:
>
> * for a primitive type, type_info::name returns the name of that type.
> The position of const, presence of spaces etc. will be as follows:
> yadda, yadda, yadda.
> * for a class visible from the global namespace, type_info::name
> returns the name of the class, appropriately qualified. A prepending
> double colon will exist for all classes. For example, given:
>
> class A { ... };
> namespace N
> {
>     class B { class C; };
> }
>
> the names of the three classes will be: "::A", "::N::B", "::N::B::C".
>
[snip]
I think it's a good suggestion. Except, I would drop the leading "::"
because otherwise all the classes would have :: prepended to the name.
Also, for templates, it makes sense to return full qualified name of
instantiated class; and use the name as it appears in declaration of
template if it wasn't instantiated. E.g.
template<class T> class A{};
typeid(A<int>).name() returns "A<int>"
typeid(A).name() returns "template<class T> class A"
It isn't realy a problem if the names are very long. I guess all
compilers already have this functionality since when error occures you
can see the error messages with full names of templates, even when such
a name is 10K long. Actually, compiler doesn't have to store static
strings for instantiated templates names. It can include a piece of
code to produce such name on demand.
-------------------------------------
Gene Bushuyev


Sent via Deja.com http://www.deja.com/
Before you buy.


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: sirwillard@my-deja.com
Date: 1999/12/09 Raw View

In article <384c8866.77375470@news.internetconnect.net>,
  phalpern@newview.org (Pablo Halpern) wrote:
>
> scorp@btinternet.com (Dave Harris) wrote:
> >
> >sirwillard@my-deja.com () wrote:
> >Contrariwise, if you have a perfect hash you can make a unique
string
> >directly from it. A hex representation of a smallish integer could
be
> >smaller than the integer itself. The two seem to me to be more or
less
> >equivalent. If you can have one you can have the other.
>
> Yes, but it seems reasonable that an implementation might provide a
> name() function that returns meaningful but non-unique name (e.g. an
> abbreviated name without namespaces, cv qualifiers, or even template
> parameters). By requiring the hash to be built on the name, you
require
> that the name either be a meaningless hex number or a potentially very
> long meaningful name.

This was precisely my point.  I must not have made it very clear in the
wording.

> I wish that the long, complete name were required, because they make
it
> easier to build persistence and network-object mechanisms. But long
> names take a long time to hash, so it would also be nice to have a
fast
> hashing function.

I don't want any such thing.  Building persistence and network-object
mechanisms without a unique name in type_info is not very difficult.
Just to make it a little easier is not a good reason to modify
type_info in a way that will cause tremendous code bloat.  This is
especially true since most programs don't make use of these mechanisms
any way.  IOW, you've severely bloated programs with functionality
never used.

> >> Actually, for most implementations I would think it would just be a
> >> matter of exposing the state that already exists to provide
> >> functionality for "before".  The only real change is a limitation
as to
> >> what this state can be.
> >
> >Yes... a limitation I dislike. If we're to promote hashing, than I
think
> >I'd prefer an opaque get_hash() function. It could use counters if
> >convenient, or the name, or some internal representation of the type
that
> >happened to be in use anyway - whatever suited the implementation.
>
> We're in agreement here.

As am I.  Again, I guess the way I worded things left some question as
to what I meant.  It would be folly to expose state directly instead of
through an accessor method.


Sent via Deja.com http://www.deja.com/
Before you buy.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Gene Bushuyev <gbush@my-deja.com>
Date: 1999/12/09 Raw View

In article <82gmer$gqm$1@nnrp1.deja.com>,
  sirwillard@my-deja.com wrote:
[snip]
> > They
> > decided not to for some compelling reason. I suggest the same
reason would
> > apply to providing any hashable state.
>
> I already gave the most reasonable reason and it most definately can't
> be applied as a reason to not provide any hashing state.
>
> > Providing any other hashable state
> > would be no easier than providing the name.
>
> Not at all true.

In fact it is true. Moreover, this another key for hashing still
doesn't have anything to build upon but the class name. See below.

[snip]
>
> No, let's not use strings for hashing at all if we can, thank you very
> much.  I've given two compelling reasons for this:  memory
requirements
> and the speed problems with string hashing functions.

There are no big memory requirements. Even if my project contains 1000
classes and all their names are 1000 chars long, it's just 1MB of
virtual memory. For such a project it's absolutely unsignificant amount.
Speed also doesn't matter that much, because storing/loading persistent
objects is not frequent operation. Usually, it is done as a response to
user's input.

[snip]

> I'm not familiar with the "hashing standards can of worms" you are
> referring to.  Maybe we differ in opinion because of this.  I've yet
to
> see an algorithm yet that didn't reduce a hash down to an integral
> type, which is the obvious thing to me.
>

I don't see what obvious algorithm you have in mind. And reading other
threads with similar proposals I see that people do not understand how
persistent objects are used.
What is needed to create persistent objects is some invariant that
doesn't depend on compiler, the number of libraries and the order of
their loading. Persistent object must have the same key every time the
program is executed, even if it was edited and recompiled in between
the runs. You can't assume any pointers will have the same absolute or
relative values. You can't assume templates will be instantiating at
the same order or that you will have the same number of them.
The only invariant is the class name, but type_info blew this only
chance of making persistent objects. At present the ONLY portable way
of building persistent objects is to add a unique static string or
other type key to every class that is intended to be persistent. This
is error prone and boring. The task that compiler could easily perform
if the standard provided strict requirements on the uniqueness of
type_info::name().
As to your proposed integral value as a key. It won't work unless you
somehow derive this value from class name. Possible, but doesn't make
much sense.

-------------------------------------
Gene Bushuyev

Sent via Deja.com http://www.deja.com/
Before you buy.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: scorp@btinternet.com (Dave Harris)
Date: 1999/12/10 Raw View

phalpern@newview.org (Pablo Halpern) wrote:
> Yes, but it seems reasonable that an implementation might provide a
> name() function that returns meaningful but non-unique name (e.g. an
> abbreviated name without namespaces, cv qualifiers, or even template
> parameters). By requiring the hash to be built on the name, you require
> that the name either be a meaningless hex number or a potentially very
> long meaningful name.

Firstly, I don't agree such a name() function would be reasonable. Eg some
idioms use large numbers of small namespaces, so throwing away the
namespace portion of the name would make the name useless. For example, I
put each enum into its own namespace:

    namespace Color { enum Type { red, green, blue, end }; }
    namespace Side { enum Type { left, right, top, bottom, end }; }

A name() which returned only "Type" for all types wouldn't be much good to
me. (This example doesn't use RTTI but I hope you get the point.)

Secondly, you have an excluded middle in your last sentence. There's no
reason why a unique ID couldn't be appended to your abbreviated names.
Thus we might have "Type$1", "Type$2" etc. This would produce names which
were meaningful, short and unique.

Thirdly, for hashing, uniqueness is actually over-specified. All we
require is that strcpy( a.name(), b.name() ) != 0 implies a != b. So your
abbreviated names would be correct for hashing. They would lead to more
collisions than we would like, but I think that should be a quality of
implementation issue.

As it stands, type_info::name() is all but useless IMHO. If it were
tightened up enough to be useful, that would also mitigate the hashing
problem to the point where I wouldn't be bothered about it. This seems the
more economical route to go.

  Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
      brangdon@cix.co.uk      |   And close your eyes with holy dread,
                              |  For he on honey dew hath fed
 http://www.bhresearch.co.uk/ |   And drunk the milk of Paradise."

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Francis Glassborow <francis@robinton.demon.co.uk>
Date: 1999/12/08 Raw View

In article <82ju8c$jbf$1@oak.prod.itd.earthlink.net>, AllanW <AllanW@my-
deja.com> writes
>AFAIK, any system that has a many-to-one relationship between pointers
>and addresses also has some concept that I'll call "normalization" which
>is already used when comparing two pointers for equality. This is by
>definition platform-dependant, so we can't give out algorithms, but we
>can give an example:
>
>    Change the bit pattern so that when it is interpreted as a pointer, the
>    address does not change, but when it is interpreted as an integer, the
>    value is lower. Continue until this change is no longer possible, and
>    then the bit pattern represents the normalized value.
>
>Naturally, for any real platform there will be equivalent algorithms that
>do not involve iteration.

There is no problem with comparing pointers for equality (even without
normalisation) however requiring that less than shall always 'work' is
requiring a complete ordering of addresses and that is something else.
>

Francis Glassborow      Journal Editor, Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA          +44(0)1865 246490
All opinions are mine and do not represent those of any organisation


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: James Kuyper <kuyper@wizard.net>
Date: 1999/12/08 Raw View

Michiel Salters wrote:
...
> What do you mean, being unable to hash pointers? AFAIK, there is
> no built-in C/C++ hash, and if you are rolling your own, one
> can be added just as easily for pointers. The hashed values won't be
> portable, which is fine since the pointers values themselves
> weren't to begin with. Other than that, I think it is perfectly
> portable.

Two pointers of the same type to the same object needn't have the same
bit pattern. I'd expect that to mess up most attempts to portably hash
pointers. Of course such pointers must compare equal, but the compiler
could perform a comparison mechanism that uses only part of the
information in the pointer to determine equivalence.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "James Kuyper Jr." <kuyper@wizard.net>
Date: 1999/12/08 Raw View

Francis Glassborow wrote:
....
> There is no problem with comparing pointers for equality (even without
> normalisation) however requiring that less than shall always 'work' is
> requiring a complete ordering of addresses and that is something else.

std::less<T*> is required to yield a total order, even if the built-in
'operator<' doesn't. Thus, the necessary work has already been done by
the implementation. Unfortunately, even though there's almost certainly
something associated with each pointer that could be used to hash it,
the standard doesn't require implementations to make that something
available to users.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: sirwillard@my-deja.com
Date: 1999/12/08 Raw View

In article <384c8866.77375470@news.internetconnect.net>,
  phalpern@newview.org (Pablo Halpern) wrote:
>
> scorp@btinternet.com (Dave Harris) wrote:
> >
> >sirwillard@my-deja.com () wrote:
> >Contrariwise, if you have a perfect hash you can make a unique string
> >directly from it. A hex representation of a smallish integer could be
> >smaller than the integer itself. The two seem to me to be more or less
> >equivalent. If you can have one you can have the other.
>
> Yes, but it seems reasonable that an implementation might provide a
> name() function that returns meaningful but non-unique name (e.g. an
> abbreviated name without namespaces, cv qualifiers, or even template
> parameters). By requiring the hash to be built on the name, you require
> that the name either be a meaningless hex number or a potentially very
> long meaningful name.

This was precisely my point.  I must not have made it very clear in the
wording.

> I wish that the long, complete name were required, because they make it
> easier to build persistence and network-object mechanisms. But long
> names take a long time to hash, so it would also be nice to have a fast
> hashing function.

I don't want any such thing.  Building persistence and network-object
mechanisms without a unique name in type_info is not very difficult.
Just to make it a little easier is not a good reason to modify
type_info in a way that will cause tremendous code bloat.  This is
especially true since most programs don't make use of these mechanisms
any way.  IOW, you've severely bloated programs with functionality
never used.

> >> Actually, for most implementations I would think it would just be a
> >> matter of exposing the state that already exists to provide
> >> functionality for "before".  The only real change is a limitation as to
> >> what this state can be.
> >
> >Yes... a limitation I dislike. If we're to promote hashing, than I think
> >I'd prefer an opaque get_hash() function. It could use counters if
> >convenient, or the name, or some internal representation of the type that
> >happened to be in use anyway - whatever suited the implementation.
>
> We're in agreement here.

As am I.  Again, I guess the way I worded things left some question as
to what I meant.  It would be folly to expose state directly instead of
through an accessor method.

Sent via Deja.com http://www.deja.com/
Before you buy.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: sirwillard@my-deja.com
Date: 1999/12/08 Raw View

In article <82ifvb$nif$1@trex.antw.online.be>,
  "TiTi" <Tha.Main.Man@Fraggin.Bastard.Com> wrote:
> > process.  During the design of type_info, name would have been the last
> > choice for several reasons:
> > 1.  In order to gaurantee unique name strings the string must become
> > much larger than is reasonable to store.  The mangled names (and
> > remember that name would be reasonable to return unmangled) for
> > template types already cause warnings in one known compiler because it
> > grew larger than 256 bytes.
>
> This is (AFAIK) only in debug-mode for MSVC compilers. I doubt if the same
> occurs for release mode.

Nope, it occurs in both modes (the generated name is identical
regardless of the mode you compile in).  You only receive a warning in
Debug because this is the only mode in which the mangled name is
written out to an actual file (i.e. not held in memory), where the
length is limited to 256 bytes.

> > 2.  String hashing algorithms are expensive, especially as the length
> > of the string grows (see 1 above).  Further, it's impossible to
> > gaurantee a perfect hash, which is desirable if not required.
>
> It is possible to construct hashkey algorithms for strings which are
> calculated in constant time (e.g. only use N chars from M char string
> (0<N<=M), hashsize is HS, I'm just trying something here:
>
>     unsigned hashkey(const char* pChar, int size)
>     {
>         if(size<=0)
>             return 0;
>         unsigned index = 0, result = 0, to_where = MIN( N, size );
>         for(int i = 0; i < to_where; i++)
>         {
>             result = ( (result << 5) + ( (unsigned) pChar[index] ) ) % HS;
>             index = ( ( index << 5 ) + ( (unsigned) pChar[index] ) ) %
> to_where;
>         }
>         return result;
>     }
>
> Anyway, I'm not telling that this is an ideal hash algorithm, but it's
> calculated in constant time however.

True, however:

1) It's still slower than typical hashing functions where the data type
is a numeric type to begin with.

2) Such algorithms have a greater likelyhood of clashes (i.e. non-
unique hash values).

3) This still doesn't address the problem of space efficiency.

> > 3.  The "collation" ordering required by "before" neatly aligns itself
> > with a method for providing a true hash with little expense.  In other
> > words, a simple integer type can be used, which is incremented with
> > each type added to the system.  This integer is a "perfect hash", as
> > well as well suited for implementing the collation ordering.
>
> Indeed, both for hash tables and for trees, this would come in handy.

Though for trees, the current "before" and equality operators are
sufficient.

> > No, let's not use strings for hashing at all if we can, thank you very
> > much.  I've given two compelling reasons for this:  memory requirements
> > and the speed problems with string hashing functions.
>
> Depends in what system you work on (speeds + memory), and what the algorithm
> is you're working on(!). In some algorithms, you can't just store things in
> a simple list, or you can wait for days to get some results (O(n^2) or
> worse). And calculating a hash key can be done faster than traversing a
> whole list to find the item (which still needs comparison!), especially if
> we're talking about lots 'o' data.

I think you miss my point here.  I'm not saying do without hashing, I'm
saying do without hashing on a string value.  I'm also talking
specifics here (i.e. in regards to type_info and the changes required
to allow hashing).

Sent via Deja.com http://www.deja.com/
Before you buy.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: scorp@btinternet.com (Dave Harris)
Date: 1999/12/06 Raw View

sirwillard@my-deja.com () wrote:
> I hope this was just a guess, and not the true reason.

They were all guesses. I wasn't anywhere near the committee.

> First of all, to design for something "that might be added by a future
> version" is a slippery slope to begin with.  Secondly, if anyone is
> confused as to what the behavior of operator< should be (it seems
> obvious to me) then the standard and all other documentation will
> clearly explain things.

It's not obvious to me. In fact, it means almost nothing; before() is an
arbitrary ordering that might not even be consistent from one run to
another. When I first read the description I was somewhat surprised.

The point is not so much what might be added in future, but what
expectations a programmer might reasonably have. Good design should
minimise surprise. Operator overloading should really be used when the
meaning is obvious.

Documentation can help, of course, but I don't think good documentation is
an excuse for bad design. Even as it is, the documentation has to take
pains to disabuse our reasonable expectations.

> > I would think any implementor who was able to provide a good
> > hashing value would also be able to provide unique names.
>
> This doesn't follow logically.  First of all, on MANY platforms,
> type_info is defined by the compiler vendor while the container classes
> (including the theoretical STL extension, hash_map) are provided by a
> third party.  In these cases it's not only possible, but likely, that
> although string hashing is available in the library, type_info::name
> does not provide a unique string.

That's true but irrelevant. You asked:

   Why was there no thought put into providing some state within
   type_info that could be portably used for hashing functions?

The name would be the obvious state to use. Although it's true that we
cannot use the name today because they are not guaranteed unique, it's
surely obvious that the committee thought about making them unique. They
decided not to for some compelling reason. I suggest the same reason would
apply to providing any hashable state. Providing any other hashable state
would be no easier than providing the name.

We both agree that the standard is not very helpful to programmers who
want to use hashing. My preferred fix would be to require unique strings
(since the name() interface is already in place and names would be useful
in other ways). Your preferred fix, apparently, is that names should be
left as they are and more state added just for hashing. I don't see what
advantage that approach has.

> As for the conventions that might be chosen in the future...
> those conventions would not be compromised by the inclusion of state
> info that could be hashed.  I don't believe that type_info was "right
> to avoid pre-empting those decisions" because of this.

What state info do you have in mind?

It seems to me that either you have something equivalent to a string or
else you use opaque state accessed through a hash function.

"Equivalent to a string" in that we don't know how many bits the
implementation will need, so it must be variable length. A variable length
array of char would be fine. Perhaps we could have name() and
hashable_name(), where the latter is guaranteed unique and the former
isn't. But that's silly; let's just place more requirements on name() and
use it for both tasks.

If instead you intend opaque state accessed through a hash function, you
have to decide what this function will be called and what type it returns.
This opens up the hashing standards can of worms.

Is there some 3rd approach I've missed? What would you have had the
committee actually do?

  Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
      brangdon@cix.co.uk      |   And close your eyes with holy dread,
                              |  For he on honey dew hath fed
 http://www.bhresearch.co.uk/ |   And drunk the milk of Paradise."
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Bill Wade <bill.wade@stoner.com>
Date: 1999/12/06 Raw View

sirwillard@my-deja.com wrote in message <81u646$kln$1@nnrp1.deja.com>...

>... hashing is a common programming technique.  [ strongly implies: ] RTTI
should
> provide state info consistent with the implementation of hashing
algorithms.

Complain about that if you will, but first raise a stink about being unable
to hash pointers (of any type) in either standard C or C++.  Most
non-trivial programs have many more objects than types.  I'd be interested
to hear about platforms where
   pointer == pointer
and
  less(pointer,pointer)
can be efficiently implemented but
  hash(pointer)
is expensive.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: sirwillard@my-deja.com
Date: 1999/12/06 Raw View

In article <memo.19991201143603.18269G@btinternet.com>,
  brangdon@cix.co.uk wrote:
> sirwillard@my-deja.com () wrote:
> > I hope this was just a guess, and not the true reason.
>
> They were all guesses. I wasn't anywhere near the committee.
>
> > First of all, to design for something "that might be added by a future
> > version" is a slippery slope to begin with.  Secondly, if anyone is
> > confused as to what the behavior of operator< should be (it seems
> > obvious to me) then the standard and all other documentation will
> > clearly explain things.
>
> It's not obvious to me. In fact, it means almost nothing; before() is an
> arbitrary ordering that might not even be consistent from one run to
> another. When I first read the description I was somewhat surprised.

What you are saying implies that the "obvious meaning would be that of
the relationship within an inheritance tree".  This obviously doesn't
work, since not all types are part of an inheritance tree.  The
behaviour of "before" most certainly seems obvious to me, including the
fact that the results may be different between runs.  Even still, this
isn't the "obvious" behavior I was referring to.  What seems even more
obvious to me is that were type_info to contain both a "before" as well
as a "parent_of" or "subclass_of" or what ever, that operator< would
behave as "before".

> The point is not so much what might be added in future, but what
> expectations a programmer might reasonably have. Good design should
> minimise surprise. Operator overloading should really be used when the
> meaning is obvious.

Which it is.  In this case it gives a collating order to types, which
reasonably can only be gauranteed for a single run.

> Documentation can help, of course, but I don't think good documentation is
> an excuse for bad design. Even as it is, the documentation has to take
> pains to disabuse our reasonable expectations.

I would agree with this, but this is a case where only the
documentation can prevent misunderstandings.  It's already been done
for "before" and would be no different for "operator<".

> > > I would think any implementor who was able to provide a good
> > > hashing value would also be able to provide unique names.
> >
> > This doesn't follow logically.  First of all, on MANY platforms,
> > type_info is defined by the compiler vendor while the container classes
> > (including the theoretical STL extension, hash_map) are provided by a
> > third party.  In these cases it's not only possible, but likely, that
> > although string hashing is available in the library, type_info::name
> > does not provide a unique string.
>
> That's true but irrelevant. You asked:
>
>    Why was there no thought put into providing some state within
>    type_info that could be portably used for hashing functions?
>
> The name would be the obvious state to use.

Actually, no it wouldn't.  It's the obvious state to try and use when
no other state is included, which is why we have so many people trying
to use name today.  However, this is an "after design" thought
process.  During the design of type_info, name would have been the last
choice for several reasons:

1.  In order to gaurantee unique name strings the string must become
much larger than is reasonable to store.  The mangled names (and
remember that name would be reasonable to return unmangled) for
template types already cause warnings in one known compiler because it
grew larger than 256 bytes.  This is an extreme waste of space just to
gaurantee uniqueness.
2.  String hashing algorithms are expensive, especially as the length
of the string grows (see 1 above).  Further, it's impossible to
gaurantee a perfect hash, which is desirable if not required.
3.  The "collation" ordering required by "before" neatly aligns itself
with a method for providing a true hash with little expense.  In other
words, a simple integer type can be used, which is incremented with
each type added to the system.  This integer is a "perfect hash", as
well as well suited for implementing the collation ordering.

I'm sure there's all kinds of issues with #3 that would need to be
worked out, but it's still a better answer than using the name as the
hashing state.

> Although it's true that we
> cannot use the name today because they are not guaranteed unique, it's
> surely obvious that the committee thought about making them unique.

Again, I don't think this is obvious.  I'd also be quite surprised if
they did consider this.

> They
> decided not to for some compelling reason. I suggest the same reason would
> apply to providing any hashable state.

I already gave the most reasonable reason and it most definately can't
be applied as a reason to not provide any hashing state.

> Providing any other hashable state
> would be no easier than providing the name.

Not at all true.

> We both agree that the standard is not very helpful to programmers who
> want to use hashing. My preferred fix would be to require unique strings
> (since the name() interface is already in place and names would be useful
> in other ways).

I hope I've convinced you to change your opinion here.

> Your preferred fix, apparently, is that names should be
> left as they are and more state added just for hashing. I don't see what
> advantage that approach has.

Actually, for most implementations I would think it would just be a
matter of exposing the state that already exists to provide
functionality for "before".  The only real change is a limitation as to
what this state can be.

> > As for the conventions that might be chosen in the future...
> > those conventions would not be compromised by the inclusion of state
> > info that could be hashed.  I don't believe that type_info was "right
> > to avoid pre-empting those decisions" because of this.
>
> What state info do you have in mind?
>
> It seems to me that either you have something equivalent to a string or
> else you use opaque state accessed through a hash function.
>
> "Equivalent to a string" in that we don't know how many bits the
> implementation will need, so it must be variable length. A variable length
> array of char would be fine. Perhaps we could have name() and
> hashable_name(), where the latter is guaranteed unique and the former
> isn't. But that's silly; let's just place more requirements on name() and
> use it for both tasks.

No, let's not use strings for hashing at all if we can, thank you very
much.  I've given two compelling reasons for this:  memory requirements
and the speed problems with string hashing functions.

> If instead you intend opaque state accessed through a hash function, you
> have to decide what this function will be called and what type it returns.
> This opens up the hashing standards can of worms.
>
> Is there some 3rd approach I've missed? What would you have had the
> committee actually do?

I'm not familiar with the "hashing standards can of worms" you are
referring to.  Maybe we differ in opinion because of this.  I've yet to
see an algorithm yet that didn't reduce a hash down to an integral
type, which is the obvious thing to me.

Sent via Deja.com http://www.deja.com/
Before you buy.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: scorp@btinternet.com (Dave Harris)
Date: 1999/12/07 Raw View

sirwillard@my-deja.com () wrote:
> The behaviour of "before" most certainly seems obvious to me [...]

What is "obvious" will of course be subjective and it's not too surprising
if we differ. We really need someone from the committee to comment on
whether this was a factor at all.

> 3.  The "collation" ordering required by "before" neatly aligns itself
> with a method for providing a true hash with little expense.  In other
> words, a simple integer type can be used, which is incremented with
> each type added to the system.  This integer is a "perfect hash", as
> well as well suited for implementing the collation ordering.

It seems to me that this requires some kind of uniqueness to work. If the
same type is introduced independently by several different DLLs, you need
to somehow avoid incrementing the counter for the second and subsequent
DLLs.

So I think something like the long unique bit-patterns must be involved
somewhere along the line to make hashing work. Given that, it seems the
simplest step is to make them available. Adding an integer field is an
optimisation. I'd be quite glad if it was there but it's not fundamental.

Contrariwise, if you have a perfect hash you can make a unique string
directly from it. A hex representation of a smallish integer could be
smaller than the integer itself. The two seem to me to be more or less
equivalent. If you can have one you can have the other.

> > Although it's true that we
> > cannot use the name today because they are not guaranteed unique, it's
> > surely obvious that the committee thought about making them unique.
>
> Again, I don't think this is obvious.  I'd also be quite surprised if
> they did consider this.

Then I have more faith in the committee than you :-)

> Actually, for most implementations I would think it would just be a
> matter of exposing the state that already exists to provide
> functionality for "before".  The only real change is a limitation as to
> what this state can be.

Yes... a limitation I dislike. If we're to promote hashing, than I think
I'd prefer an opaque get_hash() function. It could use counters if
convenient, or the name, or some internal representation of the type that
happened to be in use anyway - whatever suited the implementation.

> I'm not familiar with the "hashing standards can of worms" you are
> referring to.  Maybe we differ in opinion because of this.  I've yet to
> see an algorithm yet that didn't reduce a hash down to an integral
> type, which is the obvious thing to me.

It just involves a number of decisions. Which integral type? int, unsigned
int, long, unsigned long, size_t, a new hash_t are all reasonable choices.
What name should the function have? hash(), get_hash(), hash_value()...
Should it be a member function? Whatever choices we make should be
compatible with std::hash_map<>, which is a template which hasn't been
designed yet. It's all work for the committee to thrash out and reach a
consensus on.

  Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
      brangdon@cix.co.uk      |   And close your eyes with holy dread,
                              |  For he on honey dew hath fed
 http://www.bhresearch.co.uk/ |   And drunk the milk of Paradise."

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Michiel Salters <salters@lucent.com>
Date: 1999/12/07 Raw View

Bill Wade <bill.wade@stoner.com> wrote:

> Complain about [hashing] if you will, but first raise a stink about being unable
> to hash pointers (of any type) in either standard C or C++.  Most
> non-trivial programs have many more objects than types.  I'd be interested
> to hear about platforms where
>    pointer == pointer
> and
>   less(pointer,pointer)
> can be efficiently implemented but
>   hash(pointer)
> is expensive.

What do you mean, being unable to hash pointers? AFAIK, there is
no built-in C/C++ hash, and if you are rolling your own, one
can be added just as easily for pointers. The hashed values won't be
portable, which is fine since the pointers values themselves
weren't to begin with. Other than that, I think it is perfectly
portable.

Yes, you *will* need to cast the pointer to an (array of)
arithmetic type, but the essence of hashing is bit juggling.
Besides, you're only using the hash for comparisons.

Michiel Salters
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "TiTi" <Tha.Main.Man@Fraggin.Bastard.Com>
Date: 1999/12/07 Raw View

> process.  During the design of type_info, name would have been the last
> choice for several reasons:
> 1.  In order to gaurantee unique name strings the string must become
> much larger than is reasonable to store.  The mangled names (and
> remember that name would be reasonable to return unmangled) for
> template types already cause warnings in one known compiler because it
> grew larger than 256 bytes.

This is (AFAIK) only in debug-mode for MSVC compilers. I doubt if the same
occurs for release mode.



> 2.  String hashing algorithms are expensive, especially as the length
> of the string grows (see 1 above).  Further, it's impossible to
> gaurantee a perfect hash, which is desirable if not required.



It is possible to construct hashkey algorithms for strings which are
calculated in constant time (e.g. only use N chars from M char string
(0<N<=M), hashsize is HS, I'm just trying something here:


    unsigned hashkey(const char* pChar, int size)
    {
        if(size<=0)
            return 0;
        unsigned index = 0, result = 0, to_where = MIN( N, size );
        for(int i = 0; i < to_where; i++)
        {
            result = ( (result << 5) + ( (unsigned) pChar[index] ) ) % HS;
            index = ( ( index << 5 ) + ( (unsigned) pChar[index] ) ) %
to_where;
        }
        return result;
    }


Anyway, I'm not telling that this is an ideal hash algorithm, but it's
calculated in constant time however.



> 3.  The "collation" ordering required by "before" neatly aligns itself
> with a method for providing a true hash with little expense.  In other
> words, a simple integer type can be used, which is incremented with
> each type added to the system.  This integer is a "perfect hash", as
> well as well suited for implementing the collation ordering.



Indeed, both for hash tables and for trees, this would come in handy.



> No, let's not use strings for hashing at all if we can, thank you very
> much.  I've given two compelling reasons for this:  memory requirements
> and the speed problems with string hashing functions.




Depends in what system you work on (speeds + memory), and what the algorithm
is you're working on(!). In some algorithms, you can't just store things in
a simple list, or you can wait for days to get some results (O(n^2) or
worse). And calculating a hash key can be done faster than traversing a
whole list to find the item (which still needs comparison!), especially if
we're talking about lots 'o' data.




TiTi
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "TiTi" <Tha.Main.Man@Fraggin.Bastard.Com>
Date: 1999/12/07 Raw View

A little error in prev. post:

    unsigned hashkey(const char* pChar, int size)
    {
        if(size<=0)
            return 0;
        unsigned index = 0, result = 0, to_where = MIN( N, size );
        for(int i = 0; i < to_where; i++)
        {
            result = ( (result << 5) + ( (unsigned) pChar[index] ) ) % HS;
            index = ( ( index << 5 ) + ( (unsigned) pChar[index] ) ) % size;
        }
        return result;


Ciao
TiTi
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: phalpern@newview.org (Pablo Halpern)
Date: 1999/12/07 Raw View

scorp@btinternet.com (Dave Harris) wrote:

>
>sirwillard@my-deja.com () wrote:
>
>> 3.  The "collation" ordering required by "before" neatly aligns itself
>> with a method for providing a true hash with little expense.  In other
>> words, a simple integer type can be used, which is incremented with
>> each type added to the system.  This integer is a "perfect hash", as
>> well as well suited for implementing the collation ordering.
>
>It seems to me that this requires some kind of uniqueness to work. If the
>same type is introduced independently by several different DLLs, you need
>to somehow avoid incrementing the counter for the second and subsequent
>DLLs.
>
>So I think something like the long unique bit-patterns must be involved
>somewhere along the line to make hashing work. Given that, it seems the
>simplest step is to make them available. Adding an integer field is an
>optimisation. I'd be quite glad if it was there but it's not fundamental.

No, a simple integer counter would work. If type_info provided a hash(),
function, the hash counter could be computed at run-time. Different DLLs
would share the same hash counter for each type. No long-bit-strings
would be needed.

>Contrariwise, if you have a perfect hash you can make a unique string
>directly from it. A hex representation of a smallish integer could be
>smaller than the integer itself. The two seem to me to be more or less
>equivalent. If you can have one you can have the other.

Yes, but it seems reasonable that an implementation might provide a
name() function that returns meaningful but non-unique name (e.g. an
abbreviated name without namespaces, cv qualifiers, or even template
parameters). By requiring the hash to be built on the name, you require
that the name either be a meaningless hex number or a potentially very
long meaningful name.

I wish that the long, complete name were required, because they make it
easier to build persistence and network-object mechanisms. But long
names take a long time to hash, so it would also be nice to have a fast
hashing function.

>> Actually, for most implementations I would think it would just be a
>> matter of exposing the state that already exists to provide
>> functionality for "before".  The only real change is a limitation as to
>> what this state can be.
>
>Yes... a limitation I dislike. If we're to promote hashing, than I think
>I'd prefer an opaque get_hash() function. It could use counters if
>convenient, or the name, or some internal representation of the type that
>happened to be in use anyway - whatever suited the implementation.

We're in agreement here.

>> I'm not familiar with the "hashing standards can of worms" you are
>> referring to.  Maybe we differ in opinion because of this.  I've yet to
>> see an algorithm yet that didn't reduce a hash down to an integral
>> type, which is the obvious thing to me.
>
>It just involves a number of decisions. Which integral type? int, unsigned
>int, long, unsigned long, size_t, a new hash_t are all reasonable choices.
>What name should the function have? hash(), get_hash(), hash_value()...
>Should it be a member function? Whatever choices we make should be
>compatible with std::hash_map<>, which is a template which hasn't been
>designed yet. It's all work for the committee to thrash out and reach a
>consensus on.

Yeah, well, since neither hash_map nor type_info::hash are in today's
standard, it seems reasonable but not imperative that they be
coordinated in a future standard. Remember that the hash<> function
object could just be specialized for type_info if the hash types are not
already identical. As far as picking a type, the above choices are
indeed reasonable. Picking from a set of reasonable choices doesn't have
to be a can of worms. The committee could just pick one and get on with
life.

---------------------------------------------------------------
Pablo Halpern                              phalpern@newview.org
Check out my new book, The C++ Standard Library from Scratch at
http://www.halpernwightsoftware.com/stdlib-scratch

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Francis Glassborow <francis@robinton.demon.co.uk>
Date: 1999/12/07 Raw View

In article <384BBE4F.ABDC393C@lucent.com>, Michiel Salters
<salters@lucent.com> writes
>What do you mean, being unable to hash pointers?

Well there is a small problem, many distinct bit patterns can represent
the same pointer (consider the way pointers work in segmented
architectures) so the relationship between pointers and addresses is
many:one.  How do you propose to provide your unique hash?



Francis Glassborow      Journal Editor, Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA          +44(0)1865 246490
All opinions are mine and do not represent those of any organisation


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: sirwillard@my-deja.com
Date: 1999/12/07 Raw View

In article <memo.19991206234642.50639A@btinternet.com>,
  brangdon@cix.co.uk wrote:
>
> sirwillard@my-deja.com () wrote:
> > The behaviour of "before" most certainly seems obvious to me [...]
>
> What is "obvious" will of course be subjective and it's not too surprising
> if we differ. We really need someone from the committee to comment on
> whether this was a factor at all.

Whether it's obvious or not is really irrelevant.  The behavior of
before is really the only reasonable behavior it could have.  Other
behaviors may seem more obvious, but when thought through you'll find
they don't really work.

> > 3.  The "collation" ordering required by "before" neatly aligns itself
> > with a method for providing a true hash with little expense.  In other
> > words, a simple integer type can be used, which is incremented with
> > each type added to the system.  This integer is a "perfect hash", as
> > well as well suited for implementing the collation ordering.
>
> It seems to me that this requires some kind of uniqueness to work.

Obviously.

> If the
> same type is introduced independently by several different DLLs, you need
> to somehow avoid incrementing the counter for the second and subsequent
> DLLs.

This is already dealt with by implementors today.  The integer ID
doesn't have to be assigned at compile time.  This can be done at run
time, delaying the assignment until the first time it's requested.
This makes the "uniqueness" specification easy.

> So I think something like the long unique bit-patterns must be involved
> somewhere along the line to make hashing work. Given that, it seems the
> simplest step is to make them available. Adding an integer field is an
> optimisation. I'd be quite glad if it was there but it's not fundamental.

Types are already distinct today, regardless of whether or not the
library is statically linked or dynamically loaded.  The techniques
used to make this so are all that's needed.  "Long unique bit-patterns"
are not needed.  A simple integral type is all we need, with the value
assigned at run time not compile time.  The only concern is having
enough values available in the integral type to represent all types
that occur (or at least that are queried) within a run.  I suppose the
standard might define an arbitrarily long bit sequence instead of an
integral type for this reason, though this seems like over kill.

> Contrariwise, if you have a perfect hash you can make a unique string
> directly from it. A hex representation of a smallish integer could be
> smaller than the integer itself. The two seem to me to be more or less
> equivalent. If you can have one you can have the other.

This is quite different than your original proposal to have unique name
strings.  The "name" is human readable, while what you are suggesting
is simply an unbounded integral type.

> > > Although it's true that we
> > > cannot use the name today because they are not guaranteed unique, it's
> > > surely obvious that the committee thought about making them unique.
> >
> > Again, I don't think this is obvious.  I'd also be quite surprised if
> > they did consider this.
>
> Then I have more faith in the committee than you :-)

How do you figure?  Although the standard doesn't define precisely what
name should return, it's implied that it will be a human readable
representation of what the class is.  Requiring it to be unique would
mean that this would become a quite lengthy string, and would mean a
lot of code bloat just for RTTI.  I have a LOT of faith that even if
this were considered by the committee, it would have been shot down as
soon as it was brought up.  However, we are both second guessing the
committee.

> > Actually, for most implementations I would think it would just be a
> > matter of exposing the state that already exists to provide
> > functionality for "before".  The only real change is a limitation as to
> > what this state can be.
>
> Yes... a limitation I dislike. If we're to promote hashing, than I think
> I'd prefer an opaque get_hash() function. It could use counters if
> convenient, or the name, or some internal representation of the type that
> happened to be in use anyway - whatever suited the implementation.

I'm sorry, but a get_hash() function is *precisely* what I'm talking
about.  I surely hope you didn't think I was requesting the standard to
define a public int to be part of the standard interface for
type_info.  There should always be accessor functions to state
information, not direct access.  The limitation I was referring to was
that the state info must some how be translatable into a unique
integral type.  This is a stronger limitation than currently exists,
but it's not an unreasonable one.

> > I'm not familiar with the "hashing standards can of worms" you are
> > referring to.  Maybe we differ in opinion because of this.  I've yet to
> > see an algorithm yet that didn't reduce a hash down to an integral
> > type, which is the obvious thing to me.
>
> It just involves a number of decisions. Which integral type? int, unsigned
> int, long, unsigned long, size_t, a new hash_t are all reasonable
choices.
> What name should the function have? hash(), get_hash(), hash_value ()...
> Should it be a member function? Whatever choices we make should be
> compatible with std::hash_map<>, which is a template which hasn't been
> designed yet. It's all work for the committee to thrash out and reach a
> consensus on.

*sighs*  This is the dressing, not the heart of the matter.  Of course
there are numerous design issues that would have to be discussed by the
committe after the choice to include hashing support, but these
decisions are really not that difficult or earth shattering.  If you
want a recommendation, make the type a hash_t value, so that
implementors can choose the integral type most appropriate (i.e.
provides the most values) for their platform and call the accessor
hash_value.

Sent via Deja.com http://www.deja.com/
Before you buy.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "AllanW" <AllanW@my-deja.com>
Date: 1999/12/07 Raw View

> In article <384BBE4F.ABDC393C@lucent.com>, Michiel Salters
> <salters@lucent.com> writes
> >What do you mean, being unable to hash pointers?

Francis Glassborow <francis@robinton.demon.co.uk> wrote in message
news:7QYbIRArmTT4Ew2J@robinton.demon.co.uk...
> Well there is a small problem, many distinct bit patterns can represent
> the same pointer (consider the way pointers work in segmented
> architectures) so the relationship between pointers and addresses is
> many:one.  How do you propose to provide your unique hash?

AFAIK, any system that has a many-to-one relationship between pointers
and addresses also has some concept that I'll call "normalization" which
is already used when comparing two pointers for equality. This is by
definition platform-dependant, so we can't give out algorithms, but we
can give an example:

    Change the bit pattern so that when it is interpreted as a pointer, the
    address does not change, but when it is interpreted as an integer, the
    value is lower. Continue until this change is no longer possible, and
    then the bit pattern represents the normalized value.

Naturally, for any real platform there will be equivalent algorithms that
do not involve iteration.



[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: sirwillard@my-deja.com
Date: 1999/11/30 Raw View

In article <memo.19991126195718.59965A@btinternet.com>,
  brangdon@cix.co.uk wrote:
>
> sirwillard@my-deja.com () wrote:
> > 1.  Useage of type_info within an associative container is so common
> > that the designers went the extra mile to allow it by including
> > type_info::before for collating, yet they didn't design the class
with
> > copy construction capabilities.  There are techniques that would
have
> > allowed this and gauranteed that slices would not have been created.
> > The simplest of these is the "pimpl" idiom (which is basically what
my
> > adaptor does).  Why wasn't this done?
>
> Presumably a matter of not paying for what you don't use. "before()"
ought
> to be trivial to implement. If a vendor can at least manage unique
names
> they can use those to provide the ordering, which has minimal cost.
Using
> "pimpl" has a cost per type_info.

I can understand the logic here, but I don't buy it.  The cost per
instance is too trivial to be important, and to use type_info in a way
clearly provided for (by the existance of type_info::before) you must
resort to the "pimpl" method or provide your own predicate for collated
containers.  I just feel that the "cost" is too small to provide a
design that only goes half way.

> > 2.  If collation capabilities were so important that it was added to
> > the design, why were natural collation operations not provided?  Why
> > type_info::before instead of type_info::operator<?
>
> Perhaps because of potential confusion with type_info::sub_class_of
().

I hope this was just a guess, and not the true reason.

> Although sub-class testing isn't yet part of type_info, it's
something
> that might be added by a future version of the standard, and possibly
as a
> non-portable extension today. If we have:
>      type_info::before();
>      type_info::sub_class_of();
>      type_info::operator<();
>
> users might be confused over which operator<() actually did.

First of all, to design for something "that might be added by a future
version" is a slippery slope to begin with.  Secondly, if anyone is
confused as to what the behavior of operator< should be (it seems
obvious to me) then the standard and all other documentation will
clearly explain things.

> > 3.  The C++ language has a boolean type, bool, yet the native class
> > type_info uses int returns instead.  Why?
>
> This one looks like a mistake to me.

Yes, someone else pointed it out.  I'm gleaning my info about the
standard from usenet postings and my libraries documentation rather
than from the standard itself.  I know I shouldn't be doing this, but I
just have not justified the expense of obtaining the standard yet.  In
any event, my vendors implementation has this one wrong, so I made the
wrong assumption.

> > 4.  Why was there no thought put into providing some state within
> > type_info that could be portably used for hashing functions?  I
realize
> > this would have been more complex, but surely it would be possible.
>
> Use type_info::name(). I would think any implementor who was able to
> provide a good hashing value would also be able to provide unique
names.

This doesn't follow logically.  First of all, on MANY platforms,
type_info is defined by the compiler vendor while the container classes
(including the theoretical STL extension, hash_map) are provided by a
third party.  In these cases it's not only possible, but likely, that
although string hashing is available in the library, type_info::name
does not provide a unique string.  This is a red herring suggestion and
is just as flawed as the suggestion to use type_info::name as the key
to a normal map.

> Recall also that the STL doesn't provide hashing anywhere, as far as
I can
> see. Type_info is being consistent with that. When hashing gets
added, as
> I hope it will be in 5 or 10 years time, it will presumably come with
a
> bunch of conventions about the name of the hash function, the type of
the
> hashed value, etc. Type_info is right to avoid pre-empting those
> decisions.

I'm fully aware that the STL (or more specifically, the C++ standard)
defines no hashing capabilities.  However, hashing is a common
programming technique.  Whether or not the library provides direct
hashing capabilities seems irrelevant to the decision about whether or
not RTTI should provide state info consistent with the implementation
of hashing algorithms.  As for the conventions that might be chosen in
the future... those conventions would not be compromised by the
inclusion of state info that could be hashed.  I don't believe that
type_info was "right to avoid pre-empting those decisions" because of
this.

Sent via Deja.com http://www.deja.com/
Before you buy.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: kanze@gabi-soft.de
Date: 1999/11/30 Raw View

"TiTi" <TiTi@mad.scientist.com> writes:

|>  > Recall also that the STL doesn't provide hashing anywhere, as far
|>  >as I can see. Type_info is being consistent with that.  When
|>  >hashing gets added, as I hope it will be in 5 or 10 years time, it
|>  >will presumably come with a bunch of conventions about the name of
|>  >the hash function, the type of the hashed value, etc. Type_info is
|>  >right to avoid pre-empting those decisions.

|>  Are you out of your mind? 5 to 10 years, for something that should
|>  have been in in the first place?

5 to 10 years, because that is when the next version of the C++ standard
will appear, and until then, things can't be added.

|>  Then I don't think that you use it
|>  very often. I used it in Java, SmallTalk & C++, over the last 5
|>  years, and I could not have done things so fast without 'em. 5 to 10
|>  years? I can't believe it!!

Nothing is stoping you from using hashing today, in C++.  I use it
regularly as well, in C++ and Java.

--
James Kanze   +49 (069) 63 19 86 27   mailto:kanze@gabi-soft.de
Beratung in objektorientierter Datenverarbeitung --
                     -- Conseils en informatique orient   e objet
Ziegelh   ttenweg 17a, 60598 Frankfurt, Germany
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "James Kuyper Jr." <kuyper@wizard.net>
Date: 1999/11/30 Raw View

TiTi wrote:
....
> >When hashing gets added, as I hope it will be in 5 or 10 years time, it will presumably come with a
> > bunch of conventions about the name of the hash function, the type of the
> > hashed value, etc. Type_info is right to avoid pre-empting those
> > decisions.
>
> Are you out of your mind? 5 to 10 years, for something that should have been
> in in the first place? Then I don't think that you use it very often. I used
> it in Java, SmallTalk & C++, over the last 5 years, and I could not have
> done things so fast without 'em. 5 to 10 years? I can't believe it!!

Sorry - but that's the way ISO standardization proceedures are set up.
For about 5 years, the only thing that it's appropriate for the
committee to worry about is defects in the standard; that period is
meant to encourage people to spend the time to learn how to use the
current version of C++, rather than trying to fix each and every
possible problem with a new language feature. Then another 5 years are
allowed to fully work out the details of the changes for the next
release of the standard. The last release was in mid-1998; expect the
next one sometime around 2008.

In the meantime, many existing implementations have hashing containers
as extensions. Learn how they work, gain experience with them, and bring
that experience into the discussions for the next release of the
standard.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: scorp@btinternet.com (Dave Harris)
Date: 1999/11/30 Raw View

TiTi@mad.scientist.com (TiTi) wrote:
> Are you out of your mind? 5 to 10 years, for something that should have
> been in in the first place?

I didn't comment on whether it should have been there in the first place.
Given that it wasn't, 5 years is about the earliest such a change can be
considered.


> Then I don't think that you use it very often. I used it in Java,
> SmallTalk & C++, over the last 5 years, and I could not have
> done things so fast without 'em. 5 to 10 years? I can't believe it!!

I use it a lot when it's available, but I can get by with the various
kinds of binary search which the STD does provide. A map is a map and I
rarely care how it is implemented.

  Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
      brangdon@cix.co.uk      |   And close your eyes with holy dread,
                              |  For he on honey dew hath fed
 http://www.bhresearch.co.uk/ |   And drunk the milk of Paradise."
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Steve Clamage <stephen.clamage@sun.com>
Date: 1999/12/01 Raw View

TiTi wrote:
>
> > Recall also that the STL doesn't provide hashing anywhere, as far as I can
> see. Type_info is being consistent with that.
> >When hashing gets added, as I hope it will be in 5 or 10 years time, it will presumably come with a
> > bunch of conventions about the name of the hash function, the type of the
> > hashed value, etc. Type_info is right to avoid pre-empting those
> > decisions.
>
> Are you out of your mind? 5 to 10 years, for something that should have been
> in in the first place? Then I don't think that you use it very often. I used
> it in Java, SmallTalk & C++, over the last 5 years, and I could not have
> done things so fast without 'em. 5 to 10 years? I can't believe it!!

First of all, there is no relationship between type_info and the
STL containers and algorithms. The type_info class is not a
container in the STL sense.

Second, type_info was deliberately not designed to support general
type queries of the kind available in, for example, Smalltalk.

Third, the original STL did not contain any hash tables. There
was a proposal to add hash tables, but it was submitted far too
late to get it into the standard.

Since the standard is only 18 months old, it will be some time before
a new version will worked on. It took 10 years for a new version of
the C standard, for example.

But nothing prevents hash table classes and algorithms from being
developed between now and whenver the C++ standard is next revised.
Ideally, new features will have been in use and bugs worked out so
that the next standard can simply codify them.

--
Steve Clamage, stephen.clamage@sun.com
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "James Kuyper Jr." <kuyper@wizard.net>
Date: 1999/12/01 Raw View

sirwillard@my-deja.com wrote:
....
> Yes, someone else pointed it out.  I'm gleaning my info about the
> standard from usenet postings and my libraries documentation rather
> than from the standard itself.  I know I shouldn't be doing this, but I
> just have not justified the expense of obtaining the standard yet.  In

It's only $18 (in the US, anyway). You can easily  waste more time
searching your alternate sources than it would take to earn that much
money, particularly if you're currently working as a C++ programmer.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: fjh@cs.mu.OZ.AU (Fergus Henderson)
Date: 1999/12/01 Raw View

"James Kuyper Jr." <kuyper@wizard.net> writes:

>sirwillard@my-deja.com wrote:
>....
>> I just have not justified the expense of obtaining the standard yet.
>
>It's only $18 (in the US, anyway).

It's now available from www.standards.com.au:
the price is AUD $437, which works out to be about USD $275.

--
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>  |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3        |     -- the last words of T. S. Garp.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "J.Barfurth" <techview@bfk-net.de>
Date: 1999/11/27 Raw View

TiTi <Serra.Angel@Serra.Shrine> schrieb in im Newsbeitrag:
81g9vn$hmq$1@trex.antw.online.be...

Someone wrote:
> > So you can't rely on the collating order of the pointer within the
> > map.  Instead, we'll have to use the collation order of the type_info
> > object itself.  A logical idea would be to simply specialize
> > std::less<type_info*>.  In fact, most of the people who realized you
> > couldn't use the pointer collation suggested doing just this.  However,
> > in one thread it was pointed out that according to the standard this
> > would produce "undefined behavior", specifically because the standard
> > library impementors are free to provide their own specialization for
> > any built in types, and if they did you'd violate the "one definition
> > rule".  So this is out.
> > I could have resorted to defining a new less_type functor, which would
> > have worked.  However, this would require all uses of type_info within
> > containers with collation ordering to provide an extra parameter to the
> > template.  Instead of a map<type_info*, T> you'd need a map<type_info*,
> > T, less_type>.  Although this isn't that bad, I'd prefer not doing this.

> With some map that is a hashed map, you could use
>
>     template <class key>
>     unsigned HashKey(key _key) {return static_cast<unsigned>(key);} //
just
[...]
>     template<> unsigned HashKey<type_info*>(const type_info* ti)

You missed the point, I think.

He want's to do it portably, using standard containers.

If, in some future version of the standard, your HashKey (and the hashed
map) become part of the Standard C++ Library, the same restrictions would
apply: type_info* is considered as "type, compounded from built in types" (I
think the wording is similar); therefore user code must not specialize a
standard template for it.

>      {
>                 // use "ti->name" property (or other properties) to calc
hash key.
>         The name is (I hope) unique.
>     }

As the OP also mentioned, the name cannot be used. It is not guaranteed to
be unique. As Ms. Bonnard has pointed out in another thread
    char const* typeinfo::name() const { return ""; }
would be a conforming implementation.

OTOH, because of this (neither are there any other suitable properties of
typeinfo) you couldn't implement HashKey<type_info> or HashKey<type_info*>
in a meaningful way in user code. So I'd expect the implementation to
provide one. Alas, they don't provide the obvious specialization for
std::less<type_info*> :-(

-- J   rg Barfurth

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "J.Barfurth" <techview@bfk-net.de>
Date: 1999/11/27 Raw View


<sirwillard@my-deja.com> schrieb in im Newsbeitrag:
81ecdl$pq$1@nnrp1.deja.com...

> struct type_info_key
> {
>    type_info_key(const type_info& type) : type(&type) { }
>    bool operator==(const type_info_key& other) const {
>       return !!type->operator==(*other.type);
>    }
>    bool operator<(const type_info_key& other) const {
>       return !!type->before(*other.type);
>    }
>    const type_info* type;
> };

What makes you use all those "!!" ?

Ah, I see:

> 3.  The C++ language has a boolean type, bool, yet the native class
> type_info uses int returns instead.  Why?

My PDF copy of the standard uses bool as the return type of
type_info::before() and type_info::operator==.

 -- J   rg Barfurth




[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: sirwillard@my-deja.com
Date: 1999/11/27 Raw View

In article <81g9vn$hmq$1@trex.antw.online.be>,
  "TiTi" <Serra.Angel@Serra.Shrine> wrote:
> Hi there,
>
> > I needed type_info to implement a class factory, something I'm sure
> > we've all done before.
>
> Nope never wrote one, I'm more of an "object factory"-kinda guy.
Although I
> would love to see a class factory some time (spawning classes that
could get
> compiled @ run time, seems tricky).

*sighs*  The factory does indead create objects, not classes.

> > The first thought I had was to use type_info::name, making the key
> > human readable.  The problem is that the standard allows an
> > implementation to return anything for this method, with no gaurantee
> > that the name will be unique for each type.  So the name is out.
>
> How typical.

Between the previous comment and this one, it sure sounds like you're
trying to call into question my own competence.

> > The next thought was to use the type_info itself in the map.  The
> type_info
> > class does not have copy construct capabilities.  So, we try storing a
> pointer
> > to one instead.
>
> > So you can't rely on the collating order of the pointer within the
> > map.  Instead, we'll have to use the collation order of the type_info
> > object itself.  A logical idea would be to simply specialize
> > std::less<type_info*>.  In fact, most of the people who realized you
> > couldn't use the pointer collation suggested doing just this.  However,
> > in one thread it was pointed out that according to the standard this
> > would produce "undefined behavior", specifically because the standard
> > library impementors are free to provide their own specialization for
> > any built in types, and if they did you'd violate the "one definition
> > rule".  So this is out.
> > I could have resorted to defining a new less_type functor, which would
> > have worked.  However, this would require all uses of type_info within
> > containers with collation ordering to provide an extra parameter to the
> > template.  Instead of a map<type_info*, T> you'd need a map<type_info*,
> > T, less_type>.  Although this isn't that bad, I'd prefer not doing this.
>
> With some map that is a hashed map, you could use
>
>     template <class key>
>     unsigned HashKey(key _key) {return static_cast<unsigned> (key);} // just
> some stupid code
>
>     template <class key, class value>
>     class SomeHashedMap {// ... };
>
>     class Value {};
>
>     template<> unsigned HashKey<type_info*>(const type_info* ti)
>     {
>         // use "ti->name" property (or other properties) to calc hash key.
> The name is (I hope) unique.
>     }

First you seem to ridicule my thought of using name, then you turn
around and do just that.  Let me reiterate again:  you can not use name
or raw_name or the type_info address via pointer or reference, because
none of them are gauranteed to be unique by the standard.  In fact, in
most implementations I've dealt with they are not unique values.  There
is *NO* unique value within type_info that can be used for collation,
short of type_info itself, which is non-copyable.

> > So, instead I resort to an adapter.
> > struct type_info_key
> > {
> >    type_info_key(const type_info& type) : type(&type) { }
> >    bool operator==(const type_info_key& other) const {
> >       return !!type->operator==(*other.type);
> >    }
> >    bool operator<(const type_info_key& other) const {
> >       return !!type->before(*other.type);
> >    }
> >    const type_info* type;
> > };
>
> Nice.

Not really.  I can point out several holes in this adapter.  The only
thing "nice" about it is that it will fix the problem at hand, i.e. it
will be insertable within std::map as a key.  This idiom would have
been nice only if it were the idiom employed by type_info in the first
place (where the true type information was in an implementation defined
class which is unusable by client code).  Adapters work, but they're
never perfect solutions, and in this case I question why type_info was
designed to require adaptation in the first place.

> > Now we can use std::map<type_info_key, T> (note that we no longer need
> > to use a pointer).  Not the most intuitive path to arrive at something
> > so simple, but that's what we have with the design and possible
> > implementations of type_info.  However, I'm left with several questions
> > about type_info's design now.
>
> Can't really answer any, but I agree. Someone should answer.

It was designed by committee, so likely there is no "answer".  However,
I hope to generate some discussion about it, and possibly a revision or
addition to the standard in the future.

Sent via Deja.com http://www.deja.com/
Before you buy.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: scorp@btinternet.com (Dave Harris)
Date: 1999/11/27 Raw View

sirwillard@my-deja.com () wrote:
> 1.  Useage of type_info within an associative container is so common
> that the designers went the extra mile to allow it by including
> type_info::before for collating, yet they didn't design the class with
> copy construction capabilities.  There are techniques that would have
> allowed this and gauranteed that slices would not have been created.
> The simplest of these is the "pimpl" idiom (which is basically what my
> adaptor does).  Why wasn't this done?

Presumably a matter of not paying for what you don't use. "before()" ought
to be trivial to implement. If a vendor can at least manage unique names
they can use those to provide the ordering, which has minimal cost. Using
"pimpl" has a cost per type_info.

> 2.  If collation capabilities were so important that it was added to
> the design, why were natural collation operations not provided?  Why
> type_info::before instead of type_info::operator<?

Perhaps because of potential confusion with type_info::sub_class_of().
Although sub-class testing isn't yet part of type_info, it's something
that might be added by a future version of the standard, and possibly as a
non-portable extension today. If we have:
     type_info::before();
     type_info::sub_class_of();
     type_info::operator<();

users might be confused over which operator<() actually did.

> 3.  The C++ language has a boolean type, bool, yet the native class
> type_info uses int returns instead.  Why?

This one looks like a mistake to me.

> 4.  Why was there no thought put into providing some state within
> type_info that could be portably used for hashing functions?  I realize
> this would have been more complex, but surely it would be possible.

Use type_info::name(). I would think any implementor who was able to
provide a good hashing value would also be able to provide unique names.

Recall also that the STL doesn't provide hashing anywhere, as far as I can
see. Type_info is being consistent with that. When hashing gets added, as
I hope it will be in 5 or 10 years time, it will presumably come with a
bunch of conventions about the name of the hash function, the type of the
hashed value, etc. Type_info is right to avoid pre-empting those
decisions.

  Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
      brangdon@cix.co.uk      |   And close your eyes with holy dread,
                              |  For he on honey dew hath fed
 http://www.bhresearch.co.uk/ |   And drunk the milk of Paradise."

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "TiTi" <TiTi@mad.scientist.com>
Date: 1999/11/29 Raw View

> Recall also that the STL doesn't provide hashing anywhere, as far as I can
see. Type_info is being consistent with that.
>When hashing gets added, as I hope it will be in 5 or 10 years time, it will presumably come with a
> bunch of conventions about the name of the hash function, the type of the
> hashed value, etc. Type_info is right to avoid pre-empting those
> decisions.



Are you out of your mind? 5 to 10 years, for something that should have been
in in the first place? Then I don't think that you use it very often. I used
it in Java, SmallTalk & C++, over the last 5 years, and I could not have
done things so fast without 'em. 5 to 10 years? I can't believe it!!


TiTi
PS: you ruined my day (but it's only monday ...)




[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: smeyers@aristeia.com (Scott Meyers)
Date: 1999/11/24 Raw View

On 23 Nov 1999 16:35:38 GMT, sirwillard@my-deja.com wrote:
....
> different references.  A lot of the examples I found on DejaNews that
> used type_info within a map used the pointer, and in fact the errata
> for Item 31 of More Effective C++ by Scott Meyers recommends a simple
> std::map<std::pair<type_info*, type_info*>, HitFunctionPtr>.  So many
> people have gotten this one wrong.
....
> So you can't rely on the collating order of the pointer within the
> map.  Instead, we'll have to use the collation order of the type_info
> object itself.  A logical idea would be to simply specialize
> std::less<type_info*>.  In fact, most of the people who realized you
> couldn't use the pointer collation suggested doing just this.  However,
> in one thread it was pointed out that according to the standard this
> would produce "undefined behavior", specifically because the standard
> library impementors are free to provide their own specialization for
> any built in types, and if they did you'd violate the "one definition
> rule".  So this is out.

Just for the record, I don't think it's out for the design I sketch in
the MEC++ errata, because it's okay to specialize less for
pair<const type_info*, const type_info*>, which is the type I propose
putting into the map.  That is, I agree that you can't specialize std::less
for a pointer type, but I believe you can specialize it for a std::pair
containing pointer types.

Scott

--
Scott Meyers, Ph.D.                  smeyers@aristeia.com
Software Development Consultant      http://www.aristeia.com/
Visit http://meyerscd.awl.com/ to demo the Effective C++ CD

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "James Kuyper Jr." <kuyper@wizard.net>
Date: 1999/11/26 Raw View

Scott Meyers wrote:
....
> Just for the record, I don't think it's out for the design I sketch in
> the MEC++ errata, because it's okay to specialize less for
> pair<const type_info*, const type_info*>, which is the type I propose
> putting into the map.  That is, I agree that you can't specialize std::less
> for a pointer type, but I believe you can specialize it for a std::pair
> containing pointer types.

17.4.3.1 p1: "... Such a specialization ... results in undefined
behavior unless the declaration depends on a user-defined name of
external linkage ..."

I didn't see any user-defined names in your example.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "TiTi" <Serra.Angel@Serra.Shrine>
Date: 1999/11/26 Raw View

Hi there,


> I needed type_info to implement a class factory, something I'm sure
> we've all done before.


Nope never wrote one, I'm more of an "object factory"-kinda guy. Although I
would love to see a class factory some time (spawning classes that could get
compiled @ run time, seems tricky).


> The first thought I had was to use type_info::name, making the key
> human readable.  The problem is that the standard allows an
> implementation to return anything for this method, with no gaurantee
> that the name will be unique for each type.  So the name is out.


How typical.


> The next thought was to use the type_info itself in the map.  The
type_info
> class does not have copy construct capabilities.  So, we try storing a
pointer
> to one instead.

> So you can't rely on the collating order of the pointer within the
> map.  Instead, we'll have to use the collation order of the type_info
> object itself.  A logical idea would be to simply specialize
> std::less<type_info*>.  In fact, most of the people who realized you
> couldn't use the pointer collation suggested doing just this.  However,
> in one thread it was pointed out that according to the standard this
> would produce "undefined behavior", specifically because the standard
> library impementors are free to provide their own specialization for
> any built in types, and if they did you'd violate the "one definition
> rule".  So this is out.
> I could have resorted to defining a new less_type functor, which would
> have worked.  However, this would require all uses of type_info within
> containers with collation ordering to provide an extra parameter to the
> template.  Instead of a map<type_info*, T> you'd need a map<type_info*,
> T, less_type>.  Although this isn't that bad, I'd prefer not doing this.





With some map that is a hashed map, you could use

    template <class key>
    unsigned HashKey(key _key) {return static_cast<unsigned>(key);} // just
some stupid code

    template <class key, class value>
    class SomeHashedMap {// ... };

    class Value {};

    template<> unsigned HashKey<type_info*>(const type_info* ti)
    {
        // use "ti->name" property (or other properties) to calc hash key.
The name is (I hope) unique.
    }

    int main()
    {
        SomeHashedMap<type_info*, Value> someMap;
        // ...
        return 0;
    }





> So, instead I resort to an adapter.
> struct type_info_key
> {
>    type_info_key(const type_info& type) : type(&type) { }
>    bool operator==(const type_info_key& other) const {
>       return !!type->operator==(*other.type);
>    }
>    bool operator<(const type_info_key& other) const {
>       return !!type->before(*other.type);
>    }
>    const type_info* type;
> };



Nice.



> Now we can use std::map<type_info_key, T> (note that we no longer need
> to use a pointer).  Not the most intuitive path to arrive at something
> so simple, but that's what we have with the design and possible
> implementations of type_info.  However, I'm left with several questions
> about type_info's design now.



Can't really answer any, but I agree. Someone should answer.



TiTi














































---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: sirwillard@my-deja.com
Date: 1999/11/23 Raw View

I've recently done some research into type_info which has lead me to
several questions.  I'm going to walk through my explorations of this
type.  If anyone finds things within my analysis that are wrong, please
point them out to me.

I needed type_info to implement a class factory, something I'm sure
we've all done before.  It's something that should have been simple,
but while coming up with my final implementation I ran across several
surprising (to me at least) things.  In order to implement the class
factory I needed a std::map (or some other associative container) to
map from the type_info to a creation function.  This is where the fun
begins.

The first thought I had was to use type_info::name, making the key
human readable.  The problem is that the standard allows an
implementation to return anything for this method, with no gaurantee
that the name will be unique for each type.  So the name is out.

The next thought was to use the type_info itself in the map.  Most of
you can guess the problem I had with the first attempt.  The type_info
class does not have copy construct capabilities.  I understand why this
is... an implementation may use type_info derivatives and copy
constructing one would cause a slice.  In any event, storing one of
these things within a container is out.   So, we try storing a pointer
to one instead.

The standard gaurantees the reference returned by typeid will be valid
until the end of the program execution, so storing a pointer should be
safe.  However, a bit of research on DejaNews enlightens me to some of
the subtleties that exist in the standard.  The biggest of which is
that although you get a reference that's valid until the end of the
program, you may get a different reference every time you call typeid.
In fact, with dynamic link libraries you almost certainly will get
different references.  A lot of the examples I found on DejaNews that
used type_info within a map used the pointer, and in fact the errata
for Item 31 of More Effective C++ by Scott Meyers recommends a simple
std::map<std::pair<type_info*, type_info*>, HitFunctionPtr>.  So many
people have gotten this one wrong.

So you can't rely on the collating order of the pointer within the
map.  Instead, we'll have to use the collation order of the type_info
object itself.  A logical idea would be to simply specialize
std::less<type_info*>.  In fact, most of the people who realized you
couldn't use the pointer collation suggested doing just this.  However,
in one thread it was pointed out that according to the standard this
would produce "undefined behavior", specifically because the standard
library impementors are free to provide their own specialization for
any built in types, and if they did you'd violate the "one definition
rule".  So this is out.

I could have resorted to defining a new less_type functor, which would
have worked.  However, this would require all uses of type_info within
containers with collation ordering to provide an extra parameter to the
template.  Instead of a map<type_info*, T> you'd need a map<type_info*,
T, less_type>.  Although this isn't that bad, I'd prefer not doing this.

So, instead I resort to an adapter.

struct type_info_key
{
   type_info_key(const type_info& type) : type(&type) { }
   bool operator==(const type_info_key& other) const {
      return !!type->operator==(*other.type);
   }
   bool operator<(const type_info_key& other) const {
      return !!type->before(*other.type);
   }
   const type_info* type;
};

Now we can use std::map<type_info_key, T> (note that we no longer need
to use a pointer).  Not the most intuitive path to arrive at something
so simple, but that's what we have with the design and possible
implementations of type_info.  However, I'm left with several questions
about type_info's design now.

1.  Useage of type_info within an associative container is so common
that the designers went the extra mile to allow it by including
type_info::before for collating, yet they didn't design the class with
copy construction capabilities.  There are techniques that would have
allowed this and gauranteed that slices would not have been created.
The simplest of these is the "pimpl" idiom (which is basically what my
adaptor does).  Why wasn't this done?

2.  If collation capabilities were so important that it was added to
the design, why were natural collation operations not provided?  Why
type_info::before instead of type_info::operator<?

3.  The C++ language has a boolean type, bool, yet the native class
type_info uses int returns instead.  Why?

And finally, the age old question I've seen asked over and over:

4.  Why was there no thought put into providing some state within
type_info that could be portably used for hashing functions?  I realize
this would have been more complex, but surely it would be possible.

If the design followed the thoughts in questions 1-3 than the use of
type_info would be more logical and intuitive and we wouldn't see so
many people making some of the mistakes I pointed out.  A simple
std::map<type_info, T> would be valid "out of the box".  Implementation
of this alternative type_info wouldn't be any more complicated (in
fact, my type_info_key is basically an implementation of the
alternative).  At the very least, it seems that the standard library
should provide an adaptor such as mine.


Sent via Deja.com http://www.deja.com/
Before you buy.


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]