Thread

Topic: Simple Complete Hashing

Author: Chris Jefferson <chris@bubblescope.net>
Date: Wed, 09 Jan 2013 10:17:36 +0000 Raw View


I have been meaning to write up something about hashing for a long time.
This week I discovered that users cannot write a hash function for
iterators of standard containers (you can't do &*it to get a raw pointer
on past-the-end iterators), which motivated me to write up my thoughts and
encourage discussion.

A better written up, alternative proposal is at:
http://open-std.org/JTC1/SC22/WG21/docs/papers/2012/n3333.html
I am interested if anyone believes this alternative proposal is worth
serious consideration.

The main difference is that N3333 introduces a new function
hash_value(), and std::hash<T> forwards to hash_value(t). The
implementation of hash_value(t) is complicated by the fact that it has
to worry about conversions to bool, and ADL. I suggest instead:

1) Introduce a default implementation of std::hash as follows:

template<typename T>
struct hash { size_t operator()(const T& t) const { return t.hash(); } };

This provides users with the ability to get a hash by adding a hash
member function to their class. They can still specialise std::hash if
they like. This implementation is much shorter and easier to understand.
It also avoids ADL issues.
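
As a rough illustration of point 1, here is a minimal sketch. The namespace
sketch stands in for std (user code cannot redefine std::hash), and Point
and its mixing formula are hypothetical names chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

namespace sketch {

// Primary template: forwards to a user-provided member function hash().
template <typename T>
struct hash {
    std::size_t operator()(const T& t) const { return t.hash(); }
};

// Built-in types have no members, so they would still need
// specialisations. This one simply reuses std::hash<int>.
template <>
struct hash<int> {
    std::size_t operator()(int v) const { return std::hash<int>()(v); }
};

}  // namespace sketch

// Hypothetical user type that opts in by providing a hash() member.
struct Point {
    int x;
    int y;
    bool operator==(const Point& o) const { return x == o.x && y == o.y; }
    std::size_t hash() const {
        // Illustrative mixing only, not a recommended hash.
        return sketch::hash<int>()(x) * 31u + sketch::hash<int>()(y);
    }
};

// Usage: equal values must produce equal hashes.
inline void member_hash_example() {
    Point a{1, 2};
    Point b{1, 2};
    assert(a == b);
    assert(sketch::hash<Point>()(a) == sketch::hash<Point>()(b));
}
```

Specialising sketch::hash<Point> would still be possible, exactly as
specialising std::hash is today.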

2) Define two helper functions:

template<typename It>
size_t hash_range(It begin, It end); // Hash a range. Assumes
// std::iterator_traits<It>::value_type is hashable.

This hash function returns equal values for two ranges if
std::iterator_traits<It>::value_type is the same for both ranges and
the ranges are equal. It is NOT allowed to vary depending on the exact
type It. The standard containers are not required to use hash_range.
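
One way hash_range could meet that requirement is sketched below. The
namespace sketch and the mixing step (borrowed from the style of
boost::hash_combine) are assumptions for illustration, not part of the
proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <list>
#include <vector>

namespace sketch {

// Hashes [begin, end) using only the element values, so the result
// depends on value_type and the elements, never on the iterator type.
template <typename It>
std::size_t hash_range(It begin, It end) {
    typedef typename std::iterator_traits<It>::value_type value_type;
    std::hash<value_type> hasher;
    std::size_t seed = 0;
    for (; begin != end; ++begin) {
        // Order-sensitive mixing in the style of boost::hash_combine.
        seed ^= hasher(*begin) + 0x9e3779b9u + (seed << 6) + (seed >> 2);
    }
    return seed;
}

}  // namespace sketch

// Usage: the same elements give the same hash from different containers.
inline void hash_range_example() {
    std::vector<int> v;
    std::list<int> l;
    for (int i = 1; i <= 3; ++i) {
        v.push_back(i);
        l.push_back(i);
    }
    assert(sketch::hash_range(v.begin(), v.end()) ==
           sketch::hash_range(l.begin(), l.end()));
}
```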

template<typename... Args>
size_t hash_values(const Args&... args); // Return the hash of
// make_tuple(args...). There are some tiny issues to do with consts and
// things, nothing serious.
// For simplicity, we require that the hash of std::tuple<T> (so a tuple
// with a single member) is the same as the hash of T. This is the only
// place where we place a requirement on the value returned by a hash
// function, and it is just to make the relationship between the hash of
// tuples, hash_values and single hashes easy to understand.

Users who just want to hash a single value call hash_values with a
single argument. There are no issues with conversions to bool, or similar.
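
A minimal sketch of hash_values follows, assuming std::hash is available
for every argument type. The namespace sketch and the way the hashes are
folded together are illustrative assumptions. Only the single-argument
rule comes from the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

namespace sketch {

// Single value: identical to the plain hash of T, as required above.
template <typename T>
std::size_t hash_values(const T& t) {
    return std::hash<T>()(t);
}

// Two or more values: hash the first, then fold in the hash of the rest.
template <typename T, typename U, typename... Rest>
std::size_t hash_values(const T& t, const U& u, const Rest&... rest) {
    std::size_t seed = std::hash<T>()(t);
    std::size_t tail = hash_values(u, rest...);
    // Mixing step in the style of boost::hash_combine (an assumption).
    return seed ^ (tail + 0x9e3779b9u + (seed << 6) + (seed >> 2));
}

}  // namespace sketch

// Usage: one argument behaves like std::hash, several are combined.
inline void hash_values_example() {
    assert(sketch::hash_values(42) == std::hash<int>()(42));
    std::string name("abc");
    assert(sketch::hash_values(1, name, 2.5) ==
           sketch::hash_values(1, name, 2.5));
}
```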


I believe the 'hashes should return different values' rule (across
processes) to be an unnecessary source of confusion, and I personally pass
hash values between different processes frequently. However, I can see
this is a complex issue, with different viewpoints.


**** Hashing the standard library ****

One huge piece of work, which unfortunately relies on first choosing a
hashing method, so could end up missing C++1y, is hashing the standard
library. I have done this in g++ and it is a simple, but very large,
process. I did it as follows:

1) The concept 'hashable' remains the same as the current standard. A
type T is hashable if std::hash<T> has an operator(), which satisfies
the condition that for two values a and b of type T, a==b implies
std::hash<T>()(a) == std::hash<T>()(b). This is the only requirement
placed on hash functions in the standard library (so, for example, an empty
std::vector<int> and an empty std::list<int> need not have the same hash
value).

2) Every type T in the standard library with operator== defined gains a
".hash() const" member function. This function requires that any members
which are required to have operator== defined for T to be comparable are
also hashable (this tends to make sense for each type).

These hash functions are in the main trivial to implement. Ironically,
the only hash functions which required any thought were the ones for
unordered_* containers. The easiest way to implement these is with a
hash function which takes an unordered range. Such hash functions exist
and are efficient.
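
For illustration, one simple order-independent scheme is to combine the
element hashes with a commutative operation. The sketch below uses
wrapping addition. The name hash_unordered_range and the namespace sketch
are hypothetical, and this is not necessarily the function used in the g++
work described above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <unordered_set>

namespace sketch {

// Combines element hashes with unsigned addition, which is commutative,
// so the iteration order of the range does not affect the result.
// A production version would likely mix each element hash first.
template <typename It>
std::size_t hash_unordered_range(It begin, It end) {
    typedef typename std::iterator_traits<It>::value_type value_type;
    std::hash<value_type> hasher;
    std::size_t sum = 0;
    std::size_t count = 0;
    for (; begin != end; ++begin) {
        sum += hasher(*begin);  // unsigned overflow wraps, by definition
        ++count;
    }
    return sum + count * 0x9e3779b9u;  // also fold in the element count
}

}  // namespace sketch

// Usage: equal sets hash equally whatever their internal ordering.
inline void unordered_hash_example() {
    std::unordered_set<int> a;
    a.insert(1); a.insert(2); a.insert(3);
    std::unordered_set<int> b;
    b.reserve(100);  // different bucket count, likely different order
    b.insert(3); b.insert(1); b.insert(2);
    assert(a == b);
    assert(sketch::hash_unordered_range(a.begin(), a.end()) ==
           sketch::hash_unordered_range(b.begin(), b.end()));
}
```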

Chris


.

Author: s.hesp@oisyn.nl
Date: Wed, 9 Jan 2013 07:11:57 -0800 (PST) Raw View




On Wednesday, January 9, 2013 11:17:36 AM UTC+1, Chris Jefferson wrote:
>
>  I have been meaning to write up something about hashing for a long time.
> This week, when I discovered that users cannot write a hash function for
> iterators of standard containers (you can't do &*it to get a raw pointer on
> past-the-end iterators) motivated me to write up my thoughts, and encourage
> discussion.
>

Interesting. But why stop at iterators? I myself would like to be able to
put a std::function<> in an associative container. I know this has some
complications (like whether hash(bind(f)) == hash(f), whether
pointers-to-members should be hashable, etc).


> A better written up, alternative proposal is at:
> http://open-std.org/JTC1/SC22/WG21/docs/papers/2012/n3333.html
> I am interested if anyone believes this alternative proposal is worth
> serious consideration.
>

I do. std::hash should've always been a function; explicitly specializing
std::hash is tedious. What I don't like about the proposal is the rule that
hash values should be different across processes, which makes things
difficult to debug (you like things to be perfectly deterministic every run
when working on the same input), and sometimes you want the output to be
simply deterministic as well (as an example, I work for a game studio and
for the past few months we've been determining and removing
non-deterministic ordering in our game content build processes, in order to
keep binary patches of that content as small as possible).

> The main difference is that while N3333 introduces a new function
> hash_value(), and std::hash<T> forwards to hash_value(t). The
> implementation of hash_value(t) is complicated by the fact that it has to
> worry about conversions to bool
>

I think you misunderstood the proposal. A user implementation doesn't need
to care about conversion to bool; the library implementation needs to make
sure that hash_value(bool) is not considered as a valid overload when calling
hash_value(T) when T is convertible to bool. This can be implemented as
described in the proposal, and the user doesn't need to worry about it.
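
One common way to keep a bool overload from accepting types that are
merely convertible to bool is to make it a constrained template, as
sketched below. Whether this matches the exact wording in N3333 is an
assumption. The namespace sketch, the detector has_hash_value and the
type Flag are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

namespace sketch {

// Participates in overload resolution only when T is exactly bool, so a
// class with a conversion operator to bool never selects it.
template <typename T>
typename std::enable_if<std::is_same<T, bool>::value, std::size_t>::type
hash_value(T b) {
    return b ? 1u : 0u;
}

// Detects at compile time whether hash_value(x) is callable for T.
template <typename T>
struct has_hash_value {
    template <typename U>
    static auto test(int)
        -> decltype(hash_value(std::declval<U>()), std::true_type());
    template <typename U>
    static std::false_type test(...);
    static constexpr bool value = decltype(test<T>(0))::value;
};

}  // namespace sketch

// Hypothetical type that converts implicitly to bool.
struct Flag {
    operator bool() const { return true; }
};

static_assert(sketch::has_hash_value<bool>::value,
              "bool itself is accepted");
static_assert(!sketch::has_hash_value<Flag>::value,
              "a type convertible to bool is rejected");
static_assert(!sketch::has_hash_value<int>::value,
              "int does not silently convert to bool either");
```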


> and adl.
>

Also not really a problem. A typical implementation would be combining hash
functions of several fields, so it can simply call std::hash_combine() and
won't have any ADL issues. If they really want to call hash_value, they
simply need a using-declaration for the std version in the current scope to
get access to hash values of built-in types. Also, they could always fall
back on calling std::hash<T>()().
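
The using-declaration approach can be sketched as follows. The namespace
lib stands in for std, since std::hash_value does not exist in C++11, and
the types in namespace user are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

namespace lib {
// Stand-in for a library-provided overload for a built-in type.
inline std::size_t hash_value(int v) { return std::hash<int>()(v); }
}  // namespace lib

namespace user {

struct Id {
    int n;
};

inline std::size_t hash_value(const Id& id) {
    using lib::hash_value;    // makes the built-in overloads visible
    return hash_value(id.n);  // int has no associated namespace for ADL
}

struct Pair {
    Id first;
    int second;
};

inline std::size_t hash_value(const Pair& p) {
    using lib::hash_value;
    // p.first resolves to user::hash_value(const Id&) through ADL,
    // p.second resolves to lib::hash_value(int) through the using.
    return hash_value(p.first) * 31u + hash_value(p.second);
}

}  // namespace user

// Usage: callers can rely on ADL and need no qualification.
inline void adl_example() {
    user::Id id{7};
    assert(hash_value(id) == std::hash<int>()(7));
}
```

Forgetting the using-declaration inside user::hash_value(const Id&) would
make the call hash_value(id.n) fail to compile, because only the user
overloads would be visible.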


> I suggest instead:
>
> 1) Introduce a default implementation of std::hash as follows:
>
> template<typename T>
> struct hash { size_t operator()(const T& t) { return t.hash(); } };
>
> This provides users with the ability to get a hash by adding a hash member
> function to their class. They can still specialise std::hash if they like.
> This implementation is much shorter and easier to understand. It also
> avoids ADL issues.
>

But what if you want to add hash functionality to library classes? Yes, you
can specialize std::hash, but one of the points of N3333 was that that's a
tedious and very verbose solution. Still, I agree that *having the ability*
to simply define a hash function on a class works best in my opinion if
that's an option available to the user. So why not combine all three
methods?

std::hash<T>()(t) (C++11) calls hash_value(t) (N3333), and the default
implementation of hash_value() for unknown types calls t.hash() (your
proposal), although I'd recommend naming the member function hash_value()
as well in that case.
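
A minimal sketch of that layering is shown below. The namespace sketch
stands in for std, and Widget and thirdparty::Handle are hypothetical
types, with Handle playing the role of a library class that cannot be
modified.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

namespace sketch {

// Default free function: usable only if T has a member hash().
template <typename T>
auto hash_value(const T& t) -> decltype(t.hash()) {
    return t.hash();
}

// Overload for a built-in type.
inline std::size_t hash_value(int v) { return std::hash<int>()(v); }

// Function object: forwards to whichever hash_value overload resolution
// (including ADL on T's namespace) selects.
template <typename T>
struct hash {
    std::size_t operator()(const T& t) const { return hash_value(t); }
};

}  // namespace sketch

// Route 1: a type the user owns provides a member function.
struct Widget {
    int id;
    std::size_t hash() const { return sketch::hash_value(id); }
};

// Route 2: a type the user cannot modify gets a free function in its
// own namespace, which ADL finds from inside sketch::hash.
namespace thirdparty {
struct Handle {
    int fd;
};
inline std::size_t hash_value(const Handle& h) {
    return sketch::hash_value(h.fd);
}
}  // namespace thirdparty

// Usage: both routes are reached through the same function object.
inline void layered_hash_example() {
    Widget w{5};
    thirdparty::Handle h{3};
    assert(sketch::hash<Widget>()(w) == std::hash<int>()(5));
    assert(sketch::hash<thirdparty::Handle>()(h) == std::hash<int>()(3));
}
```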


> **** Hashing the standard library ****
>
> One huge piece of work, which unfortunatly relies on first choosing a
> hashing method, so could end up missing C++1y, is hashing the standard
> library. I have done this in g++ and it is a simple, but very large,
> process. I did it as follows:
>
> 1) The concept 'hashable' remains the same as the current standard. A type
> T is hashable if std::hash<T> has an operator(), which satisfies the
> condition that for two values a and b of type T, a==b implies
> std::hash<T>()(a) == std::hash<T>()(b). This is the only requirement placed
> on hash functions in the standard library (so, for example, empty
> std::vector<int> and std::list<int>s may not have the same hash value).
>
> 2) Every type T in the standard library with operator== defined gains a
> ".hash() const" member function. This function requires that any members
> which required to have operator== defined for T to be comparable are
> hashable (this tends to make sense of each type).
>
> These hash functions are in the main trivial to implement. Ironically, the
> only hash functions which required any thought were the ones for
> unordered_* containers. The easiest way to implement these is with a hash
> function which takes an unordered range. Such hash functions exist and are
> efficient.
>

+1 for hash implementations for all (sensible) standard library classes,
regardless of how it is implemented.


.

Author: Chris Jefferson <chris@bubblescope.net>
Date: Wed, 09 Jan 2013 16:23:45 +0000 Raw View


On 09/01/13 15:11, s.hesp@oisyn.nl wrote:
>
>
> On Wednesday, January 9, 2013 11:17:36 AM UTC+1, Chris Jefferson wrote:
>
>     I have been meaning to write up something about hashing for a long
>     time. This week, when I discovered that users cannot write a hash
>     function for iterators of standard containers (you can't do &*it
>     to get a raw pointer on past-the-end iterators) motivated me to
>     write up my thoughts, and encourage discussion.
>
>
> Interesting. But why stop at iterators? I myself would like to be able
> to put a std::function<> in an associative container. I know this has
> some complications (like whether hash(bind(f)) == hash(f),
> pointer-to-members should be hashable, etc).

Oh yes, everything should be hashable. However, the fact I couldn't even
write a hash function for iterators motivated me to move and suggest
some improvements.

>     A better written up, alternative proposal is at:
>     http://open-std.org/JTC1/SC22/WG21/docs/papers/2012/n3333.html
>     <http://open-std.org/JTC1/SC22/WG21/docs/papers/2012/n3333.html#>
>     I am interested if anyone believes this alternative proposal is
>     worth serious consideration.
>
>
> I do. std::hash should've always been a function, explicitely
> overriding std::hash is tedious. What I don't like about the proposal
> is the rule that hash values should be different across processes,
> which make things difficult to debug (you like things to be perfectly
> deterministic every run when working on the same input), and sometimes
> you want the output to be simply deterministic as well (as an example,
> I work for a game studio and for the past few months we've been
> determining and removing non-deterministic ordering in our game
> content build processes, in order to keep binary patches of that
> content as small as possible).
>
>     The main difference is that while N3333 introduces a new function
>     hash_value(), and std::hash<T> forwards to hash_value(t). The
>     implementation of hash_value(t) is complicated by the fact that it
>     has to worry about conversions to bool
>
>
> I think you misunderstood the proposal. A user implementation doesn't
> need to care about conversion to bool, the implementation needs to
> make sure that hash_value(bool) is not considered as a valid overload
> when calling hash_value(T) when T is convertible to bool. This can be
> implemented as described in the proposal, and the user doesn't need to
> worry about it.

I understand the user does not have to worry about it. However, it is a
"clever hack" around a real problem, and I find such hacks have a habit
of coming back and biting us later. I like the fact that the C++03
standard library was mostly understandable, and things like this make
that increasingly hard. I notice N3333 already says we should do this
hack for every primitive, and "other techniques" to prevent "implicit
conversions" should also be allowed. This sounds like a nasty rabbit hole!


>     and adl.
>
> Also not really a problem. A typical implementation would be combining
> hash functions of several fields, so it can simply call
> std::hash_combine() and won't have any ADL issues. If they really want
> to call hash_value, they simply need to 'use' the std variant in the
> current scope to get access to hash values of built-in types. Also,
> they could always fall back on calling std::hash<T>()().

So, we have to remember to write 'using std::hash_value'. And the standard
library has to keep calling 'std::hash<T>()(value)' anyway.

>
>     I suggest instead:
>
>     1) Introduce a default implementation of std::hash as follows:
>
>     template<typename T>
>     struct hash { size_t operator()(const T& t) { return t.hash(); } };
>
>     This provides users with the ability to get a hash by adding a
>     hash member function to their class. They can still specialise
>     std::hash if they like. This implementation is much shorter and
>     easier to understand. It also avoids ADL issues.
>
>
> But what if you want to add hash functionality to library classes?
> Yes, you can specialize std::hash, but one of the points of N3333 was
> that that's a tedious and very verbose solution. Still, I agree that
> /having the ability/ to simply define a hash function on a class works
> best in my opinion if that's an option available to the user. So why
> not combine all three methods?
>
> std::hash<T>()(t) (C++11) calls hash_value(t) (N333), and the default
> implementation for hash_value() for unknown types calls t.hash() (your
> proposal) (although I'd recommend to name the member function
> hash_value() as well in that case)

My worry, as I said above, is that while I can see having to specialise
a class is tedious and verbose, using a function seems like a much more
risky possibility. Especially when said function has a whole load of
special overloads to try to catch unwanted conversions.
>
>     **** Hashing the standard library ****
>
>     One huge piece of work, which unfortunatly relies on first
>     choosing a hashing method, so could end up missing C++1y, is
>     hashing the standard library. I have done this in g++ and it is a
>     simple, but very large, process. I did it as follows:
>
>     1) The concept 'hashable' remains the same as the current
>     standard. A type T is hashable if std::hash<T> has an operator(),
>     which satisfies the condition that for two values a and b of type
>     T, a==b implies std::hash<T>()(a) == std::hash<T>()(b). This is
>     the only requirement placed on hash functions in the standard
>     library (so, for example, empty std::vector<int> and
>     std::list<int>s may not have the same hash value).
>
>     2) Every type T in the standard library with operator== defined
>     gains a ".hash() const" member function. This function requires
>     that any members which required to have operator== defined for T
>     to be comparable are hashable (this tends to make sense of each type).
>
>     These hash functions are in the main trivial to implement.
>     Ironically, the only hash functions which required any thought
>     were the ones for unordered_* containers. The easiest way to
>     implement these is with a hash function which takes an unordered
>     range. Such hash functions exist and are efficient.
>
>
> +1 for hash implementations for all (sensible) standard library
> classes, regardless of how it is implemented.
> --
>
>
>

--




--------------020204010408010400090302
Content-Type: text/html; charset=ISO-8859-1

<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">On 09/01/13 15:11, <a class="moz-txt-link-abbreviated" href="mailto:s.hesp@oisyn.nl">s.hesp@oisyn.nl</a>
      wrote:<br>
    </div>
    <blockquote
      cite="mid:752b6793-7784-4cab-b509-d2e0cf238c0b@isocpp.org"
      type="cite"><br>
      <br>
      On Wednesday, January 9, 2013 11:17:36 AM UTC+1, Chris Jefferson
      wrote:
      <blockquote class="gmail_quote" style="margin: 0;margin-left:
        0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">
        <div text="#000000" bgcolor="#FFFFFF"> I have been meaning to
          write up something about hashing for a long time. This week,
          when I discovered that users cannot write a hash function for
          iterators of standard containers (you can't do &amp;*it to get
          a raw pointer on past-the-end iterators) motivated me to write
          up my thoughts, and encourage discussion.<br>
        </div>
      </blockquote>
      <div><br>
      </div>
      <div>Interesting. But why stop at iterators? I myself would like
        to be able to put a std::function&lt;&gt; in an associative
        container. I know this has some complications (like whether <font
          face="courier new, monospace">hash(bind(f)) == hash(f)</font>,
        pointer-to-members should be hashable, etc).</div>
    </blockquote>
    <br>
    Oh yes, everything should be hashable. However, the fact I couldn't
    even write a hash function for iterators motivated me to move and
    suggest some improvements.<br>
    <br>
    <blockquote
      cite="mid:752b6793-7784-4cab-b509-d2e0cf238c0b@isocpp.org"
      type="cite">
      <div>&nbsp;</div>
      <blockquote class="gmail_quote" style="margin: 0;margin-left:
        0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">
        <div text="#000000" bgcolor="#FFFFFF">A better written up,
          alternative proposal is at: <a moz-do-not-send="true"
            href="http://open-std.org/JTC1/SC22/WG21/docs/papers/2012/n3333.html#"
            target="_blank">http://open-std.org/JTC1/SC22/<wbr>WG21/docs/papers/2012/n3333.<wbr>html</a><br>
          I am interested if anyone believes this alternative proposal
          is worth serious consideration.</div>
      </blockquote>
      <div><br>
      </div>
      <div>I do. std::hash should've always been a function, explicitely
        overriding std::hash is tedious. What I don't like about the
        proposal is the rule that hash values should be different across
        processes, which make things difficult to debug (you like things
        to be perfectly deterministic every run when working on the same
        input), and sometimes you want the output to be simply
        deterministic as well (as an example, I work for a game studio
        and for the past few months we've been determining and removing
        non-deterministic ordering in our game content build processes,
        in order to keep binary patches of that content as small as
        possible).</div>
      <div><br>
      </div>
      <blockquote class="gmail_quote" style="margin: 0;margin-left:
        0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">
        <div text="#000000" bgcolor="#FFFFFF">The main difference is
          that while N3333 introduces a new function hash_value(), and
          std::hash&lt;T&gt; forwards to hash_value(t). The
          implementation of hash_value(t) is complicated by the fact
          that it has to worry about conversions to bool</div>
      </blockquote>
      <div><br>
      </div>
      <div>I think you misunderstood the proposal. A user's implementation
        doesn't need to care about conversions to bool; it is the library
        implementation that needs to make sure hash_value(bool) is not
        considered a viable overload when calling hash_value(T) for a T
        that is convertible to bool. This can be implemented as described in
        the proposal, and the user doesn't need to worry about it. <br>
      </div>
    </blockquote>
    <br>
    I understand the user does not have to worry about it. However, it
    is a "clever hack" around a real problem, and I find such hacks have
    a habit of coming back and biting us later. I like the fact that the
    C++03 standard library was mostly understandable, and things like
    this make that increasingly hard. I notice N3333 already says we
    should do this hack for every primitive, and that "other techniques" to
    prevent "implicit conversions" should also be allowed. This sounds
    like a nasty rabbit hole!<br>
    <br>
    <br>
    <blockquote
      cite="mid:752b6793-7784-4cab-b509-d2e0cf238c0b@isocpp.org"
      type="cite">
      <div>
        <div>&nbsp;</div>
        <blockquote class="gmail_quote" style="margin: 0px 0px 0px
          0.8ex; border-left-width: 1px; border-left-color: rgb(204,
          204, 204); border-left-style: solid; padding-left: 1ex;">
          <div text="#000000" bgcolor="#FFFFFF">and adl.</div>
        </blockquote>
        <div>&nbsp;</div>
        <div>Also not really a problem. A typical implementation would
          be&nbsp;combining hash functions of several fields, so it can
          simply call std::hash_combine() and won't have any ADL issues.
          If they really want to call hash_value, they simply need to
          'use' the std variant in the current scope to get access to
          hash values of built-in types. Also, they could always fall
          back on calling std::hash&lt;T&gt;()().</div>
      </div>
    </blockquote>
    So, we have to remember to 'using std::hash_value'. And the standard
    library has to keep calling 'std::hash&lt;T&gt;()(value)' anyway.<br>
    <blockquote
      cite="mid:752b6793-7784-4cab-b509-d2e0cf238c0b@isocpp.org"
      type="cite">
      <div>&nbsp;<br>
      </div>
      <blockquote class="gmail_quote" style="margin: 0;margin-left:
        0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">
        <div text="#000000" bgcolor="#FFFFFF"> I suggest instead:<br>
          <br>
          1) Introduce a default implementation of std::hash as follows:<br>
          <br>
          template&lt;typename T&gt;<br>
          struct hash { size_t operator()(const T&amp; t) { return
          t.hash(); } };<br>
          <br>
          This provides users with the ability to get a hash by adding a
          hash member function to their class. They can still specialise
          std::hash if they like. This implementation is much shorter
          and easier to understand. It also avoids ADL issues.<br>
        </div>
      </blockquote>
      <div><br>
      </div>
      <div>But what if you want to add hash functionality to library
        classes? Yes, you can specialize std::hash, but one of the
        points of N3333 was that that's a tedious and very verbose
        solution. Still, I agree that <i>having the ability</i> to
        simply define a hash function on a class works best in my
        opinion if that's an option available to the user. So why not
        combine all three methods?</div>
      <div><br>
      </div>
      <div>std::hash&lt;T&gt;()(t) (C++11) calls hash_value(t) (N3333),
        and the default implementation of hash_value() for unknown
        types calls t.hash() (your proposal), although I'd recommend
        naming the member function hash_value() as well in that case.</div>
    </blockquote>
    <br>
    My worry, as I said above, is that while I can see having to
    specialise a class is tedious and verbose, using a function seems
    like a much more risky possibility. Especially when said function
    has a whole load of special overloads to try to catch unwanted
    conversions.<br>
    <blockquote
      cite="mid:752b6793-7784-4cab-b509-d2e0cf238c0b@isocpp.org"
      type="cite">
      <div>&nbsp;</div>
      <blockquote class="gmail_quote" style="margin: 0;margin-left:
        0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">
        <div text="#000000" bgcolor="#FFFFFF">**** Hashing the standard
          library ****<br>
          <br>
          One huge piece of work, which unfortunately relies on first
          choosing a hashing method and so could end up missing C++1y, is
          hashing the standard library. I have done this in g++ and it
          is a simple, but very large, process. I did it as follows:<br>
          <br>
          1) The concept 'hashable' remains the same as the current
          standard. A type T is hashable if std::hash&lt;T&gt; has an
          operator(), which satisfies the condition that for two values
          a and b of type T, a==b implies std::hash&lt;T&gt;()(a) ==
          std::hash&lt;T&gt;()(b). This is the only requirement placed
          on hash functions in the standard library (so, for example,
          empty std::vector&lt;int&gt; and std::list&lt;int&gt;s may not
          have the same hash value).<br>
          <br>
          2) Every type T in the standard library with operator==
          defined gains a ".hash() const" member function. This function
          requires that any members which must have operator==
          defined for T to be comparable are also hashable (this tends to
          make sense for each type).<br>
          <br>
          These hash functions are in the main trivial to implement.
          Ironically, the only hash functions which required any thought
          were the ones for unordered_* containers. The easiest way to
          implement these is with a hash function which takes an
          unordered range. Such hash functions exist and are efficient.<br>
        </div>
      </blockquote>
      <div><br>
      </div>
      <div>+1 for hash implementations for all (sensible) standard
        library classes, regardless of how it is implemented.&nbsp;</div>
    </blockquote>
    <br>
  </body>
</html>


--------------020204010408010400090302--

.

Author: DeadMG <wolfeinstein@gmail.com>
Date: Wed, 9 Jan 2013 08:28:26 -0800 (PST) Raw View

------=_Part_60_14996907.1357748906856
Content-Type: text/plain; charset=ISO-8859-1

Hashing for functions is never going to happen. You would have to be able to
compare things like lambdas, and if you can come up with sensible comparison
functions for lambdas, feel free. But there's a reason why std::function
has no operator==: nobody can come up with a good specification for one.

------=_Part_60_14996907.1357748906856--

.

Author: s.hesp@oisyn.nl
Date: Wed, 9 Jan 2013 14:16:32 -0800 (PST) Raw View

------=_Part_1348_32665455.1357769793084
Content-Type: text/plain; charset=ISO-8859-1

On Wednesday, January 9, 2013 5:23:45 PM UTC+1, Chris Jefferson wrote:

> I understand the user does not have to worry about it. However, it is a
> "clever hack" around a real problem, but I find such hacks have a habit of
> coming back and biting us later.

Then fix the problem (for one, that built-in types don't have an associated
namespace, which is the root cause of the ADL problem in this case). Don't
avoid the problem by presenting another solution that annoys the user in
the general scenario. IMHO of course. But that discussion is off-topic here.


> So, we have to remember to 'using std::hash_value'. And the standard
> library has to keep calling 'std::hash<T>()(value)' anyway.

Well, yes, std::hash_combine() has to call std::hash<T>(). Everyone else
can call std::hash_combine(). No ADL problems, no tediousness. To be
honest, I don't see any cons here, only pros.
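
A minimal sketch of the pattern described above, under stated assumptions: `sketch::hash_combine`, `mix`, `Person` and `hash_of` are hypothetical names, the signature is not taken from N3333, and the mixing step borrows the familiar boost::hash_combine recipe. The point is only that the combiner is the single place that calls std::hash<T>, while user code just lists its fields.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

namespace sketch {

// Recursion terminator: nothing left to mix in.
inline void mix(std::size_t&) {}

// Fold each value into the seed via std::hash<T>. The constant and shifts
// follow the boost::hash_combine recipe (an assumption, not from N3333).
template <typename T, typename... Rest>
void mix(std::size_t& seed, const T& value, const Rest&... rest) {
    seed ^= std::hash<T>()(value) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    mix(seed, rest...);
}

// Hypothetical stand-in for the std::hash_combine() discussed in the thread.
template <typename... Args>
std::size_t hash_combine(const Args&... args) {
    std::size_t seed = 0;
    mix(seed, args...);
    return seed;
}

struct Person {
    std::string name;
    int age;
};

// The user only lists the fields to hash; std::hash<T> is called by mix().
inline std::size_t hash_of(const Person& p) {
    return hash_combine(p.name, p.age);
}

// Usage: equal field values give equal hashes.
inline void example() {
    Person a{"Ada", 36};
    Person b{"Ada", 36};
    assert(hash_of(a) == hash_of(b));
}

} // namespace sketch
```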

> My worry, as I said above, is that while I can see having to specialise a
> class is tedious and verbose, using a function seems like a much more risky
> possibility. Especially when said function has a whole load of special
> overloads to try to catch unwanted conversions.

Well, can you think of any clear examples when it would be "risky"?

------=_Part_1348_32665455.1357769793084--

.

Author: Christopher Jefferson <chris@bubblescope.net>
Date: Thu, 10 Jan 2013 10:52:14 +0000 Raw View

On 9 January 2013 22:16,  <s.hesp@oisyn.nl> wrote:
> On Wednesday, January 9, 2013 5:23:45 PM UTC+1, Chris Jefferson wrote:
>> My worry, as I said above, is that while I can see having to specialise a
>> class is tedious and verbose, using a function seems like a much more risky
>> possibility. Especially when said function has a whole load of special
>> overloads to try to catch unwanted conversions.
>
>
> Well, can you think of any clear examples when it would be "risky"?
>

Consider a user class A, where we implement hashing for A using:

size_t hash_value(const A& a);

Then any class derived from A will also hash, by slicing.

Also, if A has a converting constructor A(const B&), then B will also
gain a hash function.
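
Both effects can be reproduced with a small sketch. The types `A`, `B` and `Derived` and the namespace `risk` are made up for illustration; only the shape of hash_value(const A&) comes from the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

namespace risk {

struct B {};

struct A {
    int id;
    explicit A(int i) : id(i) {}
    A(const B&) : id(0) {}   // non-explicit: B converts to A implicitly
};

struct Derived : A {
    int extra;               // never seen by hash_value(const A&)
    Derived(int i, int e) : A(i), extra(e) {}
};

// Ordinary free-function overload, as in the text above.
inline std::size_t hash_value(const A& a) { return std::hash<int>()(a.id); }

inline void example() {
    Derived d1(1, 10), d2(1, 20);
    // Both calls bind to the A base subobject, so 'extra' is ignored.
    assert(hash_value(d1) == hash_value(d2));

    B b;
    // Compiles because b is converted to a temporary A, whose id is 0.
    assert(hash_value(b) == hash_value(A(0)));
}

} // namespace risk
```

Neither call is diagnosed by the compiler, which is the risk being described: the overload is found and used even though nobody wrote a hash for Derived or B.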

Avoiding this constructor issue is the aim of the workaround given in
the paper, where instead of implementing

size_t hash_value(bool b);

we provide:

template<typename Bool>
  typename enable_if<is_same<Bool, bool>::value, size_t>::type
  hash_value(Bool b) { return ...; }

However, I don't see a good argument why we wouldn't also want to
advise users to write that, to avoid exactly the same issues we are
trying to avoid. And, if we want to advise users to write that, then
surely it is simpler to ask them to write:

template<>
struct hash<bool>
{ size_t operator()(bool b) const { ... } };
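
For comparison, here is a rough sketch of what the enable_if form might look like if a user applied it to their own class. Everything in namespace `guarded` is hypothetical, including the `has_hash_value` detection trait, which exists only to make the effect observable.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <type_traits>
#include <utility>

namespace guarded {

struct B {};

struct A {
    int id;
    explicit A(int i) : id(i) {}
    A(const B&) : id(0) {}   // implicit conversion from B still exists
};

struct Derived : A {
    explicit Derived(int i) : A(i) {}
};

// Takes part in overload resolution only when T is exactly A.
template <typename T>
typename std::enable_if<std::is_same<T, A>::value, std::size_t>::type
hash_value(const T& a) { return std::hash<int>()(a.id); }

// Helper trait (hypothetical): true when hash_value(const T&) is callable.
template <typename T, typename = void>
struct has_hash_value : std::false_type {};

template <typename T>
struct has_hash_value<T, decltype(void(hash_value(std::declval<const T&>())))>
    : std::true_type {};

static_assert(has_hash_value<A>::value, "A itself is hashable");
static_assert(!has_hash_value<B>::value, "conversion from B is not considered");
static_assert(!has_hash_value<Derived>::value, "derived classes are not picked up");

inline void example() {
    assert(hash_value(A(7)) == std::hash<int>()(7));
}

} // namespace guarded
```

The guard works, but it is noticeably more to write and to read than a class template specialisation, which is the trade-off being argued here.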

Chris

--

.

Author: Sylvester Hesp <s.hesp@oisyn.nl>
Date: Thu, 10 Jan 2013 03:38:26 -0800 (PST) Raw View

------=_Part_113_1044705.1357817906896
Content-Type: text/plain; charset=ISO-8859-1

On Thursday, January 10, 2013 11:52:14 AM UTC+1, Chris Jefferson wrote:

> On 9 January 2013 22:16,  <s.h...@oisyn.nl> wrote:
>> On Wednesday, January 9, 2013 5:23:45 PM UTC+1, Chris Jefferson wrote:
>>> My worry, as I said above, is that while I can see having to specialise a
>>> class is tedious and verbose, using a function seems like a much more risky
>>> possibility. Especially when said function has a whole load of special
>>> overloads to try to catch unwanted conversions.
>>
>> Well, can you think of any clear examples when it would be "risky"?
>
> Consider a user class A, where we implement hashing for A using:
>
> size_t hash_value(const A& a), any class derived from A will hash, by
> slicing.

This might even be what the author originally intended. This is not very
different from other languages where you implement a hash function on a
class, which gets inherited by any derived class. The rule of thumb in Java
and .Net is: if you override the equality method/operator (whichever is
applicable in the language), you should also override the hash function. I
don't see any real problems with that.

Also, you don't really solve that problem with your proposal, as it'll also
apply to any implemented hash() function on the base class.
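
A short sketch of that point, assuming a stand-in `member_sketch::hash` template shaped like the default std::hash proposed at the start of the thread; `Base` and `Derived` are made-up types.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

namespace member_sketch {

// Stand-in for the proposed default std::hash: forward to t.hash().
template <typename T>
struct hash {
    std::size_t operator()(const T& t) const { return t.hash(); }
};

struct Base {
    int id;
    explicit Base(int i) : id(i) {}
    std::size_t hash() const { return std::hash<int>()(id); }
};

struct Derived : Base {
    int extra;   // not covered by the inherited Base::hash()
    Derived(int i, int e) : Base(i), extra(e) {}
};

inline void example() {
    Derived d1(1, 10), d2(1, 20);
    // hash<Derived> compiles and silently uses Base::hash().
    assert(hash<Derived>()(d1) == hash<Derived>()(d2));
}

} // namespace member_sketch
```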

> Also, if A has a constructor: A(const B&), then B will also gain a
> hash function.

This is just an extension of the above. If you want B to be implicitly
convertible to A, you automatically gain A's perks for every B. This also
applies to regular operators. I don't see why it shouldn't apply to hash
calculation. If you don't want that, don't make a type implicitly
convertible.

But it's clear that we differ in opinion on this subject. Suffice it to say, I
like N3333 (aside from the suggested non-determinism), and I like being
able to implement a hash function on a class.

------=_Part_113_1044705.1357817906896--

.

Author: Chris Jefferson <chris@bubblescope.net>
Date: Thu, 10 Jan 2013 13:53:05 +0000 Raw View

This is a multi-part message in MIME format.
--------------010106000408010909060909
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 10/01/13 11:38, Sylvester Hesp wrote:
> On Thursday, January 10, 2013 11:52:14 AM UTC+1, Chris Jefferson wrote:
>> On 9 January 2013 22:16,  <s.h...@oisyn.nl> wrote:
>>> On Wednesday, January 9, 2013 5:23:45 PM UTC+1, Chris Jefferson wrote:
>>>> My worry, as I said above, is that while I can see having to specialise a
>>>> class is tedious and verbose, using a function seems like a much more risky
>>>> possibility. Especially when said function has a whole load of special
>>>> overloads to try to catch unwanted conversions.
>>>
>>> Well, can you think of any clear examples when it would be "risky"?
>>
>> Consider a user class A, where we implement hashing for A using:
>>
>> size_t hash_value(const A& a), any class derived from A will hash, by
>> slicing.
>
> This might even be what the author originally intended. This is not
> very different from other languages where you implement a hash
> function on a class, which gets inherited by any derived class. The
> rule of thumb in Java and .Net is: if you override the equality
> method/operator (whichever is applicable in the language), you should
> also override the hash function. I don't see any real problems with that.
>
> Also, you don't really solve that problem with your proposal, as it'll
> also apply to any implemented hash() function on the base class.

You are correct. I was listing some issues. I could argue that as the
'hash' is in the class it is more likely someone will notice it, and fix
it, but I admit that is a weak argument.
>> Also, if A has a constructor: A(const B&), then B will also gain a
>> hash function.
>
> This is just an extension of the above. If you want B to be
> implicitly convertible to A, you automatically gain A's perks for
> every B. This also applies to regular operators. I don't see why it
> shouldn't apply to hash calculation. If you don't want that, don't
> make a type implicitly convertible.

However, N3333 already tries to handle this problem by special-casing
primitive types like bool. N3333 says: "This templating should be done
for every hash_value(primitive) overload to avoid similar conversion
mistakes. Other library, extension, or language techniques to prevent
implicit conversions to the argument types should also be allowed."

I don't see why conversions involving primitive types should be special-cased
to avoid problems, while user-defined types are left open to the same
conversion mistakes.

That "other ... techniques to prevent implicit conversions to the
argument types should also be allowed" sounds far too vague for me. I
accept this isn't a final proposal, but I would not want to see a
sentence like that in the standard, which would lead to different
results occurring on different libraries.

Chris

--------------010106000408010909060909--

.