Topic: Idea for ANSI standard - compile time hashing for speed
Author: w@w.com.lga.highwinds-media.com ("wxs")
Date: Thu, 7 Apr 2005 04:23:24 GMT
I work in the financial industry and one of our big issues is speed. We
often have a number of enumerators, from the same enum or from different
enums, with various numeric values assigned; collisions in numbering between
them are rare. You might imagine enumerators like OrderPrice=27,
OrderQuantity=50, OrderSide=62, and there may be a lot of them. Normally we
have to store the associated values in a hash table, so that
hashtable[OrderPrice]=value sets the value for that key. Hashing is very
expensive in terms of performance at the level we deal with.
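To make the current approach concrete, it looks roughly like the sketch
below (illustrative only; std::map stands in here for whatever hash
container is actually in use, and the names are made up):

#include <map>

enum OrderInformation { OrderPrice = 27, OrderQuantity = 50, OrderSide = 62 };

// Illustrative baseline: every field update pays for a keyed lookup.
std::map<int, int> hashtable;   // stand-in for the real hash container

void StoreField(OrderInformation field, int value)
{
    hashtable[field] = value;   // lookup/insert by key on every store
}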
What would be nice is if the C#/C++ compiler would allow me to do something
like this:
Imagine an original enum:
enum OrderInformation { ..., OrderPrice=27, OrderQuantity=50, OrderSide=62 };
Now imagine if at compile time the compiler could do the hash for us like:
enum map LocalOrderInformation {
    OrderInformation.OrderPrice    alias OrderPrice = 0,
    OrderInformation.OrderQuantity alias OrderQuantity,
    OrderInformation.OrderSide     alias OrderSide
};
Basically, this tells the compiler to create a new enum that maps the old
enumeration onto a 0-based (or 1-based) numbering, so that we can use a
direct array lookup rather than a hash map.
So now we could do something like:
int OrderInfo[3];

void OrderClass::ProcessOrder(LocalOrderInformation info, int value)
{
    OrderInfo[info] = value;
}
Then if a caller did something like this:

OrderClassInstance.ProcessOrder(OrderInformation.OrderPrice, 27);

the compiler would translate the OrderInformation.OrderPrice enumerator into
the corresponding LocalOrderInformation value, which is 0. Now we can do a
direct array access rather than a hash-table lookup.
I'm sure there are numerous issues to work out here, and it is not quite as
simple as the pseudocode above. But if the compiler could do this mapping
for us, our runtime code would be much improved.
What do people think about this? Something like this would significantly
help our performance in many areas since we use a lot of hashing for the
above reason. I'd like to find a reasonable implementation/semantics that
would be acceptable for a standard.
Any ideas on whether this is feasible, and possibly some better syntax or
other ideas based on it?
Thanks,
Dave
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: AlbertoBarbati@libero.it (Alberto Barbati)
Date: Thu, 7 Apr 2005 22:34:04 GMT
wxs wrote:
>
> Any ideas on whether this is feasible, and possibly some better syntax or
> other ideas based on it?
>
It seems to me that you are looking for a complex solution to a simple
problem. I think we already have all the tools to achieve that within the
current standard, provided you have a reasonably good optimizing compiler.
Consider this:
enum OrderInformation
{
    OrderPrice = 27,
    OrderQuantity = 50,
    OrderSide = 62
};

struct LocalOrderInformation
{
    LocalOrderInformation(OrderInformation oi)
    {
        switch (oi)
        {
            case OrderPrice:    value = 0; break;
            case OrderQuantity: value = 1; break;
            case OrderSide:     value = 2; break;
            default:            value = -1; break;  // unmapped enumerator
        }
    }

    int value;
};

int OrderInfo[3];   // the dense array from the original post

void ProcessOrder(LocalOrderInformation info, int value)
{
    OrderInfo[info.value] = value;
}
You may look at the switch statement in horror, but a reasonable optimizing
compiler *is* able to perform the selection at compile time and fully inline
the constructor call for maximum efficiency. Just have a look at the
generated machine code and you might be amazed at how good compilers are
nowadays. (I tested it with VC++ 7.1 and it works like a charm.)
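For example, at an illustrative call site like the one below (not part of
the code above), the enum argument is a compile-time constant, so after
inlining the constructor's switch folds away and the whole call typically
reduces to a single store into OrderInfo[0]:

// Hypothetical caller: OrderPrice converts implicitly through the
// LocalOrderInformation constructor, which the optimizer folds to 0.
ProcessOrder(OrderPrice, 27);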
Of course, there may be hidden pitfalls: for example, if the number of cases
is very high the compiler might not be smart enough, and the code will
probably be quite inefficient in debug builds (yet it might still be better
than using a hash table!). Your mileage may vary.
Another approach worth exploring, which guarantees a compile-time mapping in
a non-compiler-dependent way, is to use templates and meta-programming.
However, the syntax might become a bit convoluted.
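Just to sketch what that could look like (illustrative only, not a tested or
recommended design): each enumerator can be mapped to its dense index
through a class template specialized per enumerator, so the lookup is
resolved entirely at compile time.

enum OrderInformation { OrderPrice = 27, OrderQuantity = 50, OrderSide = 62 };

// Primary template left undefined: using an unmapped enumerator is a
// compile-time error.
template<OrderInformation OI> struct LocalIndex;

// One specialization per enumerator supplies the dense, zero-based index.
template<> struct LocalIndex<OrderPrice>    { static const int value = 0; };
template<> struct LocalIndex<OrderQuantity> { static const int value = 1; };
template<> struct LocalIndex<OrderSide>     { static const int value = 2; };

int OrderInfo[3];

// The field is a template argument, so the array index is a constant and
// the "mapping" costs nothing at run time.
template<OrderInformation OI>
void ProcessOrder(int value)
{
    OrderInfo[LocalIndex<OI>::value] = value;
}

// Usage: ProcessOrder<OrderPrice>(42);  // compiles to a store into OrderInfo[0]

The obvious limitation is that the field must be known at compile time at
every call site, since it is passed as a template argument.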
HTH,
Alberto
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: "msalters" <Michiel.Salters@logicacmg.com>
Date: Thu, 7 Apr 2005 17:35:50 CST
"wxs" wrote:
> I work in the financial industry and one of our big issues is speed.
Many
> times we have a bunch of enums we have from either different enums or
the
> same enum that will have various numeric values assigned. Rarely
will there
> be collisions in numbering between the enums. These enums you might
> imagine would be like OrderPrice=27, OrderQuantity=50, OrderSide=62.
There
> may be a lot of these. So normally what we would have to do is store
these
> in hash table so hashtable[OrderPrice]=value could be set with the
value.
> Hashing is very expensive in terms of performance at the level we
deal with.
> What would be nice is if the C#/C++ compiler would allow me to do
something
> like this:
> Imagine an original enum:
>
> enum OrderInformation {., OrderPrice=27, OrderQuantity=50,
OrderSide=62};
>
> Now imagine if at compile time the compiler could do the hash for us
like:
>
> enum map LocalOrderInformation { OrderInformation.OrderPrice alias
> OrderPrice=0, OrderInformation.OrderQuantity alias OrderQuantity,
> OrderInformation.OrderSide alias OrderSide};
>
> So basically this tells the compiler to create me a new enum mapping
the old
> enumeration to the new mapping to be 0 based or 1 based so that now
we can
> use a direct array lookup rather than a hashmap.
Except that, as you propose it, the compiler would need something like a
hash map to translate values of the original enum into values of your new
one. Effect: you add an extra level of indirection.
However, you might be able to get better performance out of your hash-map
implementation if you can replace the hash function. Hashing a small number
like 'enum OrderInformation' is very cheap, just a few instructions. You
might even get away with just casting it to an integral type.
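For illustration (assuming the hash container in use lets you supply your
own hash functor, as an SGI-style hash_map does through its HashFcn
parameter), such a hash can be as trivial as a cast:

#include <cstddef>

enum OrderInformation { OrderPrice = 27, OrderQuantity = 50, OrderSide = 62 };

// Trivial hash: the enum value itself already serves as the key.
struct OrderInfoHash
{
    std::size_t operator()(OrderInformation oi) const
    {
        return static_cast<std::size_t>(oi);
    }
};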
Another solution could be to use a std::map, if there are only a few
keys. std::map offers a similar interface, but is probably a lot
faster if there are just one or a few elements.
If memory isn't a very big issue, even a vector of pointers can work.
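A sketch of that last idea (illustrative names, and plain ints rather than
pointers, but the principle is the same): since the largest enumerator is
only 62, the enum value can index the vector directly, with no hashing or
searching at all.

#include <vector>

enum OrderInformation { OrderPrice = 27, OrderQuantity = 50, OrderSide = 62 };

// OrderSide (62) is the largest enumerator, so 63 slots cover every key.
std::vector<int> orderInfo(63, 0);

void StoreField(OrderInformation field, int value)
{
    orderInfo[field] = value;   // direct indexed store
}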
HTH,
Michiel Salters
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]