Topic: Allowing the compiler to pick between multiple inline


Author: James Berrow <icedshot@gmail.com>
Date: Thu, 4 May 2017 14:44:59 -0700 (PDT)

Hi there!

This is an issue I run into a lot (primarily in OpenCL on the GPU, which
has now moved to C++), but I suspect it's also an issue for people writing
C++ on regular platforms.

Problem: My function is slow because it contains a lot of trig functions
(e.g. sin), which take a while to calculate. To alleviate this, I use a
simple approximation. This gives less speedup than expected: in contexts
where some of my function calls are inlined, the compiler can infer
optimisation information (powers of two, constants, etc.) and make the code
faster with the slow, exact version than with the approximation.

It turns out the compiler is pretty smart. In a lot of cases it knows
detailed information about various trig functions (and while I will use
these as the example, the argument extends to any set of functions).

///this function is actually only a fragment of the real sin approximation
///function, but it illustrates the point
inline
float sin_fast(float x)
{
    return 1.27323954f * x + .405284735f * x * x;
}

inline
float sin_impl(float x)
{
    ///return sin(x);
    return sin_fast(x);
}

int main()
{
    ///evaluates to some_constant * some_variable
    float v1 = sin_impl(M_PI/2) * some_variable;
    ///evaluates to some_variable
    float v2 = sin(M_PI/2) * some_variable;
}
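
For readers who want something they can compile, here is a minimal sketch of
what a complete version of the approximation might look like. It assumes the
fragment above comes from the common parabolic approximation
sin(x) ~= (4/pi)*x - (4/pi^2)*x*|x| on [-pi, pi]; the constants match, and the
`+` form in the fragment is what that formula reduces to for x < 0. The names
sin_fast_full and max_abs_error are made up for this sketch and are not from
the post.

```cpp
#include <cassert>
#include <cmath>

// Assumed completion of the fragment: parabolic approximation of sin(x),
// intended for x in [-pi, pi]. Using |x| covers both signs in one expression.
inline float sin_fast_full(float x)
{
    return 1.27323954f * x - 0.405284735f * x * std::fabs(x);
}

// Largest absolute difference from std::sin over [-pi, pi],
// sampled at steps + 1 evenly spaced points.
inline float max_abs_error(int steps)
{
    const float pi = 3.14159265f;
    float worst = 0.0f;
    for (int i = 0; i <= steps; ++i)
    {
        float x = -pi + 2.0f * pi * static_cast<float>(i) / static_cast<float>(steps);
        float err = std::fabs(sin_fast_full(x) - std::sin(x));
        if (err > worst)
            worst = err;
    }
    return worst;
}

// Small self-check showing how the two helpers are meant to be used.
inline void self_check()
{
    assert(std::fabs(sin_fast_full(1.57079633f) - 1.0f) < 1e-4f);
    assert(max_abs_error(1000) < 0.06f);
}
```

The approximation is exact at 0, +/-pi/2 and +/-pi, and the sampled error
stays below 0.06 elsewhere, which is the accuracy being traded for speed.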

Here, replacing all calls to sin with sin_impl is slower. While in this
specific example a constexpr sin would be great (and compile-time
programming would help in other cases), there are other situations where
this doesn't get us out of the hole. For example, the compiler may be able
to infer that your variable is a power of 2, which would make a log2_fast
significantly slower than a native call to log2. In other cases, using a
faster builtin might restrict what optimisations the compiler is able to
perform:

///mad is an OpenCL-specific approximation to a*b+c in these function
///fragments, but this would apply to any functions that the compiler is
///unable to reorder
inline
float some_function_approx(float x1, float x2, float x3, float y1, float y2,
                           float y3, float x, float y)
{
    return mad(x2,y,mad(x3,y2,x*y3)-mad(x3,y,mad(x,y2,x2*y3)));
}


vs

inline
float some_function(float x1, float x2, float x3, float y1, float y2,
                    float y3, float x, float y)
{
    return x2*y-x*y2+x3*y2-x2*y3+x*y3-x3*y;
}



If you swap the approximate version into a context where, say, y1, y2 and
y3 happen to be powers of two, your code will be slower. If you call the
approximation function repeatedly in a context where some of the variables
happen to be constant (or where the compiler has inferred really anything
about them), the approximation function will be slower, as the compiler is
unable to factor the expression. This is why constexpr and other solutions
are not acceptable: I want to leverage the full optimisation power of the
compiler to optimise my code automatically, not guess at what to do and
feel around in the dark with profiling.
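
To make the factoring point concrete, here is a small sketch of one such
special case. It assumes std::fma as a plain C++ stand-in for OpenCL's mad (the
two differ in precision guarantees), drops the unused x1 and y1 parameters,
and adds a hypothetical some_function_equal_y showing what the plain
expression collapses to when y2 == y3. Whether a compiler actually performs
this rewrite depends on it being allowed to reassociate floating-point
arithmetic, so treat it as an illustration rather than a guarantee.

```cpp
#include <cassert>
#include <cmath>

// Stand-in for OpenCL's mad(a, b, c) = a*b + c (assumption: std::fma is
// used only so that this compiles as ordinary C++).
inline float mad(float a, float b, float c)
{
    return std::fma(a, b, c);
}

// mad-based form from the post, with the unused x1/y1 parameters dropped.
inline float some_function_approx(float x2, float x3, float y2, float y3,
                                  float x, float y)
{
    return mad(x2, y, mad(x3, y2, x * y3) - mad(x3, y, mad(x, y2, x2 * y3)));
}

// Plain form from the post, same parameter order.
inline float some_function(float x2, float x3, float y2, float y3,
                           float x, float y)
{
    return x2*y - x*y2 + x3*y2 - x2*y3 + x*y3 - x3*y;
}

// Hypothetical special case y2 == y3: the x*y2 and x*y3 terms cancel and
// the remainder factors into one subtraction pair and one multiply.
inline float some_function_equal_y(float x2, float x3, float y2, float y)
{
    return (x2 - x3) * (y - y2);
}

// Self-check with small integer inputs, which are exact in float.
inline void self_check()
{
    assert(some_function(3, 5, 2, 2, 7, 11) == -18.0f);
    assert(some_function_approx(3, 5, 2, 2, 7, 11) == -18.0f);
    assert(some_function_equal_y(3, 5, 2, 11) == -18.0f);
}
```

The plain form is open to this kind of simplification once the values are
visible to the optimiser, while the chain of mad calls hides the structure
behind calls it cannot reorder.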

Motivation for fixing: I really don't want to go through my code and check
every call to sin to see whether it's faster using a real sin or an
approximate sin, and I don't want to bear in mind the special cases for
trig functions that the compiler already knows about. This smells like
something that could be fixed at the standard level.

Proposed solution(ish): Allow the compiler to pick the 'best'/fastest
(approximately) of several functions for a specific context. Below is an
informal statement of what I would like; I'm no standardese wizard.

E.g.

inline
float sin_fast(float x)
{
    return 1.27323954f * x + .405284735f * x * x;
}

inline
float sin_impl(float x)
{
    ///perhaps you could extend this to std::code_size etc
    return std::compile_time_pick(sin, sin_fast, std::performance);
}

int main()
{
    ///evaluates to some_variable, as sin(M_PI/2) == 1 while sin_fast
    ///evaluates to some_constant, so therefore the 'call' to sin is faster
    float v1 = sin_impl(M_PI/2) * some_variable;
    ///evaluates to some_variable
    float v2 = sin(M_PI/2) * some_variable;
}
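
Nothing like compile_time_pick exists today, but a very coarse version of the
idea can be sketched with the GCC/Clang builtin __builtin_constant_p, which
reports whether its argument is known to be a compile-time constant (after
inlining, when optimisation is enabled). The sketch below is an assumption
about how one might emulate the proposal by hand: sin_pick and
ARG_KNOWN_AT_COMPILE_TIME are made-up names, the approximation is the assumed
parabolic form valid on [-pi, pi], and other compilers simply fall back to the
approximation.

```cpp
#include <cassert>
#include <cmath>

// GCC and Clang provide __builtin_constant_p; on other compilers this
// sketch never takes the exact path.
#if defined(__GNUC__) || defined(__clang__)
#define ARG_KNOWN_AT_COMPILE_TIME(x) __builtin_constant_p(x)
#else
#define ARG_KNOWN_AT_COMPILE_TIME(x) 0
#endif

// Assumed parabolic approximation of sin(x) for x in [-pi, pi].
inline float sin_fast(float x)
{
    return 1.27323954f * x - 0.405284735f * x * std::fabs(x);
}

// Hand-written stand-in for the proposed pick: use the exact call when the
// argument is a known constant (so it can be folded), else approximate.
inline float sin_pick(float x)
{
    if (ARG_KNOWN_AT_COMPILE_TIME(x))
        return std::sin(x);
    return sin_fast(x);
}

// Either path must stay within the approximation's error bound.
inline void self_check()
{
    volatile float runtime_value = 0.5f; // volatile: not a compile-time constant
    assert(std::fabs(sin_pick(1.0f) - std::sin(1.0f)) < 0.06f);
    assert(std::fabs(sin_pick(runtime_value) - std::sin(0.5f)) < 0.06f);
}
```

This only covers the "argument is a constant" case. It says nothing about
powers of two, partially constant expressions, or cost, which is the part
the proposal asks the compiler to decide.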


This could be doubly useful for performance portability across compilers.
I'm sure most of us have run into compiler 1 failing to optimise a function
that compiler 2 handles fine, and had to write a hacky ifdef (or similar)
to fix the function for one compiler. If you could simply say
compile_time_pick(func_gcc, func_msvc, performance) and have the compiler
pick the one that optimises best (heuristically), it would save me a lot of
time.
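
For reference, the hacky ifdef in question tends to look something like the
sketch below. The bodies of func_gcc and func_msvc are placeholders invented
for illustration (two spellings of the same computation); in real code each
would be the variant that happens to optimise well on that compiler.

```cpp
#include <cassert>

// Placeholder variants: same result, written two ways. Which one a given
// compiler optimises better is exactly what has to be found by hand today.
inline float func_gcc(float x)
{
    return x * x + x;
}

inline float func_msvc(float x)
{
    return x * (x + 1.0f);
}

// Manual per-compiler selection via predefined macros.
inline float func(float x)
{
#if defined(_MSC_VER) && !defined(__clang__)
    return func_msvc(x);
#else
    return func_gcc(x);
#endif
}

// Both variants must agree; the inputs are chosen to be exact in float.
inline void self_check()
{
    assert(func(2.0f) == 6.0f);
    assert(func_gcc(3.0f) == func_msvc(3.0f));
}
```

The choice here is fixed per compiler rather than per call site, so it has
to be revisited by hand whenever a compiler release changes what it
optimises well.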

Thoughts? This would be super helpful for me as I run into this a lot and I
can't see any solution other than extremely tedious and complex busy work,
but there are probably things I haven't considered. Thanks!

--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/13cf06f3-eac6-4413-8689-27a056faed3f%40isocpp.org.
