Topic: impossible alignment requirements?


Author: jcoffin@taeus.com (Jerry Coffin)
Date: Wed, 28 Nov 2007 06:05:31 GMT
In article <1194958252.805504.290230@50g2000hsm.googlegroups.com>,
levicki.i@sezampro.yu says...

[ ... ]

> Not having an operator new which returns _aligned_ memory seriously
> hampers SIMD use in C++.

The C++ standard describes operator new this way ($18.4.1.1): "The
allocation function (3.7.3.1) called by a new-expression (5.3.4) to
allocate size bytes of storage suitably aligned to represent any object
of that size."

If your implementation returns memory that is not aligned, the problem
lies with the implementation, not the language.

> What should one do with this code?
>
> class foo
> {
> public:
>  __m128 vec;
>  ...
>  foo(void)
>  {
>   vec = _mm_setzero_ps();
>  }
> };
>
> foo *p = new [](10);

As a description of a problem, this leaves everything to be desired. You
haven't described what an __m128 is, you haven't described what
_mm_setzero_ps() does, you haven't described what happens now, or what
you'd like to have happen. The only part that's clear is that the last
line, where you attempt to allocate memory, isn't even syntactically
correct.

> C++ regulatory body is obviously sitting with their collective thumbs
> up their arses instead of listening to developers and their needs.

The people I've met on the committee have all been quite intelligent and
hard-working, but they can't read minds. If you want something done, you
need to at least specify the problem sufficiently that somebody can
actually figure out what the hell you're talking about. If you really
care about fixing that problem, at least make an attempt at telling us
how you'd suggest that it be fixed.

Right now, you've done nothing of the sort. The code you've posted lacks
almost any meaning as it stands; you seem to be assuming that some
requirements are obvious, but whatever problem you're trying to cite
isn't apparent (or necessarily correct) based strictly on the code
itself. The syntax of the allocation statement is wrong, but it's hard
to believe that this is really the point of your post.

[ ... ]

> There is a compiler option to force stack alignment on every
> function entry to a given value. Problem is in the standard again,
> if you force say 16-byte alignment:
>
> void somefunc(void)
> {
>  char buf[32]; // sure, this will be aligned to 16 bytes
>  ...
>  for (int i = 0; i < 32; i++) {
>   char foo[32]; // but this won't!
>   __declspec(align(16)) char bar[32]; // unless you add explicit alignment directive
>  }
> }

Nothing in the standard requires that the data be misaligned. If your
compiler vendor says otherwise, they're just feeding you nonsense. I
suspect the problem you're having is fairly simple: it's very difficult
for the C++ committee to specify performance requirements, so if using a
misaligned buffer _works_, but much more slowly than an aligned buffer
would have, the standard provides no reasonable way to differentiate
between the two.

That doesn't mean your needs are being entirely ignored though. As far
as obtaining storage that's aligned to an arbitrary boundary, C++0x has
added a function named 'align' that does precisely that. See ptr.align
in N2461 for more details.
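
A minimal sketch of how such a function can be used, assuming the
std::align signature that eventually shipped in <memory> with C++11 (the
N2461 draft wording may differ in detail). The helper names is_aligned
and carve_aligned are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical helper: true if p is a multiple of `alignment`.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical helper: finds the first address in [buffer, buffer + capacity)
// that is aligned to `alignment` (a power of two) and still leaves room for
// `size` bytes. Returns nullptr if no such address exists.
inline void* carve_aligned(void* buffer, std::size_t capacity,
                           std::size_t alignment, std::size_t size) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    void* p = buffer;             // std::align adjusts these two in place,
    std::size_t space = capacity; // so work on local copies
    return std::align(alignment, size, p, space);
}
```

Called on a deliberately misaligned start such as raw + 1 inside a
64-byte char array, carve_aligned(raw + 1, 63, 16, 32) yields a pointer
inside the array that is a multiple of 16; a request that cannot fit
returns a null pointer instead.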

--
    Later,
    Jerry.

The universe is a figment of its own imagination.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]





Author: jcoffin@taeus.com (Jerry Coffin)
Date: Wed, 28 Nov 2007 17:54:40 GMT
In article <1194958252.805504.290230@50g2000hsm.googlegroups.com>,
levicki.i@sezampro.yu says...

[ ... ]

> Not having an operator new which returns _aligned_ memory seriously
> hampers SIMD use in C++.

The C++ standard describes operator new this way ($18.4.1.1): "The
allocation function (3.7.3.1) called by a new-expression (5.3.4) to
allocate size bytes of storage suitably aligned to represent any object
of that size."

If your implementation returns memory that is not aligned, the problem
lies with the implementation, not the language.

> What should one do with this code?
>
> class foo
> {
> public:
>  __m128 vec;
>  ...
>  foo(void)
>  {
>   vec = _mm_setzero_ps();
>  }
> };
>
> foo *p = new [](10);

As a description of a problem, this leaves everything to be desired. You
haven't described what an __m128 is, you haven't described what
_mm_setzero_ps() does, you haven't described what happens now, or what
you'd like to have happen. The only part that's clear is that the last
line, where you attempt to allocate memory, isn't even syntactically
correct.

> C++ regulatory body is obviously sitting with their collective thumbs
> up their arses instead of listening to developers and their needs.

The people I've met on the committee have all been quite intelligent and
hard-working, but they can't read minds. If you want something done, you
need to at least specify the problem sufficiently that somebody can
actually figure out what you're talking about. It's also often helpful
to make a suggestion as to how the problem should be fixed, rather than
just pointing out that things aren't how you want them.

Right now, you've done nothing of the sort. The code you've posted lacks
almost any meaning as it stands. You've used a type without describing
its properties, and a function without describing what it does. You
haven't told us what you'd like to see happen, or what does happen right
now. You haven't shown anything about whether the problem really lies
with the language itself, or just a particular implementation. You
haven't provided any indication of whether the problem prevents
operation, or simply causes slow performance.

[ ... ]

> There is a compiler option to force stack alignment on every
> function entry to a given value. Problem is in the standard again,
> if you force say 16-byte alignment:
>
> void somefunc(void)
> {
>  char buf[32]; // sure, this will be aligned to 16 bytes
>  ...
>  for (int i = 0; i < 32; i++) {
>   char foo[32]; // but this won't!
>   __declspec(align(16)) char bar[32]; // unless you add explicit alignment directive
>  }
> }

Nothing in the standard requires that the data be misaligned. If your
compiler vendor says otherwise, they're just feeding you nonsense. I
suspect the problem you're having is fairly simple: it's very difficult
for the C++ committee to specify performance requirements, so if using a
misaligned buffer _works_, but much more slowly than an aligned buffer
would have, the standard provides no reasonable way to differentiate
between the two. It's also true that the standard bases alignment on
type, so when/if you (attempt to) use an array of char as simply
"memory", it's almost impossible for the compiler to figure out what
alignment you _want_ unless you do something to specify it explicitly.
This isn't a matter of anything the committee has done wrong though;
it's a simple matter of compiler technology falling somewhat short of
the mind-reading necessary to figure out what you want when you don't
tell it.
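
As a sketch of what "specify it explicitly" looks like in standard
syntax: C++11 later added the alignas specifier, which plays the role of
the vendor-specific __declspec(align(16)) in the quoted code. The
function name loop_buffer_is_aligned is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// An array of char has the alignment of char, so by type alone the
// compiler owes it nothing stricter than 1.
static_assert(alignof(char[32]) == 1, "char arrays have alignment 1");

// Hypothetical demo: a buffer declared inside a loop body, as in the
// quoted code, with the alignment stated explicitly.
inline bool loop_buffer_is_aligned() {
    for (int i = 0; i < 32; ++i) {
        alignas(16) char bar[32];       // alignment requested explicitly
        bar[0] = static_cast<char>(i);  // touch the buffer so it is really used
        if (reinterpret_cast<std::uintptr_t>(bar) % 16 != 0) return false;
    }
    return true;
}
```

Without the alignas, the address of bar may or may not happen to be a
multiple of 16; nothing in the type asks for it.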

That doesn't mean your needs are being entirely ignored though. As far
as obtaining storage that's aligned to an arbitrary boundary, C++0x has
added a function named 'align' that does precisely that. See ptr.align
in N2461 for more details.

--
    Later,
    Jerry.

The universe is a figment of its own imagination.






Author: levicki.i@sezampro.yu
Date: Tue, 13 Nov 2007 11:03:07 CST
On Nov 12, 10:10 pm, Mathias Gaunard <loufo...@gmail.com> wrote:
> On Nov 7, 5:17 pm, Martin Bonner <martinfro...@yahoo.co.uk> wrote:
>
> > Could you give more details please.  The alignment of a type must be
> > <= sizeof the type.  What types on Mac have an alignment of 16?  (NB.
> > I don't doubt your statement; I am just curious as to what reasonable
> > implementations may do.)
>
> It's just that the stack is 16-byte aligned, which is quite unrelated.

This is not true. The stack may or may not be aligned; it doesn't
matter.

Problem lies with vector types both on Mac and on PC and Playstation 3
also comes to mind.

*** C++ standard is terribly outdated ***

Not having an operator new which returns _aligned_ memory seriously
hampers SIMD use in C++.

What should one do with this code?

class foo
{
public:
 __m128 vec;
 ...
 foo(void)
 {
  vec = _mm_setzero_ps();
 }
};

foo *p = new [](10);

C++ regulatory body is obviously sitting with their collective thumbs
up their arses instead of listening to developers and their needs.

Not only that, compiler vendors are not able to provide workarounds
for such issues because that would break the standard which btw is
already broken.

Let me touch on stack alignment when you mentioned it:

There is a compiler option to force stack alignment on every
function entry to a given value. Problem is in the standard again,
if you force say 16-byte alignment:

void somefunc(void)
{
 char buf[32]; // sure, this will be aligned to 16 bytes
 ...
 for (int i = 0; i < 32; i++) {
  char foo[32]; // but this won't!
  __declspec(align(16)) char bar[32]; // unless you add explicit alignment directive
 }
}

Then what is the point of having forced alignment if you are not
consistent? For every performance conscious developer those
things just suck.






Author: Martin Bonner <martinfrompi@yahoo.co.uk>
Date: Fri, 16 Nov 2007 09:45:06 CST
On Nov 13, 5:03 pm, levick...@sezampro.yu wrote:
> Problem lies with vector types both on Mac and on PC and Playstation 3
> also comes to mind.
>
> *** C++ standard is terribly outdated ***
>
> Not having an operator new which returns _aligned_ memory seriously
> hampers SIMD use in C++.

There's nothing in the C++ standard which prevents the implementation
from returning memory aligned for SIMD use.  If the implementation
chooses not to do that, complain to your compiler vendor.

> What should one do with this code?
>
> class foo
> {
> public:
>         __m128  vec;
>         ...
>         foo(void)
>         {
>                 vec = _mm_setzero_ps();
>         }
>
> };
>
> foo *p = new [](10);
>
I don't know.  At the trivial level, I presume you meant:
   foo *p = new foo[10];
At the deeper level, you have to see what your compiler vendor
provides.
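
Where the vendor provides nothing, a class can supply its own allocation
functions. The following is only a sketch under stated assumptions: the
helpers aligned_alloc_sketch and aligned_free_sketch are hypothetical,
Vec4 stands in for foo with a float[4] replacing the vendor type __m128,
and std::align is the C++11 library function rather than anything
available in 2007. It over-allocates with malloc and stashes the raw
pointer just below the aligned block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <new>

// Hypothetical helper: malloc-backed storage aligned to `alignment`
// (a power of two). Returns nullptr on failure.
inline void* aligned_alloc_sketch(std::size_t size, std::size_t alignment) {
    // Payload + worst-case padding + room to stash the raw pointer.
    std::size_t space = size + alignment + sizeof(void*);
    void* raw = std::malloc(space);
    if (raw == nullptr) return nullptr;

    void* p = static_cast<char*>(raw) + sizeof(void*);  // skip the stash slot
    space -= sizeof(void*);
    void* aligned = std::align(alignment, size, p, space);
    if (aligned == nullptr) { std::free(raw); return nullptr; }

    // Remember what malloc returned, immediately below the aligned block.
    std::memcpy(static_cast<char*>(aligned) - sizeof(void*), &raw, sizeof raw);
    return aligned;
}

inline void aligned_free_sketch(void* aligned) noexcept {
    if (aligned == nullptr) return;
    void* raw;
    std::memcpy(&raw, static_cast<char*>(aligned) - sizeof(void*), sizeof raw);
    std::free(raw);
}

// Stand-in for the thread's `foo`.
struct Vec4 {
    float lanes[4];

    static void* operator new(std::size_t n) {
        if (void* p = aligned_alloc_sketch(n, 16)) return p;
        throw std::bad_alloc();
    }
    static void operator delete(void* p) noexcept { aligned_free_sketch(p); }
    static void* operator new[](std::size_t n) { return operator new(n); }
    static void operator delete[](void* p) noexcept { operator delete(p); }
};

inline bool is_16_byte_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

With these members in place, both new Vec4 and new Vec4[10] go through
the aligned path. One caveat: for types with a non-trivial destructor an
implementation may place bookkeeping data ahead of the array, so the
first element is not necessarily at the address operator new[] returned.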

> C++ regulatory body is obviously sitting with their collective thumbs
> up their arses instead of listening to developers and their needs.

The C++ standards committee is composed of volunteers who PAY to take
part.  They /do/ listen to developers, but different developers have
different needs.  Maybe a coherent proposal a few years ago would have
made it into C++0x, but I fear it is too late now.

> Not only that, compiler vendors are not able to provide workarounds
> for such issues because that would break the standard which btw is
> already broken.
Why would it break the standard?  As I said above, there's nothing to
stop new returning 16-byte aligned memory.

> Let me touch on stack alignment when you mentioned it:
>
> There is a compiler option to force stack alignment on every
> function entry to a given value. Problem is in the standard again,
> if you force say 16-byte alignment:
>
> void somefunc(void)
> {
>         char buf[32]; // sure, this will be aligned to 16 bytes
>         ...
>         for (int i = 0; i < 32; i++) {
>                 char foo[32]; // but this won't!
>                 __declspec(align(16)) char bar[32]; // unless you add explicit alignment directive
>         }

I /completely/ fail to see why that is a problem with the standard.
There is no reason that your compiler vendor can't associate the
property __declspec(align(16)) with all auto declarations.  (I suspect
they don't because they think that on average the additional space
used will have a more negative impact on performance than the
performance benefit of being able to use SIMD).

> }
>
> Then what is the point of having forced alignment if you are not
> consistent? For every performance conscious developer those
> things just suck.
Talk to your compiler vendor.






Author: Mathias Gaunard <loufoque@gmail.com>
Date: Sun, 18 Nov 2007 15:41:04 CST
On Nov 13, 6:03 pm, levick...@sezampro.yu wrote:

> What should one do with this code?
>
> class foo
> {
> public:
>         __m128  vec;
>         ...
>         foo(void)
>         {
>                 vec = _mm_setzero_ps();
>         }
>
> };
>
> foo *p = new foo[10]; // [corrected]

Since the foo type has an alignment requirement of 16, new should
return properly aligned memory for that type.
If it doesn't, then it's an implementation issue.
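
For what it is worth, later standards took this route. A hedged sketch,
assuming a C++17 compiler that defines the __cpp_aligned_new feature
macro: a new-expression for a type whose alignment exceeds the default
is routed to an alignment-aware operator new. The type Wide and the
function name are hypothetical; 64 is used because on some platforms 16
is already no stricter than what plain new returns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical over-aligned type standing in for a SIMD-bearing class.
struct alignas(64) Wide {
    float lanes[16];
};
static_assert(alignof(Wide) == 64, "alignas sets the type's alignment");

// Returns true if new Wide[10] came back aligned for Wide. Before C++17
// the standard gives no such guarantee for over-aligned types, so the
// check is skipped there.
inline bool new_respects_alignment() {
#if defined(__cpp_aligned_new)
    Wide* p = new Wide[10];
    bool ok = reinterpret_cast<std::uintptr_t>(p) % alignof(Wide) == 0;
    delete[] p;
    return ok;
#else
    return true;  // assumption: nothing to verify on older compilers
#endif
}
```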

It is true however that it is quite hard to implement it correctly,
since the standard asks operator new to return memory properly aligned
for any type; and raising that default alignment could have an
important impact on efficiency.
A simple solution is to not use new. Which is quite a shame because
you lose the possibility of calling delete polymorphically.

As for the stack, if the compiler doesn't align foo as requested, it
is a pure implementation issue. In your case, char[32] has an
alignment requirement of 1, so it's not surprising if it isn't.






Author: Martin Bonner <martinfrompi@yahoo.co.uk>
Date: Wed, 7 Nov 2007 10:17:35 CST
On Nov 5, 5:58 pm, newsna...@delphin.all.de (daran) wrote:
> In article <WLudnWjZzJkLOynYnZ2dnUVZ_tHin...@giganews.com>,
>  p...@versatilecoding.com (Pete Becker) wrote:
>
> >  "K   i   tof    elechovski" wrote:
> > > Is there a portable way to implement an allocation function that fulfills
> > > the alignment requirements
> > > by other means than calling the standard C library functions?
> > No. There are two approaches commonly used. One is to define a union of
> > a bunch of types and hope that you've hit all the alignments that the
> > compiler uses.
>
> Which will fail on as common a platform as the Mac. The required
> alignment is multiples of 16, whereas the size of the largest standard
> type is only 8.
Could you give more details please.  The alignment of a type must be
<= sizeof the type.  What types on Mac have an alignment of 16?  (NB.
I don't doubt your statement; I am just curious as to what reasonable
implementations may do.)
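
The union approach quoted above can be sketched roughly as follows. The
names MaxAlignGuess and GuessAlignedStorage are hypothetical, and the
member list is only a guess at "all the alignments the compiler uses",
which is exactly the weakness being discussed: a vendor vector type with
a stricter alignment than any listed member is not covered. (C++11 later
standardised the same idea as std::max_align_t.)

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical union: one member per fundamental type we can think of.
// Its alignment is the strictest alignment among the members.
union MaxAlignGuess {
    char c; short s; int i; long l; long long ll;
    float f; double d; long double ld;
    void* p; void (*fn)();
};

// Raw storage of at least Bytes bytes (Bytes > 0), aligned as strictly
// as any member listed in MaxAlignGuess.
template <std::size_t Bytes>
struct GuessAlignedStorage {
    MaxAlignGuess cells[(Bytes + sizeof(MaxAlignGuess) - 1) / sizeof(MaxAlignGuess)];
};
```

An allocator would then hand out memory in units of MaxAlignGuess and
hope, as the quoted post puts it, that nothing on the platform needs
more.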







Author: Mathias Gaunard <loufoque@gmail.com>
Date: Mon, 12 Nov 2007 15:10:25 CST
On Nov 7, 5:17 pm, Martin Bonner <martinfro...@yahoo.co.uk> wrote:

> Could you give more details please.  The alignment of a type must be
> <= sizeof the type.  What types on Mac have an alignment of 16?  (NB.
> I don't doubt your statement; I am just curious as to what reasonable
> implementations may do.)

It's just that the stack is 16-byte aligned, which is quite unrelated.







Author: "W Karas" <wkaras@yahoo.com>
Date: Wed, 24 Jan 2007 11:37:05 CST

On Jan 22, 12:22 am, giecr...@stegny.2a.pl ("K   i   tof    elechovski")
wrote:
> Is there a portable way to implement an allocation function that fulfills
> the alignment requirements
> by other means than calling the standard C library functions?
> The problem is that the result must be aligned for any type;
> however, it is impossible to test whether a given raw pointer is suitably
> aligned for an object of any possible type,
> even if we can use alignment_of, because there are infinitely many composite
> types
> and their alignment cannot be inferred from their structure by standard
> means.
> Cf. also [basic.types]/5: it is undefined what it means that an address
> meets alignment requirements.
> Cf. also [meta.unary.prop]/2: it is undefined what it means that an address
> is a multiple of an integer;
> although it seems to be a more specific description than that
> [basic.types]/5,
> the standard does not specify any way to obtain an address from an integer
> by multiplication,
> except for reinterpret_cast<address>(integer * alignment), which is
> implementation-defined;
> however, it is not clear that this is the intended meaning.

In my effort to write a highly portable allocator
( http://www.geocities.com/wkaras/heapmm/heapmm.html ),
I left the alignment multiplier as part of the platform-specific
configuration.  On most platforms, you have several reasonable
choices for the alignment multiplier, depending on the speed-
space trade-off you wish to make.

I think the Committee should consider adding something like:
"Each implementation must define a type A where, given
an array m:

A m[DIM];

then the expression:

reinterpret_cast<T *>(m + i)

(i an integer expression) gives an aligned pointer to
any type T so long as:

i + (sizeof(T)  + sizeof(A) - 1)/sizeof(A) <= DIM "

Without this, all the nebulous stuff about alignment
doesn't seem very useful.
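
A rough sketch of the proposed rule, using std::max_align_t (added in
C++11) as a stand-in for the type A. That substitution is an assumption:
it only covers fundamental alignments, not "any type T". The names fits
and place are hypothetical, and placement new is used in place of the
bare reinterpret_cast so that an object is actually constructed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

using A = std::max_align_t;      // assumption: stand-in for the proposed A
constexpr std::size_t DIM = 8;   // array length from the proposal

// The proposal's bound: i + ceil(sizeof(T) / sizeof(A)) <= DIM.
template <class T>
constexpr bool fits(std::size_t i) {
    return i + (sizeof(T) + sizeof(A) - 1) / sizeof(A) <= DIM;
}

// Constructs a T at m + i if the bound holds, otherwise returns nullptr.
// Every m + i is aligned for A because sizeof(A) is a multiple of alignof(A).
template <class T>
T* place(A (&m)[DIM], std::size_t i) {
    static_assert(alignof(T) <= alignof(A), "A must be at least as aligned as T");
    if (!fits<T>(i)) return nullptr;
    return new (static_cast<void*>(m + i)) T();
}
```

For instance, with A m[DIM], place<double>(m, 3) returns a pointer equal
to m + 3, while place<double>(m, DIM) fails the bound and returns a null
pointer.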

