Topic: On alignment (final committee draft for C++0x and n1425 for C1X)


Author: Gennaro Prota <gennaro.prota@yahoo.com>
Date: Fri, 20 Aug 2010 17:01:07 CST
 NOTE:

 This is multi-posted and cross-posted, and follow-ups are set
 to comp.lang.c++.

 The cross-post is to comp.std.c++ and comp.lang.c++ and
 follow-ups are set to comp.lang.c++.

 Furthermore, a copy of this message was also posted to
 comp.std.c ("multi-posting"), for information, asking to use
 comp.lang.c++, instead, for any replies.

 Here's why:

 the message was originally intended for comp.std.c++ only;
 then I noticed that the wording it refers to was basically
 copied from a C1X draft, so I cross-posted it to the two
 ".std." groups. But the comp.std.c++ software auto-rejected
 it, on the grounds that this is difficult to handle.

 Furthermore, since these days comp.std.c++ has an unbelievably
 high latency, the only way I could think of to make the
 discussion happen was to set the follow-ups to a low-latency
 group. I apologize; it's probably the Usenet hack of the year,
 and I'm not proud of it, but I really couldn't think how else
 to manage it (if you have better ideas, feel free to tell me).

 In any case, beware that the message is geared towards C++,
 including the terminology and the references to the standard.
----------------------------------------------------------------


I was reading the Alignment paragraph ([basic.align]) in the FCD
for C++0x and was really, really perplexed.

In particular I couldn't find an answer to this question:

a) is "alignment" a function of the type (over the set of
complete object types [less, perhaps, array types])? Or can two
instances of the same type have different alignments?

(Note that in the question above "complete" refers to types, not
objects (parse it as "complete types that are object types").
Non-complete objects, i.e. subobjects, do enter the picture.
In particular I was looking for a guarantee that given e.g.

 void f() {
   T t ;
 }
 struct C {
   char c ;
   T    t2 ;
 } ;

the object t and the subobject t2 in an instance of C would have
the same alignment.)
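
The guarantee asked for here can at least be probed on a given
implementation. What follows is a minimal sketch, assuming the
alignof operator as it was later published in C++11 and taking
double as a stand-in for T (both are assumptions, not part of the
FCD text under discussion). It checks one compiler; it says
nothing about what the wording guarantees.

```cpp
#include <cstddef>   // offsetof

// Hypothetical stand-in for the T of the question; any complete,
// non-array object type could be substituted.
typedef double T;

struct C {
    char c;
    T    t2;
};

// alignof takes a type-id, so a complete object of type T and the
// member t2 are necessarily given the same answer.
static_assert(alignof(decltype(C::t2)) == alignof(T),
              "member and complete object share one type, hence one alignof");

// What could differ in practice is the placement: the member must
// still land on a multiple of alignof(T) from the start of C.
static_assert(offsetof(C, t2) % alignof(T) == 0,
              "t2 sits at a multiple of alignof(T) inside C");

// The enclosing class is expected to be at least as strictly aligned
// as its member.
static_assert(alignof(C) >= alignof(T),
              "C is at least as strictly aligned as T");
```

If alignment were not a function of the type, the second assertion
is the one that could fail on some target.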

Here are some sentences that I found particularly perplexing:

 --

 Furthermore, the types char, signed char, and unsigned char
 shall have the weakest alignment requirement.

That is? Just 1, no? I was thinking (before reading the
paragraph) that since sizeof( T ) must be a multiple of the
alignment of every object of type T, and since by (a) (if it
holds) the alignment of the type is that of any of its objects,
it was guaranteed that align( char ) == 1, because
sizeof( char ) == 1.
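
As a sketch of that reasoning, assuming the alignof operator of
published C++11 (the helper name and the Mixed struct are
hypothetical, introduced only for illustration), the divisibility
property and its consequence for char can be written as
compile-time checks:

```cpp
// Hypothetical helper: true when sizeof(T) is a whole multiple of
// alignof(T).
template <typename T>
constexpr bool size_is_multiple_of_alignment()
{
    return sizeof(T) % alignof(T) == 0;
}

// Illustrative aggregate with members of different alignments.
struct Mixed { char c; double d; int i; };

// sizeof(char) is 1 by definition; an alignment that divides 1 can
// only be 1.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(alignof(char) == 1, "so the alignment of char is 1");
static_assert(alignof(signed char) == 1 && alignof(unsigned char) == 1,
              "same for the other character types");

static_assert(size_is_multiple_of_alignment<int>(), "int");
static_assert(size_is_multiple_of_alignment<double>(), "double");
static_assert(size_is_multiple_of_alignment<long double>(), "long double");
static_assert(size_is_multiple_of_alignment<Mixed>(), "Mixed");
```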

 --

 An aligment [sic] is an implementation-defined integer value
 representing the number of bytes between successive addresses
 at which a given object can be allocated.

Minimum positive number? (Among other things, if one doesn't
make it (existing and) unique I don't even see how one can use
the definite article "the".)

 <note>
 Note, too, that this definition (or pseudo such) doesn't imply
 that the numerical address is a multiple of the alignment:
 think e.g. of alignment = 4 and the invented addresses 7, 11,
 15 (as opposed to 8, 12, 16).

 One might think that talking of addresses as numbers
 ("multiples of") is problematic in the context of the standard
 specification, but note that the above is basically talking
 about the difference of two arbitrary pointers, which isn't
 defined in general, either.
 </note>
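
The counterexample in the note can be spelled out with plain
integers standing in for the invented addresses. This is only an
arithmetic sketch; the two helper names are hypothetical.

```cpp
// True when a, b, c are consecutive values exactly `step` apart.
constexpr bool spaced_by(unsigned a, unsigned b, unsigned c, unsigned step)
{
    return b - a == step && c - b == step;
}

// True when a, b and c are all multiples of n.
constexpr bool all_multiples_of(unsigned a, unsigned b, unsigned c, unsigned n)
{
    return a % n == 0 && b % n == 0 && c % n == 0;
}

// 7, 11, 15 satisfy "4 bytes between successive addresses"...
static_assert(spaced_by(7, 11, 15, 4), "spacing is 4");
// ...yet none of them is a multiple of 4.
static_assert(!all_multiples_of(7, 11, 15, 4), "not multiples of 4");
// 8, 12, 16 satisfy both readings.
static_assert(spaced_by(8, 12, 16, 4) && all_multiples_of(8, 12, 16, 4),
              "spacing 4 and multiples of 4");
```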


And is it a function of the type or not? alignof is applicable
to a type-id and its description says "An alignof expression
yields the alignment requirement of its operand *type*".

(But why "alignment requirement" rather than just "alignment"?)

Also, consider:

 char c [[ align( 4 ) ]] ;
 static_assert( alignof( c ) == 1, "" ) ; // intentional?

(I think this is OK: the attribute applies to the declaration,
thus to the particular object c, not the type. I'm asking just
because I seem to recall a gcc patch where the author assumed
that alignof worked like their __alignof__. But then, their
__alignof__ may also yield different values for a standalone
double than for a double in a struct, at least on some targets.
Again we are at the "is a function of the type" issue.)
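
For comparison, here is a sketch of the same situation in the
syntax that was eventually published in C++11, where the
attribute became the alignas specifier and alignof accepts only a
type-id (so decltype is used to get at the type of c). The mapping
from pointers to integers is implementation-defined and
std::uintptr_t is optional, so the run-time part is an assumption
about mainstream flat-address targets.

```cpp
#include <cassert>   // assert
#include <cstdint>   // std::uintptr_t (optional in the standard)

// The object is over-aligned; its type is still plain char.
alignas(4) char c;

// alignof reports the requirement of the type, not of this object.
static_assert(alignof(decltype(c)) == 1,
              "the alignment specifier does not change the type");

// Assumes converting a pointer to std::uintptr_t yields the numeric
// address, which is implementation-defined.
bool c_is_4_aligned()
{
    return reinterpret_cast<std::uintptr_t>(&c) % 4 == 0;
}

// Convenience wrapper for a quick run-time check.
void check_c()
{
    assert(c_is_4_aligned());
}
```

So the type-level answer stays 1 while the particular object is
placed more strictly, which is the reading suggested above.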


 --

 Alignments are represented as values of the type std::size_t

That is? I thought they *were* numbers. And, at this stage,
alignof hasn't been introduced yet, so what's the point of
bringing in std::size_t? Aren't we talking of integers in the
mathematical sense?

 --

 A fundamental alignment is represented by an alignment less
 than or equal...

An alignment is represented by an alignment?

Guys, please, consider that we need definitions, here, not
novels. If you have to explain what a fundamental alignment *is*
just say "a fundamental alignment is"; or something like "an
alignment is said to be "fundamental" if and only if...". (Note
that there's a "representing the number of bytes" above, too.
Just a little more acceptable than this one.)

In case you are wondering: yes, these things make me angry. They
waste everyone's time and mental energies.

 --

 Alignments have an order from weaker to stronger or stricter
 alignments. Stricter alignments have larger alignment values.
 An address that satisfies an alignment requirement also
 satisfies any weaker valid alignment requirement.

Again, vagueness. Couldn't you just have said e.g.:

 given two alignments a1 and a2 (a1 > 0, a2 > 0):

   - a1 is said to be weaker than a2 if and only if a1 is a
     proper integer submultiple of a2

   - a1 is said to be stronger, or stricter, than a2 if and
     only if a2 is weaker than a1

About this matter, I also found the following example in
7.6.2/7:

 [Example: An aligned buffer with an alignment requirement of A
 and holding N elements of type T other than char, signed char,
 or unsigned char can be declared as:

 T buffer [[ align(T), align(A) ]] [N];

 Specifying align(T) in the attribute-list ensures that the
 final requested alignment will not be weaker than alignof(T),
 and therefore the program will not be ill-formed. --end example]
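
A sketch of the same buffer in the syntax of published C++11,
where the align attribute was replaced by the alignas specifier.
The concrete T, A and N below are hypothetical values chosen for
illustration, and the run-time check assumes that converting a
pointer to std::uintptr_t yields the numeric address.

```cpp
#include <cassert>   // assert
#include <cstddef>   // std::size_t
#include <cstdint>   // std::uintptr_t (optional in the standard)

typedef int T;                  // hypothetical element type
constexpr std::size_t A = 16;   // hypothetical requested alignment
constexpr std::size_t N = 8;    // hypothetical element count

// With several alignment specifiers the strictest one applies, so
// the result is never weaker than alignof(T).
alignas(T) alignas(A) T buffer[N];

static_assert(sizeof(buffer) == N * sizeof(T),
              "alignment does not change the size of the array");

// Assumes a flat numeric mapping from pointers to integers.
bool buffer_is_aligned()
{
    const std::uintptr_t address = reinterpret_cast<std::uintptr_t>(buffer);
    return address % A == 0 && address % alignof(T) == 0;
}

// Convenience wrapper for a quick run-time check.
void check_buffer()
{
    assert(buffer_is_aligned());
}
```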

I thought that such a thing would require a minimum alignment
that was the lcm of align( T ) and A.

Hmm, I think I found the key: it's /assumed/ that any valid
alignment is a power of 2 with a non-negative integer exponent;
but where is such a requirement?
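
Whether a particular implementation honours that assumption can
be checked mechanically. A minimal sketch, assuming C++11's
alignof and std::max_align_t (the helper name is hypothetical); as
far as I can tell, the published C++11 text did add an explicit
power-of-two sentence to [basic.align], but the checks below do
not depend on that.

```cpp
#include <cstddef>   // std::size_t, std::max_align_t

// Hypothetical helper: true for 1, 2, 4, 8, ...
constexpr bool is_power_of_two(std::size_t a)
{
    return a != 0 && (a & (a - 1)) == 0;
}

static_assert(is_power_of_two(alignof(char)), "char");
static_assert(is_power_of_two(alignof(short)), "short");
static_assert(is_power_of_two(alignof(int)), "int");
static_assert(is_power_of_two(alignof(long long)), "long long");
static_assert(is_power_of_two(alignof(double)), "double");
static_assert(is_power_of_two(alignof(long double)), "long double");
static_assert(is_power_of_two(alignof(void*)), "void*");
static_assert(is_power_of_two(alignof(std::max_align_t)), "max_align_t");

// Sanity checks on the helper itself.
static_assert(!is_power_of_two(0) && !is_power_of_two(6) &&
              !is_power_of_two(12), "non-powers are rejected");
```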

 --

 Valid alignments include only those values returned by an
 alignof expression for the fundamental types plus an
 additional implementation-defined set of values which may be
 empty.

What's the point of this if there's no requirement for the set
to be finite, or to contain PODs only, or to satisfy any
particular property? As I see it, this is just saying that it's
implementation-defined what alignments are valid, and that
alignof shall only yield valid alignments.


A PROPOSED, PROVISIONAL, NEW WORDING
------------------------------------

 Here's some provisional wording which I think solves the
 problems above. With this in place the paragraph about the
 alignment attribute and the alignof operator would only need
 minor tweaks.

 NOTE: Just because of ASCII limitations, I use "!=" for "not
 equal to" and "**" for "raised to".

For each implementation, there exists a mathematical function

 align: S -> V

defined on the set S of all and only the complete types that are
object types but not array types. Its codomain V contains only
powers of two with a non-negative integer exponent.

For every t belonging to S, align(t) is the greatest a=2**k,
with k being a non-negative integer, such that

 - all addresses at which instances of t can be placed are
   exact multiples of a and

 - it's possible for the implementation to place some instances
   of t at an address which is *not* a multiple of 2a.
   [footnote: Thus, for instance, an implementation which
   places all instances of t at addresses that are multiples
   of 8 cannot "lie" and just consider the alignment of the
   type to be 4 on the grounds that any multiple of 8 is also
   a multiple of 4. --endfootnote]

[NOTE: although there doesn't necessarily exist a way for the
program to check whether an address is a multiple of a given
integer, this is intended to be unsurprising to those who know
the addressing structure of the underlying machine. And when an
integral type Int large enough exists, it is intended that
reinterpret_cast< Int >( address ) % n == 0 has the expected
truth value.]
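
The check mentioned in the note can be sketched as follows,
taking std::uintptr_t as the integral type Int. That type is
optional in the standard and the pointer-to-integer mapping is
implementation-defined, so this is an assumption that holds on
mainstream flat-address targets; the function name and the sample
array are hypothetical.

```cpp
#include <cassert>   // assert
#include <cstddef>   // std::size_t
#include <cstdint>   // std::uintptr_t (optional in the standard)

// Hypothetical helper: true when the numeric value of `address` is a
// multiple of n. Precondition: n != 0.
inline bool is_multiple_of(const void* address, std::size_t n)
{
    return reinterpret_cast<std::uintptr_t>(address) % n == 0;
}

// Sample storage aligned to 8, used only to exercise the helper.
alignas(8) char sample[16];

// Convenience wrapper for a quick run-time check.
void check_sample()
{
    assert(is_multiple_of(sample, 8));       // start of the array
    assert(is_multiple_of(sample + 8, 8));   // 8 bytes further on
    assert(!is_multiple_of(sample + 1, 8));  // one byte off
}
```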

Note that, due to the power-of-two requirement, the following
property trivially holds: given two values in V, a1 and a2, a1
is a submultiple of a2 if and only if a1 <= a2; or,
equivalently, if and only if log2(a1) <= log2(a2).

Also, the least common multiple of two alignments is just the
greater of the two.

By definition, an alignment a1 is said to be "stricter" (or
"stronger") than a2 if and only if a2 != a1 and a2 is a
submultiple of a1.

Likewise, by definition, a1 is said to be "weaker" than a2 <=>
a2 is stricter than a1.
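
These properties are easy to restate as code. A minimal sketch in
C++11 constexpr functions (all names are hypothetical); the last
assertion shows why the power-of-two requirement matters, since
without it the least common multiple and the maximum come apart.

```cpp
#include <cstddef>   // std::size_t

constexpr std::size_t gcd_of(std::size_t a, std::size_t b)
{
    return b == 0 ? a : gcd_of(b, a % b);
}

constexpr std::size_t lcm_of(std::size_t a, std::size_t b)
{
    return a / gcd_of(a, b) * b;
}

constexpr std::size_t max_of(std::size_t a, std::size_t b)
{
    return a < b ? b : a;
}

// a1 is stricter than a2 iff a2 != a1 and a2 is a submultiple of a1.
constexpr bool is_stricter(std::size_t a1, std::size_t a2)
{
    return a1 != a2 && a1 % a2 == 0;
}

// a1 is weaker than a2 iff a2 is stricter than a1.
constexpr bool is_weaker(std::size_t a1, std::size_t a2)
{
    return is_stricter(a2, a1);
}

// For powers of two, "submultiple of" coincides with "<=" and the
// least common multiple is the greater value.
static_assert(is_stricter(16, 4) && is_weaker(4, 16), "16 is stricter than 4");
static_assert(!is_stricter(8, 8) && !is_weaker(8, 8), "neither, when equal");
static_assert(lcm_of(4, 16) == max_of(4, 16), "lcm is the maximum");

// With non-powers of two this breaks down: lcm(4, 6) is 12, not 6.
static_assert(lcm_of(4, 6) == 12 && max_of(4, 6) == 6, "lcm != max");
```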

Let t0 be a type in the domain of align and arr an array
thereof, with at least two elements: since two consecutive
elements of arr each have an address that is a multiple of
align(t0), the difference between those addresses (that of the
later element minus that of the earlier one), which is
sizeof(t0), is a multiple of align(t0), too. That is:

 - for any type t in S, align(t) is a submultiple of sizeof(t).

In particular, align( char ) is 1.
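
The array argument can be observed directly. A minimal sketch,
assuming C++11's alignof in place of the align function of the
proposed wording (the function name is hypothetical):

```cpp
#include <cassert>   // assert
#include <cstddef>   // std::size_t

// Hypothetical helper: checks that two consecutive array elements are
// exactly sizeof(T) bytes apart and that this distance is a multiple
// of alignof(T). T must be default-constructible.
template <typename T>
bool elements_are_sizeof_apart()
{
    T arr[2];
    const char* first  = reinterpret_cast<const char*>(&arr[0]);
    const char* second = reinterpret_cast<const char*>(&arr[1]);
    const std::size_t distance = static_cast<std::size_t>(second - first);
    return distance == sizeof(T) && distance % alignof(T) == 0;
}

struct Pair { char c; double d; };   // illustrative aggregate

// Convenience wrapper for a quick run-time check.
void check_arrays()
{
    assert(elements_are_sizeof_apart<char>());
    assert(elements_are_sizeof_apart<int>());
    assert(elements_are_sizeof_apart<double>());
    assert(elements_are_sizeof_apart<Pair>());
}
```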

--
 Gennaro Prota         |           name.surname yahoo.com
   Breeze C++ (preview): <https://sourceforge.net/projects/breeze/>
   Do you need expertise in C++?   I'm available.

