Topic: portable method of determining endianness byte order.


Author: "Bill Wade" <bill.wade@stoner.com>
Date: 1999/12/14
David R Tribble wrote in message <384FE344.170E84DE@tribble.com>...
>Perhaps it would be reasonable to suggest a set of functions for
>the standard library that would allow extracting native integers
>from a network-ordered array of bytes and vice versa.  (This could
>also be accomplished as a set of iostream inserter/extractor
>functions.)  Making them part of the standard would be more of a
>guarantee that they would work correctly (no matter how they are
>implemented underneath) than the current state of affairs where
>everyone reimplements them in different ways.

There is a standard set of C functions for this, defined by XPG4 (X-Open).
Of course they only work in an environment which is very restrictive about
data types (or more precisely ranges of data values).
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]





Author: David R Tribble <david@tribble.com>
Date: 1999/12/16
Bill Wade wrote:
>
> David R Tribble wrote in message <384FE344.170E84DE@tribble.com>...
>> Perhaps it would be reasonable to suggest a set of functions for
>> the standard library that would allow extracting native integers
>> from a network-ordered array of bytes and vice versa.  (This could
>> also be accomplished as a set of iostream inserter/extractor
>> functions.)  Making them part of the standard would be more of a
>> guarantee that they would work correctly (no matter how they are
>> implemented underneath) than the current state of affairs where
>> everyone reimplements them in different ways.
>
> There is a standard set of C functions for this, defined by XPG4
> (X-Open).  Of course they only work in an environment which is very
> restrictive about data types (or more precisely ranges of data
> values).

Well, by "standard" I meant "ISO C or C++ standard library".

POSIX has some truly wonderful things in it, but not every C/C++
platform provides them.

-- David R. Tribble, david@tribble.com, http://david.tribble.com --








Author: David R Tribble <david@tribble.com>
Date: 1999/12/11
David R Tribble <david@tribble.com> wrote:
>> But to answer the original question (can endianness be determined
>> at compile time?), the answer is: No, unless you run a program
>> (like the one above) to create a header file; such a header file
>> then contains constants that will tell the compiler, at compile time,
>> integer endianness.
>>
>> It would be nice if <limits.h> contained a few macros like those
>> above, so that programs could indeed know at compile time things
>> like endianness.  Alas.

C. M. Heard wrote:
> I do not agree.  I have worked with networking code that overlays
> structures onto byte arrays and uses byte-swapping macros (the
> infamous ntohl, ntohs, htonl, and htons in sys/byteorder.h on many
> unix systems) to convert between the external and internal
> representation.  I have also written networking code that
> uses explicit shifting and masking to do the same thing.  I've found
> the latter approach to be much more reliable, in the sense that the
> code is much more likely to be written correctly the first time --
> it's very, very easy to forget to invoke a byte-swapping macro in a
> seldom-used code path.  I consider byte-swapping macros to be a
> major maintenance headache, and I'm glad that they are not part
> of the standard.

Perhaps it would be reasonable to suggest a set of functions for
the standard library that would allow extracting native integers
from a network-ordered array of bytes and vice versa.  (This could
also be accomplished as a set of iostream inserter/extractor
functions.)  Making them part of the standard would be more of a
guarantee that they would work correctly (no matter how they are
implemented underneath) than the current state of affairs where
everyone reimplements them in different ways.

-- David R. Tribble, david@tribble.com, http://david.tribble.com --
---





Author: Bill Wade <bill.wade@stoner.com>
Date: 1999/12/06
Steve Clamage wrote in message <384588BB.E3DB6EC9@sun.com>...

>I don't think you can write a C++ program having defined behavior that
>can tell what byte order is used in physical memory.

In the same sense that you can't tell if your machine uses virtual memory, I
agree with you.  However by byte order I am talking about observable
behavior.

If you establish that integer types have no padding (which you can do by
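computations on limits.h values) you can write something like:

#include <limits.h>
#include <stddef.h>

bool ShortIsLittleEndian()
{
    unsigned short s;
    unsigned char* c = (unsigned char*)&s;

    // Check every nonzero value: byte i must hold the i-th lowest group
    // of CHAR_BIT bits.
    for(s = -1; s; --s)
    {
        unsigned short t = s;
        for(size_t i = 0; i < sizeof(s); ++i)
        {
            // Parentheses are required: != binds tighter than &.
            if((t & UCHAR_MAX) != c[i])
                return false;
            t >>= CHAR_BIT;
        }
    }
    return true;
}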

Of course if short is a 128-bit type this will take a while to run.
Naturally !little_endian does not necessarily imply big_endian.
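The "no padding" precondition mentioned above can be checked along these lines. This is only a sketch: the function names are invented for illustration, and it relies on USHRT_MAX having the form 2^N - 1, so that counting its bits gives the number of value bits.

```cpp
#include <climits>
#include <cstddef>

// Counts the value bits of unsigned short by shifting USHRT_MAX down to zero.
inline std::size_t ushort_value_bits() {
    std::size_t bits = 0;
    for (unsigned long m = USHRT_MAX; m != 0; m >>= 1)
        ++bits;
    return bits;
}

// True when every bit of the object representation is a value bit,
// i.e. unsigned short has no padding bits.
inline bool ushort_has_no_padding() {
    return ushort_value_bits() == sizeof(unsigned short) * CHAR_BIT;
}
```

If ushort_has_no_padding() returns false, a byte-by-byte comparison such as the one in ShortIsLittleEndian() says nothing reliable about byte order.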
---





Author: David R Tribble <david@tribble.com>
Date: 1999/12/06
Hyman Rosen wrote:
>
> Francis Glassborow <francis@robinton.demon.co.uk> writes:
>> Because no such method exists.  And why do you assume that it is a
>> simple two way choice?
>
> If I recall correctly, the VAX had a mixed model for its 32-bit
> integers. I think the order for 0x12345678 was 34 12 78 56.

No, you're thinking of its predecessor, the PDP-11.
VAX integers (32 bits) were little-endian (e.g., 0x11223344 was
<0x44,0x33,0x22,0x11>), while PDP-11 long ints (32 bits) were
mixed-endian (e.g., 0x11223344 was <0x22,0x11,0x44,0x33>).
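The two layouts can be reproduced on any host with shifts and masks. The sketch below is illustrative only: bytes4, layout_little and layout_pdp are made-up names, and 8-bit bytes are assumed.

```cpp
#include <array>
#include <cstdint>

using bytes4 = std::array<unsigned char, 4>;

// Little-endian (VAX-style) layout: least-significant byte first.
inline bytes4 layout_little(std::uint32_t v) {
    return {{ static_cast<unsigned char>(v & 0xFF),
              static_cast<unsigned char>((v >> 8) & 0xFF),
              static_cast<unsigned char>((v >> 16) & 0xFF),
              static_cast<unsigned char>((v >> 24) & 0xFF) }};
}

// Mixed (PDP-11-style) layout: high 16-bit word first, but each word
// stored low byte first.
inline bytes4 layout_pdp(std::uint32_t v) {
    const std::uint32_t hi = (v >> 16) & 0xFFFFu;
    const std::uint32_t lo = v & 0xFFFFu;
    return {{ static_cast<unsigned char>(hi & 0xFF),
              static_cast<unsigned char>(hi >> 8),
              static_cast<unsigned char>(lo & 0xFF),
              static_cast<unsigned char>(lo >> 8) }};
}
```

For 0x11223344, layout_little yields 44 33 22 11 and layout_pdp yields 22 11 44 33, matching the two examples given above.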

-- David R. Tribble, david@tribble.com, http://david.tribble.com --
---





Author: David R Tribble <david@tribble.com>
Date: 1999/12/06
Steve Clamage wrote:
>
> Bill Wade wrote:
>>
>> Francis Glassborow wrote in message ...
>> > ... why do you assume that [integer endianness] is a
>> >simple two way choice?
>>
>> Maybe the original poster read footnote 44 (associated with
>> 3.9.1/7)
>> which strongly implies that adjacent bits represent adjacent powers
>> of two.  It takes some imagination to suppose that two bits in
>> separate bytes can be adjacent to each other, if their bytes aren't
>> adjacent to each other.  It would seem that the least significant
>> byte of an integer must be at one end, and the most significant at
>> the other, with any intermediate bytes in a uniform order.  Thus a
>> simple two way choice.
>
> I don't think you can write a C++ program having defined behavior that
> can tell what byte order is used in physical memory. That is, the
> requirement on a "pure binary representation" affects the observable
> behavior of operations like shifting and masking. For example, you are
> guaranteed that for an int k, 0<=k<=(INT_MAX/2) implies (k*2)==(k<<1),
> and that (0x12345678 & 0x0000FF00)==0x5600.
>
> I don't think there are any guarantees in the standard about what
> happens when you extract arbitrary bytes from an object in memory,
> however. For example, if you write
>         int k = 0x12345678; // 32-bit int, 8-bit bytes
>         unsigned char* p = (unsigned char*)&k + 2;
>         cout << hex << (unsigned)*p; // cast so the byte prints as a number
> the operations are valid in the sense that no undefined behavior is
> involved, but I don't think you can expect to see 34 (little-endian)
> or 56 (big-endian) as the output.

I believe that the following program, while only portable to
systems where sizeof(long)==4, is well-behaved:

    #include <limits.h>
    #include <stdio.h>

    union U
    {
        unsigned long   i;
        unsigned char   c[4];
    };

    int main()
    {
        union U   u;

        if (sizeof(long) != 4)
        {
            printf("Sorry, wrong long int size\n");
            return 1;
        }

        //u.i = 0x01020304L;
        u.i = (0x01L << sizeof(long)*CHAR_BIT*3/4) +
              (0x02L << sizeof(long)*CHAR_BIT*2/4) +
              (0x03L << sizeof(long)*CHAR_BIT*1/4) +
              (0x04L << sizeof(long)*CHAR_BIT*0/4);

        if (u.c[0] == 0x01  or  u.c[0] == 0x02)
            printf("#define ORD_WORD_BIG    1\n");
        else
            printf("#define ORD_WORD_BIG    0\n");

        if (u.c[0] == 0x01  or  u.c[0] == 0x03)
            printf("#define ORD_BYTE_BIG    1\n");
        else
            printf("#define ORD_BYTE_BIG    0\n");

        return 0;
    }

The only thing it does that comes close to being "undefined" is
to access the first byte of an unsigned long as an unsigned char
(via a union).  But if I'm not mistaken, the C++ rules of unsigned
binary representation make this fairly well-behaved.

But to answer the original question (can endianness be determined
at compile time?), the answer is: No, unless you run a program
(like the one above) to create a header file; such a header file
then contains constants that will tell the compiler, at compile time,
integer endianness.

It would be nice if <limits.h> contained a few macros like those
above, so that programs could indeed know at compile time things
like endianness.  Alas.

-- David R. Tribble, david@tribble.com, http://david.tribble.com --
---





Author: Thomas Matelich <tmatelich@zetec.com>
Date: 1999/12/06
Francis Glassborow wrote:

> In article <t73dtnacpa.fsf@calumny.jyacc.com>, Hyman Rosen
> <hymie@prolifics.com> writes
> >> Pukalo Boyd 810-492-3661 wrote:
> >> > I have been trying to find a portable method of determining the
> >> > endianness byte order at compile time
> >
> >>    c = (char*)&n;
> >>    byte_ordering = (!*c) ? BIG_ENDIAN : LITTLE_ENDIAN;
> >
> >This does not happen at compile time, but at run time,
> >so you have not answered the question.
>
> Because no such method exists.  And why do you assume that it is a
> simple two way choice?  (after all those of you the wrong side of the
> Atlantic/Pacific write your dates in an illogical ordering, neither day
> first nor last.  But it has just struck me that as you sit between Japan
> with its entirely logical year, month, day and Europe with its inverted
> day, month, year perhaps it makes sense to do it different from both:)
>

I did not limit myself to a two way choice, which is why I did not use a
bool.  However, there are only two orderings I am concerned about at the
time.  All my code using this class is designed to allow other byte
orderings to be added should the need arise.

I do agree that our dates are illogical, though.  I am trying to decide if
it is just cultural bias that I think Japan's method is inverted.  I use
Europe's personally.  At least our driving lanes aren't inverted (kidding :)

--
Thomas O Matelich
Senior Software Designer
Zetec, Inc.
sosedada@usa.net
tmatelich@zetec.com
---





Author: "James Kuyper Jr." <kuyper@wizard.net>
Date: 1999/12/06
Bill Wade wrote:
....
> Maybe the original poster read footnote 44 (associated with 3.9.1/7)
> which strongly implies that adjacent bits represent adjacent powers of two.
> It takes some imagination to suppose that two bits in separate bytes can be
> adjacent to each other, if their bytes aren't adjacent to each other.  It

The amount of imagination required is not large enough to prevent real
computers from being built with hardware implementing such ordering. 32
bits, if divided into 4 groups of 8 consecutive bits each, can be assigned
to 4 8-bit bytes in 24 different orders. I know of at least 4 of those
orders that are actually in use, and I've heard that two others are also
in use. 1234 and 4321 are by far the most common, but 2143 and 3412 are
not uncommon.

The standard doesn't define successive bits. That leaves the implementor
free to do so.
---





Author: Hyman Rosen <hymie@prolifics.com>
Date: 1999/12/07
David R Tribble <david@tribble.com> writes:
> No, you're thinking of its predecessor, the PDP-11.
> VAX integers (32 bits) were little-endian (e.g., 0x11223344 was
> <0x44,0x33,0x22,0x11>), while PDP-11 long ints (32 bits) were
> mixed-endian (e.g., 0x11223344 was <0x22,0x11,0x44,0x33>).

Sounds right. When I first started programming in C, it was on a
PDP-11/45, using a compiler which didn't implement longs or unions,
and which represented structure fields as absolute offsets, so that
no two structures could have the same field names unless the fields
were at the same offset in each structure. That was around 1979.








Author: "C. M. Heard/VVNET, Inc." <heard@vvnet.com>
Date: 1999/12/07
David R Tribble <david@tribble.com> wrote:
>But to answer the original question (can endianness be determined
>at compile time?), the answer is: No, unless you run a program
>(like the one above) to create a header file; such a header file
>then contains constants that will tell the compiler, at compile time,
>integer endianness.
>
>It would be nice if <limits.h> contained a few macros like those
>above, so that programs could indeed know at compile time things
>like endianness.  Alas.

I do not agree.  I have worked with networking code that overlays
structures onto byte arrays and uses byte-swapping macros (the
infamous ntohl, ntohs, htonl, and htons in sys/byteorder.h on many
unix systems) to convert between the external and internal
representation.  I have also written networking code that
uses explicit shifting and masking to do the same thing.  I've found
the latter approach to be much more reliable, in the sense that the
code is much more likely to be written correctly the first time --
it's very, very easy to forget to invoke a byte-swapping macro in a
seldom-used code path.  I consider byte-swapping macros to be a
major maintenance headache, and I'm glad that they are not part
of the standard.
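A minimal sketch of the explicit shifting-and-masking style, assuming 8-bit bytes; the helper names get_be16, get_be32 and put_be32 are invented for this example. The functions work on the raw byte array directly, so no structure is overlaid and no separate swap step exists to be forgotten.

```cpp
#include <cstdint>

// Reads a 16-bit network-order (big-endian) field starting at p.
inline std::uint16_t get_be16(const unsigned char* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Reads a 32-bit network-order field starting at p.
inline std::uint32_t get_be32(const unsigned char* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}

// Writes v at p in network order, most significant byte first.
inline void put_be32(unsigned char* p, std::uint32_t v) {
    p[0] = static_cast<unsigned char>((v >> 24) & 0xFF);
    p[1] = static_cast<unsigned char>((v >> 16) & 0xFF);
    p[2] = static_cast<unsigned char>((v >> 8) & 0xFF);
    p[3] = static_cast<unsigned char>(v & 0xFF);
}
```

The same source gives the same result on big-endian, little-endian and mixed-endian hosts, because it never looks at how the host stores its integers.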

Mike
--
C. M. Heard/VVNET, Inc.
heard@vvnet.com
---





Author: "AllanW" <AllanW@my-deja.com>
Date: 1999/12/07
Steve Clamage <stephen.clamage@sun.com> wrote in message
news:38455F22.F7244F42@sun.com...
>
> Hyman Rosen wrote:
> >
> > Francis Glassborow <francis@robinton.demon.co.uk> writes:
> > > Because no such method exists.  And why do you assume that it is a
> > > simple two way choice?
> >
> > If I recall correctly, the VAX had a mixed model for its 32-bit
> > integers. I think the order for 0x12345678 was 34 12 78 56.
> >
>
> No, integers on the VAX were strictly little-endian, but
> floating-point numbers were neither big- nor little-endian.
> The most significant byte was in the middle of the storage
> area in memory. When you transferred a floating-point value
> between memory and a register, the bytes were arranged into
> the appropriate order.
>
> I did work on an idiosyncratic 16-bit system where 16-bit
> values were little-endian and 32-bit values were stored
> in the order you show above.

You're both right. When the VAX executed code in PDP-11 compatibility mode,
it used 16-bit registers. The RSX-11 system calls (which are emulated in
VMS) used 32-bit values in two 16-bit words, most-significant word first,
but each word was stored with the least-significant byte first. The result
was the order shown above.

This is also the run-time answer to the original question: on systems with
32-bit longs and sizeof(long)==4, there are 4*3*2*1=24 possible "endian"
arrangements, although I suspect that all but three are quite rare. So you
can use:
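    #include <assert.h>

    enum endian {
        big_endian = 0x4321,
        little_endian = 0x5678,
        rsx_endian = 0x3412,
        unknown_endian = 0
    };
    endian find_endian() {
        assert(4 == sizeof(long));
        union {
            long l;
            unsigned char c[4];
        } u = { 0x12345678 };   // initializes the first member, l
        switch (u.c[0]) {
            case 0x12: return big_endian;     // most significant byte first
            case 0x34: return rsx_endian;     // 34 12 78 56
            case 0x78: return little_endian;  // least significant byte first
            default: assert(false);           // Special case not handled
        }
        return unknown_endian;  // reached only when assertions are disabled
    }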

As for doing the same thing at compile-time: This is dangerous, because
there's no guarantee that the compile-time environment even matches the
run-time environment. But if you want to make this dangerous assumption, you
might get by with:
    #if ('ABCD'&0xFF)=='A'
        // big-endian
    #elif ('ABCD'&0xFF)=='B'
        // rsx-endian
    #elif ('ABCD'&0xFF)=='D'
        // little-endian
    #else
    #error Machine not recognized
    #endif

I say "Might" because we're layering undefined upon undefined -- even if a
compiler recognizes multiple-character constants (and not all do), that
doesn't mean that the preprocessor does, and even if it does, will it support
implicit cast to int, and will it do so the same way that the run-time
system does?

You're better off relying on compiler-specific macros, such as _MSC_VER for
Microsoft or __BORLANDC__ for Borland. This does mean that each new port
requires slight changes to a
"configuration" header to detect the new platform. In practice this
shouldn't be a killer.
---





Author: Ron Natalie <ron@sensor.com>
Date: 1999/12/07
Hyman Rosen wrote:
>
> David R Tribble <david@tribble.com> writes:
> > No, you're thinking of its predecessor, the PDP-11.
> > VAX integers (32 bits) were little-endian (e.g., 0x11223344 was
> > <0x44,0x33,0x22,0x11>), while PDP-11 long ints (32 bits) were
> > mixed-endian (e.g., 0x11223344 was <0x22,0x11,0x44,0x33>).
>
> Sounds right. When I first started programming in C, it was on a
> PDP-11/45, using a compiler which didn't implement longs or unions,
> and which represented structure fields as absolute offsets, so that
> no two structures could have the same field names unless the fields
> were at the same offset in each structure. That was around 1979.
>

Ah yes, those were the days.  That's why most system code from
UNIX uses structure members with a type-specific prefix (as in u_uid)
at the beginning, as the member names needed to be unique.

I still remember the following construct in the kernel:

#define PS 0177776

struct {
   int integ;
}

   if(PS->integ) ...


The problem is that the PDP-11 didn't really support 32 bit math.
The implementation put the high order word of a long into the second
place word in memory because that way it would be compatible with
16 bit values when sloppily mixed.

-Ron
---





Author: Hyman Rosen <hymie@prolifics.com>
Date: 1999/12/01
Francis Glassborow <francis@robinton.demon.co.uk> writes:
> Because no such method exists.  And why do you assume that it is a
> simple two way choice?

If I recall correctly, the VAX had a mixed model for its 32-bit
integers. I think the order for 0x12345678 was 34 12 78 56.








Author: "Bill Wade" <bill.wade@stoner.com>
Date: 1999/12/01

Francis Glassborow wrote in message ...
> ... why do you assume that [integer endianness] is a
>simple two way choice?

Maybe the original poster read footnote 44 (associated with 3.9.1/7)
which strongly implies that adjacent bits represent adjacent powers of two.
It takes some imagination to suppose that two bits in separate bytes can be
adjacent to each other, if their bytes aren't adjacent to each other.  It
would seem that the least significant byte of an integer must be at one end,
and the most significant at the other, with any intermediate bytes in a
uniform order.  Thus a simple two way choice.

I suppose an integer type could have a byte ordering of 2341 if sizeof(type)
== sizeof(address space) and addresses wrap around ;-).

Being paranoid, I wouldn't count on all vendors observing that footnote (or
reading it with my interpretation).  The actual footnote uses "successive"
rather than "adjacent."  I suppose a vendor (like a culture) might use
strange rules of succession.









Author: Steve Clamage <stephen.clamage@sun.com>
Date: 1999/12/01
Hyman Rosen wrote:
>
> Francis Glassborow <francis@robinton.demon.co.uk> writes:
> > Because no such method exists.  And why do you assume that it is a
> > simple two way choice?
>
> If I recall correctly, the VAX had a mixed model for its 32-bit
> integers. I think the order for 0x12345678 was 34 12 78 56.
>

No, integers on the VAX were strictly little-endian, but
floating-point numbers were neither big- nor little-endian.
The most significant byte was in the middle of the storage
area in memory. When you transferred a floating-point value
between memory and a register, the bytes were arranged into
the appropriate order.

I did work on an idiosyncratic 16-bit system where 16-bit
values were little-endian and 32-bit values were stored
in the order you show above.

--
Steve Clamage, stephen.clamage@sun.com








Author: Steve Clamage <stephen.clamage@sun.com>
Date: 1999/12/01
Bill Wade wrote:
>
> Francis Glassborow wrote in message ...
> > ... why do you assume that [integer endianness] is a
> >simple two way choice?
>
> Maybe the original poster read footnote 44 (associated with 3.9.1/7)
> which strongly implies that adjacent bits represent adjacent powers of two.
> It takes some imagination to suppose that two bits in separate bytes can be
> adjacent to each other, if their bytes aren't adjacent to each other.  It
> would seem that the least significant byte of an integer must be at one end,
> and the most significant at the other, with any intermediate bytes in a
> uniform order.  Thus a simple two way choice.

I don't think you can write a C++ program having defined behavior that
can tell what byte order is used in physical memory. That is, the
requirement on a "pure binary representation" affects the observable
behavior of operations like shifting and masking. For example, you are
guaranteed that for an int k, 0<=k<=(INT_MAX/2) implies (k*2)==(k<<1),
and that (0x12345678 & 0x0000FF00)==0x5600.

I don't think there are any guarantees in the standard about what
happens when you extract arbitrary bytes from an object in memory,
however. For example, if you write
 int k = 0x12345678; // 32-bit int, 8-bit bytes
 unsigned char* p = (unsigned char*)&k + 2;
 cout << hex << (unsigned)*p; // cast so the byte prints as a number
the operations are valid in the sense that no undefined behavior is
involved, but I don't think you can expect to see 34 (little-endian)
or 56 (big-endian) as the output.
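To look at all the bytes of an object rather than a single one, something like the following sketch can be used. The name hex_bytes_of is invented for this example and 8-bit bytes are assumed. Each byte is converted to unsigned before formatting, since an unsigned char sent straight to a stream prints as a character.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Returns the object representation of v as two-digit hex pairs in address
// order, separated by spaces.  What comes out for a given value depends on
// the implementation's byte order.
inline std::string hex_bytes_of(int v) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&v);
    std::string out;
    for (std::size_t i = 0; i < sizeof v; ++i) {
        char buf[8];
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(p[i]));
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}
```

Reading an object through an unsigned char pointer is permitted; what the standard does not promise is which value appears at which offset.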

--
Steve Clamage, stephen.clamage@sun.com








Author: Francis Glassborow <francis@robinton.demon.co.uk>
Date: 1999/12/01
In article <823p8g$6gj@library1.airnews.net>, Bill Wade
<bill.wade@stoner.com> writes
>Maybe the original poster read footnote 44 (associated with 3.9.1/7)
>which strongly implies that adjacent bits represent adjacent powers of two.
>It takes some imagination to suppose that two bits in separate bytes can be
>adjacent to each other, if their bytes aren't adjacent to each other.  It
>would seem that the least significant byte of an integer must be at one end,
>and the most significant at the other, with any intermediate bytes in a
>uniform order.  Thus a simple two way choice.
You are confusing logical organisation of bits (as used for shifts etc.)
with the organisation of bytes (which can be detected with techniques
such as unions).  There is nothing that says that adjacent bytes
represent adjacent powers of 256 (512 or whatever else constitutes a
byte on your hardware)

Francis Glassborow      Journal Editor, Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA          +44(0)1865 246490
All opinions are mine and do not represent those of any organisation








Author: "Bill Wade" <bill.wade@stoner.com>
Date: 1999/12/02

Francis Glassborow wrote in message ...

>You are confusing logical organisation of bits (as used for shifts etc.)
>with the organisation of bytes (which can be detected with techniques
>such as unions).  There is nothing that says that adjacent bytes
>represent adjacent powers of 256 (512 or whatever else constitutes a
>byte on your hardware)

If the standard requires adjacent bits to represent adjacent powers of two,
it would seem that it requires bits separated by CHAR_BIT to represent
values which differ by 1<<CHAR_BIT.  That would imply that a bit pattern in
an eight-bit unsigned char (within a larger integer type) must represent a
value which differs by a factor of 256 from the value of the same bit
pattern in an adjacent char.

Now the standard uses the word "successive" rather than "adjacent", and that
is only in a footnote.  If I ever have to write a compiler for a machine
that has hardware support for a byte order like 4132, I expect I'll take
advantage of the hardware.











Author: Hyman Rosen <hymie@prolifics.com>
Date: 1999/11/30
Thomas Matelich <tmatelich@zetec.com> writes:
> Pukalo Boyd 810-492-3661 wrote:
> > I have been trying to find a portable method of determining the
> > endianness byte order at compile time

>    c = (char*)&n;
>    byte_ordering = (!*c) ? BIG_ENDIAN : LITTLE_ENDIAN;

This does not happen at compile time, but at run time,
so you have not answered the question.








Author: Thomas Matelich <tmatelich@zetec.com>
Date: 1999/12/01
Hyman Rosen wrote:

> Thomas Matelich <tmatelich@zetec.com> writes:
> > Pukalo Boyd 810-492-3661 wrote:
> > > I have been trying to find a portable method of determining the
> > > endianness byte order at compile time
>
> >    c = (char*)&n;
> >    byte_ordering = (!*c) ? BIG_ENDIAN : LITTLE_ENDIAN;
>
> This does not happen at compile time, but at run time,
> so you have not answered the question.

However, in the next message by me, I stated this myself.


--
Thomas O Matelich
Senior Software Designer
Zetec, Inc.
sosedada@usa.net
tmatelich@zetec.com










Author: Francis Glassborow <francis@robinton.demon.co.uk>
Date: 1999/12/01
In article <t73dtnacpa.fsf@calumny.jyacc.com>, Hyman Rosen
<hymie@prolifics.com> writes
>> Pukalo Boyd 810-492-3661 wrote:
>> > I have been trying to find a portable method of determining the
>> > endianness byte order at compile time
>
>>    c = (char*)&n;
>>    byte_ordering = (!*c) ? BIG_ENDIAN : LITTLE_ENDIAN;
>
>This does not happen at compile time, but at run time,
>so you have not answered the question.

Because no such method exists.  And why do you assume that it is a
simple two-way choice?  (After all, those of you on the wrong side of the
Atlantic/Pacific write your dates in an illogical ordering, neither day
first nor last.  But it has just struck me that, as you sit between Japan
with its entirely logical year, month, day and Europe with its inverted
day, month, year, perhaps it makes sense to do it differently from both :)




Francis Glassborow      Journal Editor, Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA          +44(0)1865 246490
All opinions are mine and do not represent those of any organisation








Author: AllanW <allan_w@my-deja.com>
Date: 1999/11/23
Raw View
In article <814kpp$j6j@library1.airnews.net>,
  "Bill Wade" <bill.wade@stoner.com> wrote:
>
>
> Pukalo Boyd 810-492-3661 wrote in message
> <38329967.470B6CED@hqs.mid.gmeds.com>...
> >I have been trying to find a portable method of determining the
> >endianness byte order at compile time
> >to produce code that will work properly on both windows and unix
> >machines. I have searched the documentation
> >i have ( the books on the C Standard Library and the C++ Standard
> >Library) and havent found anything. Please help. Any pointers or
> >suggestions would be very helpful. Thanks.
>
> There is no standard way.  In many cases tests like
>   (('abcd' & 0xff) == 'd')
> can be used to successfully distinguish endianness at compile time.

Presumably you mean to use this in preprocessor directives:

    #if (('abcd' & 0xFF) == 'd')
      // ...
    #endif

However, this won't work. First off, many systems don't
handle multiple-character char literals like 'abcd'.
Second, there is no requirement that the host compiler
and the target system have the same endianness. To say
that another way, there is no requirement that the
preprocessor and the compiler treat this expression the
same way.

To do this you must use one of two different techniques:

    * Have at least one header file that has specific versions
      for each OS and compiler that you use. Part of copying
      over the source code to the next compiler would involve
      creating this header file which is used throughout the
      system. For instance, when copying the files to a system
      which will use the Microsoft compiler, you would also
      copy file ENVIRON_WINDOWS_MICROSOFT.H to ENVIRON.H.
      However, if you were porting it to a unix box with GCC
      then you would instead copy file ENVIRON_UNIX_GCC.H to
      ENVIRON.H.

      In the ENVIRON_WINDOWS_MICROSOFT.H file, you would have:
          // Microsoft C or C++ running on Intel platform
          const int LOWESTBYTE =0;
          const int LOWERBYTE  =1;
          const int HIGHERBYTE =2;
          const int HIGHESTBYTE=3;
          // Four-byte signed integer type
          typedef long INT4;
          // ... etc ...
      Other systems would have identical statements except that
      the values might be different.

    * Another way is to have just one file named ENVIRON.H,
      but with specific sections for each compiler. This
      relies on knowledge of compiler-specific pre-defined
      symbols. For instance, all versions of Microsoft C or
      C++ define the symbol _MSC_VER, so you could
      have a section like this:
          #ifdef _MSC_VER
          // Microsoft C or C++ running on Intel platform
          const int LOWESTBYTE =0;
          const int LOWERBYTE  =1;
          const int HIGHERBYTE =2;
          const int HIGHESTBYTE=3;
          // Four-byte signed integer type
          typedef long INT4;
          // ... etc ...
          #endif

      You would have one section like this for each compiler
      that you support; you may have to dig through the
      documentation to find some pre-defined symbol (or
      combination of symbols) for each compiler. If necessary,
      you could change the build commands to add additional
      symbols; all compilers that I know of have a command-
      line switch such as /DXXX to define symbol XXX. You
      could use this to define a symbol on one platform, if
      you couldn't find any helpful symbols created by the
      compiler.
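
As a rough sketch of the single-header approach: GCC and Clang predefine
`__BYTE_ORDER__` together with `__ORDER_LITTLE_ENDIAN__` and
`__ORDER_BIG_ENDIAN__`, and Microsoft's compilers predefine `_MSC_VER`.
The constant names follow the post. Treating every `_MSC_VER` target as
little-endian is an assumption, as is the 4-byte `unsigned int`, and
`environ_matches_host` is a hypothetical helper name, not something from
the thread.

```cpp
// ENVIRON.H (sketch): choose byte positions from compiler-predefined symbols.
#include <cassert>
#include <cstring>

#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  // GCC or Clang, little-endian target
  const int LOWESTBYTE  = 0;
  const int LOWERBYTE   = 1;
  const int HIGHERBYTE  = 2;
  const int HIGHESTBYTE = 3;
#elif defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  // GCC or Clang, big-endian target
  const int LOWESTBYTE  = 3;
  const int LOWERBYTE   = 2;
  const int HIGHERBYTE  = 1;
  const int HIGHESTBYTE = 0;
#elif defined(_MSC_VER)
  // Microsoft compiler; ASSUMPTION: the target is little-endian
  const int LOWESTBYTE  = 0;
  const int LOWERBYTE   = 1;
  const int HIGHERBYTE  = 2;
  const int HIGHESTBYTE = 3;
#else
  #error "Unknown compiler: add a section for it to this header"
#endif

// Four-byte unsigned integer type; ASSUMPTION: unsigned int is 4 bytes here.
typedef unsigned int UINT4;

// Hypothetical helper: returns true when the constants chosen above agree
// with the byte layout of the machine the program is actually running on.
inline bool environ_matches_host() {
    if (sizeof(UINT4) != 4) return false;
    const UINT4 value = 0x01020304u;
    unsigned char bytes[sizeof(UINT4)];
    std::memcpy(bytes, &value, sizeof value);
    return bytes[LOWESTBYTE]  == 0x04 &&
           bytes[LOWERBYTE]   == 0x03 &&
           bytes[HIGHERBYTE]  == 0x02 &&
           bytes[HIGHESTBYTE] == 0x01;
}
```

The `#error` branch keeps the "dig through the documentation" step explicit:
an unrecognised compiler stops the build instead of silently guessing.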

With each of these techniques, it's advisable to add an
assert in main() to make sure that it's set correctly.
Since the assert will only be tested once, the run-time
penalty is approximately zero:

    #include <assert.h>   // assert
    #include <stdlib.h>   // exit

    void assert_endian() {
        union {
            char c[sizeof(INT4)];
            INT4 l;
        } u;
        u.l = 0x01020304;
        if (    sizeof(INT4) != 4    ||
            u.c[LOWESTBYTE]  != 0x04 ||
            u.c[LOWERBYTE]   != 0x03 ||
            u.c[HIGHERBYTE]  != 0x02 ||
            u.c[HIGHESTBYTE] != 0x01)
        {
            // The assert will trigger if asserts are enabled
            assert(0); // Endian order set wrong!

            // Asserts aren't enabled,
            // but don't let program run!
            exit(4);
        }
    }

    int main() {
        // We only need to call this once
        assert_endian();

        // ... whatever else...
    }

--
Allan_W@my-deja.com is a "Spam Magnet," never read.
Please reply in newsgroups only, sorry.










Author: "Bill Wade" <bill.wade@stoner.com>
Date: 1999/11/23
Raw View
AllanW wrote in message <81cpkd$t5e$1@nnrp1.deja.com>...

>Presumably you mean to use this in preprocessor directives:
>
>    #if (('abcd' & 0xFF) == 'd')
>      // ...
>    #endif
>
>However, this won't work.

This won't always work.  It will work often enough to be a useful tool, when
used in conjunction with runtime tests.

Without this kind of tool you need to write something like

#if __system1  || __system2 ...
  // little endian
  const int INTORDER = 1234;
#elif __system18 || __system23 ...
  // big endian
  const int INTORDER = 4321;
#else
  #error Tell me endianness
#endif

static bool intorder_test = validate_intorder_or_die(INTORDER);

This requires manual intervention (or a configure program) every time you
port to a new system.

If instead you write something like

#if __system2
  // little endian, but test below fails
  #define INTORDER 1234
#elif __system23
  // big endian, but test below fails
  #define INTORDER 4321
#elif '\4\3\2\1' == 0x04030201
  #define INTORDER 4321
#elif '\4\3\2\1' == 0x01020304
  #define INTORDER 1234
#endif

static bool intorder_test = validate_intorder_or_die(INTORDER);

then you will sometimes be able to avoid modifying this file when porting to
a new system.  The test doesn't have to always work.  It is useful if it
works "often enough."  In other words if its benefits exceed its costs.  In
my experience (mostly systems with 8-bit char, two's complement, sizeof(int)
2 or 4, sizeof(long) 4 or 8, few cross-compilers), tests similar to this
succeed often enough to be "useful."
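
The post does not show `validate_intorder_or_die`. A minimal sketch of what
such a runtime check might look like follows, assuming 8-bit `char` and a
4-byte `unsigned int`. `detect_intorder` is a hypothetical helper name, and
the 1234/4321 codes follow the convention of the snippets above.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: looks at the bytes of 0x04030201 in memory order and
// packs them into a decimal code: 1234 for little-endian, 4321 for
// big-endian, 3412 for PDP-11 style ordering.
// Returns 0 when unsigned int is not 4 bytes wide.
inline int detect_intorder() {
    if (sizeof(unsigned int) != 4) return 0;
    const unsigned int probe = 0x04030201u;
    unsigned char bytes[sizeof(unsigned int)];
    std::memcpy(bytes, &probe, sizeof probe);
    return bytes[0] * 1000 + bytes[1] * 100 + bytes[2] * 10 + bytes[3];
}

// Sketch of the function named in the post: terminates the program when the
// compile-time guess disagrees with what the running machine reports.
inline bool validate_intorder_or_die(int expected) {
    const int actual = detect_intorder();
    if (actual != expected) {
        std::fprintf(stderr, "INTORDER is %d but the host reports %d\n",
                     expected, actual);
        std::abort();
    }
    return true;
}
```

Because the result initialises a namespace-scope `static bool`, the check
would run once during start-up, before `main()` is entered.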





Author: Thomas Matelich <tmatelich@zetec.com>
Date: 1999/11/22
Raw View
Pukalo Boyd 810-492-3661 wrote:

> I have been trying to find a portable method of determining the
> endianness byte order at compile time
> to produce code that will work properly on both windows and unix
> machines. I have searched the documentation
> i have ( the books on the C Standard Library and the C++ Standard
> Library) and havent found anything. Please help. Any pointers or
> suggestions would be very helpful. Thanks.
>

Here is what I came up with recently for this problem:

 //btw: BIG_ENDIAN -> UNIX hardware, LITTLE_ENDIAN -> x86 hardware
 enum ByteOrdering { BIG_ENDIAN, LITTLE_ENDIAN };

 class EndianNess
 {
 public:
  operator ByteOrdering() const { return byte_ordering; } // implicit conversion

  friend const EndianNess& Omega();
 private:
  EndianNess()
  {
   //checks the first byte for a one
   //I have verified this to be correct on Windows and HPUX
   unsigned int n = 1;
   char* c;
   c = (char*)&n;
   byte_ordering = (!*c) ? BIG_ENDIAN : LITTLE_ENDIAN;
  }

  ByteOrdering byte_ordering;
 };

 inline const EndianNess& Omega()
 {
  static EndianNess _omega;
  return _omega;
 }


I had my reasons for returning EndianNess instead of ByteOrdering from
Omega, though I cannot think of them now.  This will not work on vax I
believe because it is something besides big or little endian.


Tom Matelich
sosedada@usa.net





Author: Thomas Matelich <tmatelich@zetec.com>
Date: 1999/11/22
Raw View

Pukalo Boyd 810-492-3661 wrote:

> I have been trying to find a portable method of determining the
> endianness byte order at compile time
> to produce code that will work properly on both windows and unix
> machines. I have searched the documentation
> i have ( the books on the C Standard Library and the C++ Standard
> Library) and havent found anything. Please help. Any pointers or
> suggestions would be very helpful. Thanks.
>
> Boyd Pukalo
>

Oops, my code was for runtime detection.  What I ended up doing was to write
a class called SwapFile which wrapped a FILE*.  For dealing with endianness,
it checks Omega(), and gives you the option to either specify the endianness
of the file you are dealing with or to check a specific value at a specific
location in the file and determine the file's byte ordering.  Then you can
call the read or write function and have it swap if the byte order of the
file differs from the byte order of the system.

PS:  the nicest way I have found to swap data is to cast the address as a
char* and use std::reverse.
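
The `std::reverse` trick can be sketched roughly as below. SwapFile itself
was not posted, so `swap_bytes` and `load_u32` are hypothetical names, and
`load_u32` only stands in for the kind of read path described above. The
fixed-width types from `<cstdint>` are used for clarity and assume a
compiler that provides that header.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>

// Reverses the bytes of an object in place by viewing it as raw bytes.
// Intended for plain integer (or similar trivially copyable) types only.
template <typename T>
void swap_bytes(T& value) {
    unsigned char* first = reinterpret_cast<unsigned char*>(&value);
    std::reverse(first, first + sizeof(T));
}

// Hypothetical stand-in for a SwapFile-like read: copies a 32-bit value out
// of a raw buffer and swaps it when the file's byte order differs from the
// host's.
inline std::uint32_t load_u32(const unsigned char* raw, bool orders_differ) {
    std::uint32_t value;
    std::memcpy(&value, raw, sizeof value);
    if (orders_differ) swap_bytes(value);
    return value;
}
```

Swapping twice restores the original value, which makes the helper easy to
sanity-check on any host.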


--
Thomas O Matelich
Senior Software Designer
Zetec, Inc.
sosedada@usa.net
tmatelich@zetec.com








Author: "C. M. Heard/VVNET, Inc." <heard@vvnet.com>
Date: 1999/11/22
Raw View
Pukalo Boyd wrote:
> I have been trying to find a portable method of determining the
> endianness byte order at compile time
> to produce code that will work properly on both windows and unix
> machines. I have searched the documentation
> i have ( the books on the C Standard Library and the C++ Standard
> Library) and havent found anything. Please help. Any pointers or
> suggestions would be very helpful. Thanks.

While this is possible (as Francis Glassborow pointed out), in my
experience code that is designed to work properly regardless of
host byte-order is a lot less prone to error.

One specific example:  I've worked on IP protocol modules written in
the traditional Berkeley fashion, which is to convert multibyte fields
in packet headers from network to host order (and vice-versa) by
explicitly invoking the macros ntohs/ntohl/htons/htonl.  It's very easy
to forget to invoke one of the macros where it is needed, resulting in
code that tends to be fragile.  Errors are particularly likely to creep
in when one makes modifications on a host where the macros don't do
anything;  one does not notice the errors until the code is recompiled
for a host of the opposite byte order.  Most such problems disappear
when conversions are done by explicit masking and shifting, since
correctness is not dependent on host byte order.
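
A small sketch of the masking-and-shifting style, for a 32-bit field in
network (big-endian) order. The function names `get_be32` and `put_be32`
are made up for this illustration, and `<cstdint>` is assumed to be
available.

```cpp
#include <cassert>
#include <cstdint>

// Reads a 32-bit big-endian field from a byte buffer. The result is the
// same on any host, because only arithmetic on values is involved.
inline std::uint32_t get_be32(const unsigned char* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}

// Writes a 32-bit value into a byte buffer in big-endian order.
inline void put_be32(unsigned char* p, std::uint32_t v) {
    p[0] = static_cast<unsigned char>((v >> 24) & 0xFF);
    p[1] = static_cast<unsigned char>((v >> 16) & 0xFF);
    p[2] = static_cast<unsigned char>((v >> 8)  & 0xFF);
    p[3] = static_cast<unsigned char>(v & 0xFF);
}
```

There is no conversion call to forget here: every access to the packet
buffer goes through a function that states the wire order explicitly.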

I'd strongly advise you to reconsider your decision to write code that
needs to know the host byte order.

Mike
--
C. M. Heard/VVNET, Inc.
heard@vvnet.com





Author: Hyman Rosen <hymie@prolifics.com>
Date: 1999/11/19
Raw View
Pukalo Boyd 810-492-3661 <qz3fwd@hqs.mid.gmeds.com> writes:
> I have been trying to find a portable method of determining the
> endianness byte order at compile time

There is no such way. You will need to supply this information
externally, usually as part of the build process.








Author: Francis Glassborow <francis@robinton.demon.co.uk>
Date: 1999/11/19
Raw View
In article <38329967.470B6CED@hqs.mid.gmeds.com>, Pukalo Boyd
810-492-3661 <qz3fwd@hqs.mid.gmeds.com> writes
>I have been trying to find a portable method of determining the
>endianness byte order at compile time
>to produce code that will work properly on both windows and unix
>machines. I have searched the documentation
>i have ( the books on the C Standard Library and the C++ Standard
>Library) and havent found anything. Please help. Any pointers or
>suggestions would be very helpful. Thanks.

Having checked that the executing system has the same number of
bits/byte that you expect, that it has the same size int (long, float
etc) you could try something such as:

union {
        int sample;
        char bytes[sizeof(int)];
} test;
if (sizeof(int) == 2) {
        test.sample = 0x0102;
        if (test.bytes[0] == 1) ...
and so on.

but there are so many variations that it hardly seems worth the effort.




Francis Glassborow      Journal Editor, Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA          +44(0)1865 246490
All opinions are mine and do not represent those of any organisation








Author: "Bill Wade" <bill.wade@stoner.com>
Date: 1999/11/20
Raw View

Pukalo Boyd 810-492-3661 wrote in message
<38329967.470B6CED@hqs.mid.gmeds.com>...
>I have been trying to find a portable method of determining the
>endianness byte order at compile time
>to produce code that will work properly on both windows and unix
>machines. I have searched the documentation
>i have ( the books on the C Standard Library and the C++ Standard
>Library) and havent found anything. Please help. Any pointers or
>suggestions would be very helpful. Thanks.

There is no standard way.  In many cases tests like
  (('abcd' & 0xff) == 'd')
can be used to successfully distinguish endianness at compile time.

However I would strongly recommend that you back up any such test with a
more comprehensive runtime test.  For instance there are machines which have
the same 'int' representation as Intel, but a double representation
different from Intel.
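
One way such a runtime test could cover `double` as well is sketched below.
It assumes an 8-byte IEEE 754 `double`, which the standard does not
guarantee, and the names `DoubleOrder` and `detect_double_order` are
invented for this example.

```cpp
#include <cassert>
#include <cstring>
#include <limits>

enum DoubleOrder { DOUBLE_LITTLE, DOUBLE_BIG, DOUBLE_OTHER };

// ASSUMPTION: 8-byte IEEE 754 double. There, 1.0 is encoded as
// 0x3FF0000000000000, so only two bytes (0x3F and 0xF0) are non-zero and
// their positions reveal the byte ordering.
inline DoubleOrder detect_double_order() {
    if (!std::numeric_limits<double>::is_iec559 || sizeof(double) != 8)
        return DOUBLE_OTHER;
    const double one = 1.0;
    unsigned char b[sizeof(double)];
    std::memcpy(b, &one, sizeof one);
    if (b[7] == 0x3F && b[6] == 0xF0) return DOUBLE_LITTLE;
    if (b[0] == 0x3F && b[1] == 0xF0) return DOUBLE_BIG;
    return DOUBLE_OTHER;  // for example, a word-swapped layout
}
```

Comparing this result with the integer test shows whether the two
representations agree on a given machine.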

HTH









Author: Pukalo Boyd 810-492-3661 <qz3fwd@hqs.mid.gmeds.com>
Date: 1999/11/19
Raw View
I have been trying to find a portable method of determining the
endianness byte order at compile time
to produce code that will work properly on both windows and unix
machines. I have searched the documentation
I have (the books on the C Standard Library and the C++ Standard
Library) and haven't found anything. Please help. Any pointers or
suggestions would be very helpful. Thanks.

Boyd Pukalo

qz3fwd@hqs.mid.gmeds.com