Peeter Joot's Blog.

Math, physics, perl, and programming obscurity.

C structure alignment padding.

Posted by peeterjoot on November 11, 2009

I’ve answered questions like the following many, many times for IBM developers, perhaps because I saw the subtle effects of layout assumptions so often during our 64-bit port of DB2. Let’s put an answer online for all.

“Are you saying that since there was no 8-byte variable in the struct above it didn’t have to be 8-byte aligned, but as soon as you add one, the compiler will force 8-byte alignment of the overall structure?”

What prompted the question this time was a desire to put a pad in a structure for versioning, so that extra members could be added later in a way that maintained layout compatibility for the rest. This can be done, but it is best to also reserve enough padding that the total structure size is a multiple of eight bytes, so that one has the flexibility to make arbitrary changes later.
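A minimal sketch of that kind of reserved pad follows. The structures `record_v1` and `record_v2` and their members are hypothetical names made up for illustration (they are not from DB2), and the sizes assume the usual 4-byte and 8-byte fixed-width integer types.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* uint32_t, uint64_t */

/* Version 1 (hypothetical): an explicit reserved pad brings the
 * total to 4 + 4 + 13 + 11 = 32 bytes, a multiple of eight. */
struct record_v1 {
    uint32_t version;
    uint32_t flags;
    char     name[13];
    char     reserved[11];
};

/* Version 2 (hypothetical): part of the pad becomes a 64-bit member. */
struct record_v2 {
    uint32_t version;
    uint32_t flags;
    char     name[13];
    char     reserved[3];   /* bytes 21..23, keeps new_counter at offset 24 */
    uint64_t new_counter;   /* bytes 24..31 */
};

/* Compile-time checks that the new version did not disturb the old layout. */
static_assert(sizeof(struct record_v1) == 32, "v1 size is a multiple of eight");
static_assert(sizeof(struct record_v2) == sizeof(struct record_v1),
              "v2 must keep the v1 size");
static_assert(offsetof(struct record_v2, name) == offsetof(struct record_v1, name),
              "existing members must not move");
static_assert(offsetof(struct record_v2, new_counter) % 8 == 0,
              "new 64-bit member sits on an eight byte boundary");
```

Because the version 1 size was already a multiple of eight, the 8-byte member added in version 2 does not force any extra tail padding, so the size stays the same. Had version 1 been, say, 28 bytes, adding an 8-byte aligned member could have silently grown the structure.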

What’s the answer? Yes, exactly. C structure alignment is generally based on the largest native type in the structure (an exception is something like using a 64-bit integer on win32, where only 32-bit alignment is required). If you have only chars and arrays of chars, then once you add an int, that int will end up starting on a 4 byte boundary (with possible hidden padding before the int member). Additionally, if the structure size isn’t a multiple of sizeof(int), hidden padding will be added at the end. The same applies to short and 64-bit types. Example:

struct blah1
{
	char x ;
	char y[2] ;
} ;

// sizeof(struct blah1) == 3

struct blah1plusShort
{
	char x ;
	char y[2] ;
			// <<< (** P1 **)
	short z ;
	char w ;
			// <<< (** P2 **)
} ;

// sizeof(struct blah1plusShort) == 8

P1: The compiler inserts one hidden pad byte here, so that z will start on a 2 byte boundary (assuming the beginning of the struct is aligned).

P2: The compiler inserts one hidden byte of tail padding here, so that the total struct size is a multiple of sizeof(short), the largest member type. This ensures that when the struct is used in an array, every element is properly aligned as long as the first one is.
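One way to see P1 and P2 directly is to ask the compiler where it placed each member. The following is a sketch only: the helper names `print_layout` and `array_members_aligned` are made up for illustration, and the numbers printed depend on the platform (on the mainstream platforms discussed here, z lands at offset 4, w at offset 6, and the size is 8).

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* uintptr_t */
#include <stdio.h>

struct blah1 { char x; char y[2]; };
struct blah1plusShort { char x; char y[2]; short z; char w; };

/* Print the offsets and sizes the compiler actually chose. */
static void print_layout(void)
{
    printf("sizeof(struct blah1)          = %zu\n", sizeof(struct blah1));
    printf("offsetof(blah1plusShort, y)   = %zu\n", offsetof(struct blah1plusShort, y));
    printf("offsetof(blah1plusShort, z)   = %zu\n", offsetof(struct blah1plusShort, z));
    printf("offsetof(blah1plusShort, w)   = %zu\n", offsetof(struct blah1plusShort, w));
    printf("sizeof(struct blah1plusShort) = %zu\n", sizeof(struct blah1plusShort));
}

/* Return 1 if member z of every array element is aligned for short.
 * The tail padding (P2) is what makes this hold past the first element. */
static int array_members_aligned(void)
{
    struct blah1plusShort a[4];
    for (size_t i = 0; i < 4; i++) {
        if ((uintptr_t)&a[i].z % _Alignof(short) != 0)
            return 0;
    }
    return 1;
}
```

Calling `print_layout()` from `main` shows the hidden byte before z as a gap between the end of y and the offset of z, and the tail pad as the difference between the size and the end of w.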


7 Responses to “C structure alignment padding.”

  1. The above statements about alignment are very platform specific. The C Standard permits types to have any alignment an implementation chooses (sentence 1421) although there are some requirements on the corresponding signed/unsigned types having the same alignment.

    Referring to your follow on post, developers usually group members having the same type together. I suspect that those two char members were added later and put after the pointer type to maintain backwards binary compatibility.

  2. peeterjoot said

    True enough. I’ve described structure alignment padding as roughly following a “biggest native size type” rule of thumb which ensures any member is aligned naturally. The obvious exceptions are doubles and 64-bit quantities on 32-bit platforms. For example, AIX has very strange alignment rules for doubles (it may have been only for 32-bit mode?), and how a double is aligned depends also on the position in the structure.

    Are you aware of common systems that align structures in any way significantly different than above? Examples of behavior I’ve never seen would be, say, always requiring 16 byte alignment for a char, or always “packing” shorts so that no 2 byte alignment was required.

    What I was really intending above was a description that is true enough for all practical purposes. For me I suppose that is a qualified statement, because my view of practical purposes really means “all the many platforms we build DB2 on”. That’s a lot of platforms, but there are many others.

    • It would be an unusual implementation that varied the alignment requirement according to the position in the structure (supporting offsetof on such an implementation would be very interesting). Can you point me at a compiler manual that describes this behavior?

      On some DSP chips pointers can only access objects on 16/32/40 bit boundaries. Some implementations on such systems simply define char to occupy 16/32/40 bits. I suspect you don’t build DB2 on DSP based platforms.

      The Intel x86 does not require 2-byte alignment for 16-bit quantities, although some implementations enforce this requirement for performance reasons.

      It is fairly common for the alignment of the first member of a structure to have stricter requirements when the object is at file scope, compared to block scope.

      • peeterjoot said

        Yes, we don’t build DB2 on any DSP;)

        DB2 build platforms include or have included AIX (32 & 64-bit powerpc), linux{390(32/64), ia32, ia64, amd64}, sun (sparc(32/64), and amd64), hp (parisc, and ipf), windows (ia32, ia64, amd64), sgi (many arch models), various flavours of sco on ia32, sni, os/2, and probably a few others that I now forget. Thankfully many of these are now dead ports!

        From the AIX compiler manual:

        http://www-01.ibm.com/software/awdtools/caix/downloads/caix50.pdf

        “On the RISC System/6000 system, if a double is the first member of a struct, it is 8-byte (doubleword)
        aligned.”

        (after stating “double: doubleword aligned if -qalign=natural. Otherwise, word aligned.”).

        We don’t see the effects of this much anymore since 64-bit AIX defaults to -qalign=natural, and we stopped shipping a 32-bit server on AIX (32-bit client only).

        offsetof() often just relies on &(((T*)0)->m), and thus falls back to the compiler to fill in the positions. Since the compiler knows its own layout rules, I don’t figure that this if-it-is-first quirk for double ends up treated much different than any other offsetof calculation.

  3. Thanks for the C compiler manual link. Either a compiler or documentation bug on page 98: “A bit field cannot have the volatile or const qualifier.”

    The rule you quote refers to the alignment of the start of the structure and does not imply the possibility of different alignments for the same type within a structure.

    No AS/400 in your list with its interesting 16-byte pointers ;-(
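The platform differences raised in the comments above (alignment of double and 64-bit integers, and what offsetof reports) can be probed on any given compiler with a small sketch like the one below. The struct names and the `print_alignments` helper are made up for illustration. No output values are shown because they differ between platforms, which is the point of the discussion.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdio.h>

struct char_then_double { char c; double d; };
struct double_first     { double d; char c; };
struct char_then_ll     { char c; long long n; };

/* Print the alignment and layout this compiler uses for 8-byte types. */
static void print_alignments(void)
{
    printf("alignof(double)                = %zu\n", (size_t)_Alignof(double));
    printf("alignof(long long)             = %zu\n", (size_t)_Alignof(long long));
    printf("offsetof(char_then_double, d)  = %zu\n", offsetof(struct char_then_double, d));
    printf("sizeof(struct char_then_double)= %zu\n", sizeof(struct char_then_double));
    printf("sizeof(struct double_first)    = %zu\n", sizeof(struct double_first));
    printf("offsetof(char_then_ll, n)      = %zu\n", offsetof(struct char_then_ll, n));
}

/* These hold on any conforming implementation, whatever the numbers are:
 * the first member is at offset zero, and a struct's size is a multiple
 * of its alignment so that arrays of it stay aligned. */
static_assert(offsetof(struct double_first, d) == 0,
              "first member starts at offset zero");
static_assert(sizeof(struct char_then_double) % _Alignof(struct char_then_double) == 0,
              "size is a multiple of the struct alignment");
```

Comparing the output of `print_alignments()` between, for example, a 32-bit and a 64-bit build of the same source is a quick way to spot the kind of layout assumptions discussed in the post.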
