By Kevin


2008-09-23 04:24:47 8 Comments

Why does the sizeof operator return a size larger for a structure than the total sizes of the structure's members?

11 comments

@Kevin 2008-09-23 04:25:34

This is because of padding added to satisfy alignment constraints. Data structure alignment impacts both performance and correctness of programs:

  • Mis-aligned access might be a hard error (often SIGBUS).
  • Mis-aligned access might be a soft error.
    • Either corrected in hardware, for a modest performance-degradation.
    • Or corrected by emulation in software, for a severe performance-degradation.
    • In addition, atomicity and other concurrency-guarantees might be broken, leading to subtle errors.

Here's an example using typical settings for an x86 processor (in both 32- and 64-bit modes):

struct X
{
    short s; /* 2 bytes */
             /* 2 padding bytes */
    int   i; /* 4 bytes */
    char  c; /* 1 byte */
             /* 3 padding bytes */
};

struct Y
{
    int   i; /* 4 bytes */
    char  c; /* 1 byte */
             /* 1 padding byte */
    short s; /* 2 bytes */
};

struct Z
{
    int   i; /* 4 bytes */
    short s; /* 2 bytes */
    char  c; /* 1 byte */
             /* 1 padding byte */
};

const int sizeX = sizeof(struct X); /* = 12 */
const int sizeY = sizeof(struct Y); /* = 8 */
const int sizeZ = sizeof(struct Z); /* = 8 */

One can minimize the size of structures by sorting members by alignment, as in structure Z in the example above (for basic types, sorting by size suffices).

IMPORTANT NOTE: Both the C and C++ standards state that structure alignment is implementation-defined. Therefore each compiler may choose to align data differently, resulting in different and incompatible data layouts. For this reason, when dealing with libraries that will be used by different compilers, it is important to understand how the compilers align data. Some compilers have command-line settings and/or special #pragma statements to change the structure alignment settings.

@Cody Brocious 2008-09-23 04:27:48

I want to make a note here: Most processors penalize you for unaligned memory access (as you mentioned), but you can't forget that many completely disallow it. Most MIPS chips, in particular, will throw an exception on an unaligned access.

@Dark Shikari 2008-09-23 07:08:24

The x86 chips are actually rather unique in that they allow unaligned access, albeit penalized; AFAIK most chips will throw exceptions, not just a few. PowerPC is another common example.

@Mike Dimmick 2008-09-23 11:16:30

Enabling pragmas for unaligned accesses generally causes your code to balloon in size on processors which throw misalignment faults, as code to fix up every misalignment has to be generated. ARM also throws misalignment faults.

@Aaron 2008-09-25 21:53:16

@Dark - totally agree. But most desktop processors are x86/x64, so most chips don't issue data alignment faults ;)

@Lara Dougan 2008-10-19 01:51:46

Unaligned data access is typically a feature found in CISC architectures, and most RISC architectures do not include it (ARM, MIPS, PowerPC, Cell). In actuality, most chips are NOT desktop processors: embedded chips dominate by number, and the vast majority of these are RISC architectures.

@Donal Fellows 2010-12-22 07:29:52

The unaligned access trap is (or certainly used to be) used in functional language implementations for doing tagging of values so that their garbage collectors could know what is some arbitrary memory they're looking at. All in all, a very clever hack (too clever for me to use in my code, according to Kernighan's dictum).

@Kerrek SB 2011-06-15 10:20:33

To be pedantic, doesn't the standard guarantee alignment if all struct members are of char type?

@Bodo Thiesen 2014-11-26 10:25:25

@Kerrek SB: The standard guarantees alignment for any struct regardless of the used types. However for a char which is 1 byte in size, there is no way for it to be unaligned. So, the standard guarantees alignment if all struct members are char WITHOUT ANY PADDING.

@v.oddou 2016-06-14 08:41:39

@LaraDougan: yes, and there is an easy rule we can reason by to see why that is: cost per chip. Desktop x86 chips are consumer products costing hundreds of dollars. That is not tolerable for most industrial uses; industry usually deals with chips costing $1 or less. It's easy to see how that affects how widespread each kind is.

@Wayne O 2017-01-26 18:43:12

Why is it that for the first char there are 3 bytes of padding and for the next 2 there is only 1 byte?

@8bittree 2017-02-17 17:42:30

@WayneO The amount of padding is always enough to make sure that whatever is next is aligned according to its size. So, in X, there's 2 bytes of padding after the short to ensure the 4 byte int starts on a 4 byte boundary. In Y, there's 1 byte padding after the char to make sure the 2 byte short starts on a 2 byte boundary. Since the compiler cannot know what might be after a struct in memory (and it could be many different things), it prepares for the worst and inserts enough padding to make the struct a multiple of 4 bytes. X needs 3 bytes to get to 12, Y only needs 1 for 8.

@Ben Voigt 2017-05-25 19:20:56

"x86 chips have hardware support for unaligned access" True. "x86 chips don't issue data alignment faults" False. It depends on the instruction, SSE instructions in particular tend to fault on misalignment (except for the special unaligned variations).

@Akshay Immanuel D 2018-02-07 04:19:18

Does padding exist between two structs as well, so that the first member of the next struct begins at an aligned address?

@Mooing Duck 2019-01-24 01:10:51

@AkshayImmanuelD: Not "between" structs no, it's part of the end of the struct. struct {long long a; char b;} usually has 7 bytes of padding at the end after b, making it 16 bytes total. (on most 64bit architectures yada yada)

@lkanab 2011-05-31 09:27:12

See also:

for Microsoft Visual C:

http://msdn.microsoft.com/en-us/library/2e70t5y1%28v=vs.80%29.aspx

and GCC, which claims compatibility with Microsoft's compiler:

http://gcc.gnu.org/onlinedocs/gcc/Structure_002dPacking-Pragmas.html

In addition to the previous answers, please note that regardless of the packing, there is no complete member-order guarantee in C++ (the C++11 draft only orders non-static data members that share the same access control). Compilers may (and certainly do) add a virtual table pointer and base structures' members to the structure. Even the existence of a virtual table is not ensured by the standard (the implementation of the virtual mechanism is not specified), and therefore one can conclude that such a guarantee is just impossible.

I'm quite sure member-order is guaranteed in C, but I wouldn't count on it, when writing a cross-platform or cross-compiler program.

@Ciro Santilli 新疆改造中心996ICU六四事件 2016-05-04 15:39:59

"I'm quite sure member-order is guaranteed in C". Yes, C99 says: "Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared." More standard goodness at: stackoverflow.com/a/37032302/895245

@Ciro Santilli 新疆改造中心996ICU六四事件 2016-05-04 15:38:27

C99 N1256 standard draft

http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf

6.5.3.4 The sizeof operator:

3 When applied to an operand that has structure or union type, the result is the total number of bytes in such an object, including internal and trailing padding.

6.7.2.1 Structure and union specifiers:

13 ... There may be unnamed padding within a structure object, but not at its beginning.

and:

15 There may be unnamed padding at the end of a structure or union.

The new C99 flexible array member feature (struct S {int n; int is[];};) may also affect padding:

16 As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member. In most situations, the flexible array member is ignored. In particular, the size of the structure is as if the flexible array member were omitted except that it may have more trailing padding than the omission would imply.

Annex J Portability Issues reiterates:

The following are unspecified: ...

  • The value of padding bytes when storing values in structures or unions (6.2.6.1)

C++11 N3337 standard draft

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf

5.3.3 Sizeof:

2 When applied to a class, the result is the number of bytes in an object of that class including any padding required for placing objects of that type in an array.

9.2 Class members:

A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [ Note: There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning, as necessary to achieve appropriate alignment. — end note ]

I only know enough C++ to understand the note :-)

@DigitalRoss 2016-02-24 06:46:13

The idea is that for speed and cache considerations, operands should be read from addresses aligned to their natural size. To make this happen, the compiler pads structure members so the following member or following struct will be aligned.

struct pixel {
    unsigned char red;   // 0
    unsigned char green; // 1
    unsigned int alpha;  // 4 (gotta skip to an aligned offset)
    unsigned char blue;  // 8 (then skip 9 10 11)
};

// next offset: 12

The x86 architecture has always been able to fetch misaligned addresses. However, it's slower and when the misalignment overlaps two different cache lines, then it evicts two cache lines when an aligned access would only evict one.

Some architectures actually have to trap on misaligned reads and writes, and early versions of the ARM architecture (the one that evolved into all of today's mobile CPUs) ... well, they actually just returned bad data for those. (They ignored the low-order bits.)

Finally, note that cache lines can be arbitrarily large, and the compiler doesn't attempt to guess at those or make a space-vs-speed tradeoff. Instead, the alignment decisions are part of the ABI and represent the minimum alignment that will eventually evenly fill up a cache line.

TL;DR: alignment is important.

@EmmEff 2008-09-23 04:27:32

Packing and byte alignment, as described in the C FAQ here:

It's for alignment. Many processors can't access 2- and 4-byte quantities (e.g. ints and long ints) if they're crammed in every-which-way.

Suppose you have this structure:

struct {
    char a[3];
    short int b;
    long int c;
    char d[3];
};

Now, you might think that it ought to be possible to pack this structure into memory like this:

+-------+-------+-------+-------+
|           a           |   b   |
+-------+-------+-------+-------+
|   b   |           c           |
+-------+-------+-------+-------+
|   c   |           d           |
+-------+-------+-------+-------+

But it's much, much easier on the processor if the compiler arranges it like this:

+-------+-------+-------+
|           a           |
+-------+-------+-------+
|       b       |
+-------+-------+-------+-------+
|               c               |
+-------+-------+-------+-------+
|           d           |
+-------+-------+-------+

In the packed version, notice how it's at least a little bit hard for you and me to see how the b and c fields wrap around? In a nutshell, it's hard for the processor, too. Therefore, most compilers will pad the structure (as if with extra, invisible fields) like this:

+-------+-------+-------+-------+
|           a           | pad1  |
+-------+-------+-------+-------+
|       b       |     pad2      |
+-------+-------+-------+-------+
|               c               |
+-------+-------+-------+-------+
|           d           | pad3  |
+-------+-------+-------+-------+

@Lakshmi Sreekanth Chitla 2016-12-26 06:07:32

Now, what is the use of memory slots pad1, pad2 and pad3?

@phuclv 2017-03-02 02:57:10

@YoYoYonnY that's not possible. The compiler is not allowed to reorder struct members although gcc has an experimental option to do that

@bruziuz 2015-07-28 21:25:42

The C language leaves the compiler some freedom over where structure members are placed in memory:

  • Memory holes may appear between any two components, and after the last component. This is because certain types of objects on the target computer may be restricted to particular address boundaries.
  • The size of these "memory holes" is included in the result of the sizeof operator. The only thing sizeof does not include is the size of a flexible array member, which is available in C99 (and as an extension in some C++ compilers).
  • Some implementations of the language allow you to control the memory layout of structures through pragmas and compiler options.

The C language gives the programmer some guarantees about the layout of members in a structure:

  • Compilers are required to assign increasing memory addresses to successive components.
  • The address of the first component coincides with the start address of the structure.
  • Unnamed bit-fields may be included in the structure to achieve the required address alignment of adjacent elements.

Problems related to member alignment:

  • Different computers align object boundaries in different ways.
  • Different computers place different restrictions on the width of a bit-field.
  • Computers differ in how they store the bytes of a word (byte order: compare the Intel 80x86 and the Motorola 68000).

How alignment works:

  • The size of a structure is calculated as the size of one aligned element of an array of such structures. The structure must end so that the first member of the next structure in the array does not violate the alignment requirements.

P.S. More detailed information is available in Samuel P. Harbison and Guy L. Steele, "C: A Reference Manual" (5.6.2 - 5.6.7).

@sid1138 2015-06-10 15:07:08

The size of a structure is greater than the sum of its parts because of what is called packing. A particular processor has a preferred data size that it works with. Most modern processors' preferred size is 32 bits (4 bytes). Accessing memory when the data is on this kind of boundary is more efficient than accessing data that straddles that size boundary.

For example. Consider the simple structure:

struct myStruct
{
   int a;
   char b;
   int c;
} data;

If the machine is a 32-bit machine and data is aligned on a 32-bit boundary, we see an immediate problem (assuming no structure padding). In this example, let us assume that the structure data starts at address 1024 (0x400 - note that the lowest 2 bits are zero, so the data is aligned to a 32-bit boundary). The access to data.a will work fine because it starts on a boundary - 0x400. The access to data.b will also work fine, because it is at address 0x404 - another 32-bit boundary. But an unpadded structure would put data.c at address 0x405. The 4 bytes of data.c are at 0x405, 0x406, 0x407, 0x408. On a 32-bit machine, the system would read data.c during one memory cycle, but would only get 3 of the 4 bytes (the 4th byte is on the next boundary). So, the system would have to do a second memory access to get the 4th byte.

Now, if instead of putting data.c at address 0x405, the compiler padded the structure by 3 bytes and put data.c at address 0x408, then the system would only need 1 cycle to read the data, cutting access time to that data element by 50%. Padding swaps memory efficiency for processing efficiency. Given that computers can have huge amounts of memory (many gigabytes), the compilers feel that the swap (speed over size) is a reasonable one.

Unfortunately, this problem becomes a killer when you attempt to send structures over a network or even write the binary data to a binary file. The padding inserted between elements of a structure or class can disrupt the data sent to the file or network. In order to write portable code (one that will go to several different compilers), you will probably have to access each element of the structure separately to ensure the proper "packing".

On the other hand, different compilers have different abilities to manage data structure packing. For example, in Visual C/C++ the compiler supports the #pragma pack command. This will allow you to adjust data packing and alignment.

For example:

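#pragma pack(push, 1)   /* pack members on 1-byte boundaries */
struct MyStruct
{
    int a;
    char b;
    int c;
    short d;
} myData;
#pragma pack(pop)       /* restore the previous packing */

size_t I = sizeof(myData);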

I should now have the value 11. Without the pragma, I would typically be 16 with 4-byte ints (3 bytes of padding after b and 2 after d), and could be larger on some systems, depending on the default packing of the compiler.

@Keith Thompson 2015-06-10 15:39:01

This discusses the consequences of structure padding, but it does not answer the question.

@Keith Thompson 2015-06-12 16:02:02

"... because of what is called packing. ..." -- I think you mean "padding". "Most modern processors' preferred size is 32 bits (4 bytes)" -- That's a bit of an oversimplification. Typically sizes of 8, 16, 32, and 64 bits are supported; often each size has its own alignment. And I'm not sure your answer adds any new information that's not already in the accepted answer.

@sid1138 2015-06-12 21:12:16

When I said packing, I meant how the compiler packs data into a structure (and it can do so by padding the small items, but it does not need to pad, but it always packs). As for size - I was talking about the system architecture, not what the system will support for data access (which is way different from the underlying bus architecture). As for your final comment, I gave a simplified and expanded explanation of one aspect of the tradeoff (speed versus size) - a major programming problem. I also describe a way to fix the problem - that was not in the accepted answer.

@Keith Thompson 2015-06-12 21:16:18

"Packing" in this context usually refers to allocating members more tightly than the default, as with #pragma pack. If members are allocated on their default alignment, I'd generally say the structure is not packed.

@sid1138 2015-06-13 21:04:01

Packing is kind of an overloaded term. It means how you put structure elements into memory. Similar to the meaning of putting objects into a box (packing for moving). It also means putting elements into memory with no padding (sort of a short hand for "tightly packed"). Then there is the command version of the word in the #pragma pack command.

@Kyle Burton 2008-09-23 04:31:41

This can be due to byte alignment and padding so that the structure comes out to an even number of bytes (or words) on your platform. For example in C on Linux, the following 3 structures:

#include <stdio.h>


struct oneInt {
  int x;
};

struct twoInts {
  int x;
  int y;
};

struct someBits {
  int x:2;
  int y:6;
};


int main (int argc, char** argv) {
  printf("oneInt=%zu\n",sizeof(struct oneInt));
  printf("twoInts=%zu\n",sizeof(struct twoInts));
  printf("someBits=%zu\n",sizeof(struct someBits));
  return 0;
}

These have members whose sizes (in bytes) are 4 bytes (32 bits), 8 bytes (2x 32 bits) and 1 byte (2+6 bits) respectively. The above program (on Linux using gcc) prints the sizes as 4, 8, and 4 - where the last structure is padded so that it is a single word (4 x 8-bit bytes on my 32-bit platform).

oneInt=4
twoInts=8
someBits=4

@dolmen 2013-08-20 07:46:07

"C on Linux using gcc" is not enough to describe your platform. Alignment mostly depends on the CPU architecture.

@youpilat13 2018-07-04 15:04:37

@Kyle Burton. Excuse me, I don't understand why the size of structure "someBits" is equal to 4; I expected 8 bytes since there are 2 integers declared (2*sizeof(int) = 8 bytes). Thanks.

@Kyle Burton 2018-07-13 00:44:29

Hi @youpilat13, the :2 and :6 are actually specifying 2 and 6 bits, not full 32 bit integers in this case. someBits.x, being only 2 bits, can only store 4 possible values: 00, 01, 10, and 11 (0, 1, 2 and 3 when read as unsigned). Does this make sense? Here's an article about the feature: geeksforgeeks.org/bit-fields-c

@INS 2008-09-23 07:06:14

If you want the structure to have a certain size with GCC for example use __attribute__((packed)).

On Windows you can set the alignment to one byte when using the cl.exe compiler with the /Zp option.

Usually it is easier for the CPU to access data at an address that is a multiple of 4 (or 8), depending on the platform and also on the compiler.

So it is a matter of alignment basically.

You need to have good reasons to change it.

@Mr.Ree 2008-12-08 04:58:34

"good reasons" Example: Keeping binary compatibility (padding) consistent between 32-bit and 64-bit systems for a complex struct in proof-of-concept demo code that's being showcased tomorrow. Sometimes necessity has to take precedence over propriety.

@Blaisorblade 2009-01-12 02:51:57

Everything is ok except when you mention the Operating System. This is an issue for the CPU speed, the OS is not involved at all.

@ceo 2009-10-20 15:18:53

Another good reason is if you're stuffing a datastream into a struct, e.g. when parsing network protocols.

@dolmen 2013-08-20 07:50:18

@Blaisorblade While the CPU architecture is the most important point, the OS may also matter. Think about an x86 CPU running in real mode (MS-DOS) vs protected mode (Windows, Linux...).

@Blaisorblade 2013-08-24 17:44:26

@dolmen I just pointed out that "it is easier for the Operating System to access data" is incorrect, since the OS doesn't access data.

@Blaisorblade 2013-08-24 17:46:38

@dolmen In fact, one should talk about the ABI (application binary interface). Default alignment (used if you don't change it in the source) depends on the ABI, and many OSs support multiple ABIs (say, 32- and 64-bit, or for binaries from different OSs, or for different ways of compiling the same binaries for the same OS). OTOH, what alignment is performance-wise convenient depends on the CPU - memory is accessed the same way whether you use 32 or 64 bit mode (I can't comment on real mode, but seems hardly relevant for performance nowadays). IIRC Pentium started preferring 8-byte alignment.

@Keith Thompson 2015-06-10 15:41:14

__attribute__((packed)) is potentially unsafe in some cases: stackoverflow.com/q/8568432/827263

@Ben Voigt 2019-03-31 06:02:44

On Microsoft compilers you would use #pragma pack, doing this with a command-line option is evil. (GCC and clang on Windows use __attribute__ just like on any other OS)

@JohnMcG 2008-09-23 13:38:11

In addition to the other answers, a struct can (but usually doesn't) have virtual functions, in which case the size of the struct will also include the space for the vtbl.

@Don Wakefield 2008-10-18 03:16:48

Not quite. In typical implementations, what is added to the struct is a vtable pointer.

@Orion Adrian 2008-09-23 04:27:07

It can do so if you have implicitly or explicitly set the alignment of the struct. A struct that is aligned 4 will always be a multiple of 4 bytes even if the size of its members would be something that's not a multiple of 4 bytes.

Also, a library may be compiled for x86 with 32-bit ints while you compare its components in a 64-bit process, which would give you a different result if you were doing this by hand.
