By Alex Gartrell


2008-09-22 22:46:19 8 Comments

Systems require certain primitives to be aligned to particular addresses in memory (ints at addresses that are multiples of 4, shorts at multiples of 2, etc.). Of course, the members of a struct can be ordered so that the least space is wasted on padding.

My question is why doesn't GCC do this automatically? Is the more obvious heuristic (order variables from biggest size requirement to smallest) lacking in some way? Is some code dependent on the physical ordering of its structs (is that a good idea)?

I'm only asking because GCC is super optimized in a lot of ways but not in this one, and I'm thinking there must be some relatively cool explanation (to which I am oblivious).

7 comments

@A. K. 2015-09-12 04:25:10

You might want to try the latest gcc trunk, or the struct-reorg-branch, which is under active development.

https://gcc.gnu.org/wiki/cauldron2015?action=AttachFile&do=view&target=Olga+Golovanevsky_+Memory+Layout+Optimizations+of+Structures+and+Objects.pdf

@Michel 2009-02-18 19:18:47

Not saying it's a good idea, but you can certainly write code that relies on the order of the members of a struct. For example, as a hack, often people cast a pointer to a struct as the type of a certain field inside that they want access to, then use pointer arithmetic to get there. To me this is a pretty dangerous idea, but I've seen it used, especially in C++ to force a variable that's been declared private to be publicly accessible when it's in a class from a 3rd party library and isn't publicly encapsulated. Reordering the members would totally break that.

@Domingo Ignacio 2010-08-11 03:24:56

I believe the linux kernel does this for linked lists.

@Damien Neil 2008-09-22 23:13:43

gcc does not reorder the elements of a struct, because that would violate the C standard. Section 6.7.2.1 of the C99 standard states:

Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared.
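A minimal sketch of what that guarantee means in practice, using a hypothetical struct `sample` (the struct and the helper name are made up for illustration). The offsets reported by `offsetof` must increase in declaration order, even though padding between members is left to the implementation:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct, used only for illustration. */
struct sample {
    char  tag;
    int   count;
    short flags;
};

/* Returns 1 if the member offsets increase in declaration order,
 * which is what C99 6.7.2.1 requires of every conforming compiler. */
int offsets_follow_declaration_order(void)
{
    return offsetof(struct sample, tag) < offsetof(struct sample, count)
        && offsetof(struct sample, count) < offsetof(struct sample, flags);
}
```

A conforming compiler has to make `offsets_follow_declaration_order()` return 1, which is why it cannot move `count` in front of `tag` to save padding.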

@nes1983 2014-01-01 22:51:01

Yes, but why was it defined this way?

@Evo510 2014-01-02 16:58:04

@nes1983 The programmer may be making assumptions about the order of the data in the struct and may be using masking to get each portion. If the struct is reordered, then the masking may be incorrect.

@nes1983 2014-01-03 19:58:02

@Evo510: I’m confused. To use masking, you have to know padding, too, which is not guaranteed by the language. So, you can’t use masks. Am I missing something?

@Cort Ammon 2015-04-15 06:03:34

@nes1983 I've seen numerical integration code which makes the assumption that all of its inputs are floats in sequential order. You pass it the pointer to the first value to integrate, and the last, and it scans between them. However, you keep the information in a struct because, for everything except integration, it is a more convenient format.

@osgx 2017-06-05 15:47:04

While it violates the Standard, reordering is useful for protecting the Linux kernel from rootkits/exploits: part of the Linux KSPP (kernsec.org/wiki/index.php/Kernel_Self_Protection_Project) is randomization/reordering of some struct fields: openwall.com/lists/kernel-hardening/2017/05/26/8 (Introduce struct layout randomization plugin), related paper: sec.taylor.edu/doc/… ("Improved kernel security through memory layout randomization" - DM Stanley - 2013)

@Mr Redstoner 2019-06-26 09:39:30

@nes1983 There are cases where assumptions are made, for example working with the network: the connect() call specifies a sockaddr as its argument, which in OOP terms is something like a 'base class' and in reality you will be sending in a 'subclass member', that is a different structure (sockaddr_in for example) and the kernel relies on the field identifying what type you passed in being in the same place as in sockaddr.
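The "same field in the same place" convention can be sketched with two hypothetical address structs (these are not the real `sockaddr` definitions; every name below is invented for illustration). Both start with a `family` tag, and because C guarantees that a pointer to a struct also points to its first member, a generic function can read the tag without knowing which struct it was given:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical family tags, for illustration only. */
enum { FAMILY_V4 = 1, FAMILY_LOCAL = 2 };

/* Two hypothetical "subclass" structs sharing the same first member. */
struct addr_v4 {
    unsigned short family;
    unsigned short port;
    unsigned char  ip[4];
};

struct addr_local {
    unsigned short family;
    char           path[16];
};

/* The convention only works while the tag stays at offset 0. */
static_assert(offsetof(struct addr_v4, family) == 0, "tag must come first");
static_assert(offsetof(struct addr_local, family) == 0, "tag must come first");

/* Reads the tag through a pointer to the first member. */
unsigned short addr_family(const void *addr)
{
    return *(const unsigned short *)addr;
}
```

If a compiler were free to move `family` behind `port` or `path`, `addr_family` would read the wrong bytes, which is the kind of breakage this comment describes.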

@alex strange 2008-10-31 19:11:25

gcc SVN does have a structure reorganization optimization (-fipa-struct-reorg), but it requires whole-program analysis and isn't very powerful at the moment.

@FooF 2018-09-27 15:10:46

Stock gcc 10 years later (version 7.2, packaged by Ubuntu 17.10) does not document this option in its manual page. Strangely, the option string is still recognized by the gcc executable.

@tzot 2008-09-22 22:56:53

GCC is smarter than most of us at producing machine code from our source code; however, I would shiver if it tried to be smarter than us at rearranging our structs, since a struct is data that can, for example, be written to a file. A struct that starts with 4 chars followed by a 4-byte integer would be useless if read on another system where GCC had decided to rearrange the struct members.

@forumulator 2018-01-30 19:57:00

Reading/Writing structs directly to a file is not compiler/platform portable anyway because of alignment (which is allowed), see this SO answer.
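One common way around both concerns is to serialize each field explicitly instead of writing the struct's raw bytes. The sketch below assumes a hypothetical 8-byte record format (4 magic characters followed by a 32-bit little-endian length); the struct, the function name, and the format itself are illustrative, not taken from any real file format:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory record; its layout is up to the compiler. */
struct record {
    char     magic[4];
    uint32_t length;
};

/* Writes the record into an 8-byte buffer in a fixed order:
 * 4 magic bytes, then the length as little-endian.
 * The result does not depend on padding or host byte order. */
void encode_record(const struct record *r, unsigned char out[8])
{
    memcpy(out, r->magic, 4);
    out[4] = (unsigned char)(r->length & 0xFF);
    out[5] = (unsigned char)((r->length >> 8) & 0xFF);
    out[6] = (unsigned char)((r->length >> 16) & 0xFF);
    out[7] = (unsigned char)((r->length >> 24) & 0xFF);
}
```

With this approach the on-disk layout is defined by the encoder rather than by the compiler, so it stays stable across compilers and platforms.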

@Alex M 2008-09-22 22:50:58

C compilers don't automatically pack structs precisely because of alignment issues like the ones you mention. Accesses that are not on word boundaries (32-bit on most CPUs) carry a heavy penalty on x86 and can cause fatal traps on some RISC architectures.

@Alex Gartrell 2008-09-22 22:53:10

I wasn't talking about getting rid of the padding. I'm talking about putting all the longs/pointers end-to-end, then all the shorts end-to-end, then all the characters end-to-end, etc., so that you're only losing space at the end.

@Cody Brocious 2008-09-22 22:54:02

Well, that's half true. C compilers do pack members by default; they just align each one to the natural boundaries of the architecture. That's why you need #pragma pack(1) on structs that use chars/shorts in packed protocols, to stop the compiler from adding padding.
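A minimal sketch of the packing pragma, assuming a compiler that supports the `#pragma pack(push, 1)` / `#pragma pack(pop)` form (GCC, Clang, and MSVC do). The two header structs are hypothetical and differ only in packing:

```c
#include <assert.h>
#include <stdint.h>

/* Default layout: the compiler may insert padding after `type`
 * so that `length` lands on its natural alignment. */
struct padded_header {
    uint8_t  type;
    uint32_t length;
};

/* Packed layout: member alignment is forced to 1 byte, so no padding. */
#pragma pack(push, 1)
struct packed_header {
    uint8_t  type;
    uint32_t length;
};
#pragma pack(pop)

/* 1 byte + 4 bytes, nothing in between. */
static_assert(sizeof(struct packed_header) == 5, "packed header has no padding");
```

Note that packing removes padding but keeps the declared member order; it is a separate question from reordering.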

@Cody Brocious 2008-09-22 22:55:08

@Alex, err. You're going to waste the same amount of space, since your character would have to be padded the same amount. You wouldn't benefit at all, space or performance-wise.

@Alex M 2008-09-22 22:57:03

Oh. Yeah, that causes trouble with binary formats, as Cody attested. Plus, ANSI guarantees that structure element offsets must be in increasing order.

@Alex Gartrell 2008-09-22 23:15:56

@Cody, all the space will be wasted at the very end, and the amount is less than or equal to what the regular "allocate as you go" approach wastes. Case in point: struct blah { char a; short b; char c; }; struct blah2 { short b; char a, c; }; blah wastes 2 more bytes than blah2

@Alex Gartrell 2008-09-22 23:16:36

because one character can go between the other character and the short rather than wasting a pad character for each.

@Cody Brocious 2008-09-22 23:23:13

That's not the case, though. If you disable padding you get the same size, but you lose the reason you need padding -- most architectures can't access it at all, and those that can will do so more slowly. So you're losing padding and reordering to get... the same thing.

@Alex Gartrell 2008-09-23 00:53:00

You don't lose any of the benefits of padding by arranging the struct properly. With a short, char, char, you can have 0 padding, but all elements fall on the correct offset. In general, you will not lose any speed at all for this, as they fall on their natural bounds.

@Cody Brocious 2008-09-23 01:50:30

No, this is not correct on the majority of architectures. Even if it's allowed, accessing memory that is not word-aligned (meaning 32-bit typically) is penalized.

@Alex Gartrell 2008-09-23 15:49:57

You can maintain word-aligned memory. Characters can be placed on any byte (obviously). Shorts can start on any byte that is a multiple of 2, and ints on any byte that is a multiple of 4. Given char a, b; short c; the layout |a|b|c|c is valid, but rearranged to a, c, b it must be |a|x|c|c|b|x|
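The `blah` / `blah2` comparison from this thread can be checked directly. The sketch below reuses the two structs from the comment above and adds two hypothetical helper functions that compute padding as total size minus the sum of the member sizes:

```c
#include <assert.h>
#include <stddef.h>

/* Declaration order from the thread: char, short, char. */
struct blah  { char a; short b; char c; };

/* Same members, largest first: short, char, char. */
struct blah2 { short b; char a, c; };

/* Padding = struct size minus the bytes the members actually need. */
size_t blah_padding(void)
{
    return sizeof(struct blah) - (2 * sizeof(char) + sizeof(short));
}

size_t blah2_padding(void)
{
    return sizeof(struct blah2) - (2 * sizeof(char) + sizeof(short));
}
```

On a typical ABI where short is 2 bytes with 2-byte alignment, `struct blah` is 6 bytes (2 bytes of padding) and `struct blah2` is 4 bytes (no padding), while `b` sits on an even offset in both. Every member stays naturally aligned; only the wasted space changes.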

@Cody Brocious 2008-09-22 22:48:40

Structs are frequently used as representations of the packing order of binary file formats and network protocols. This would break if that were done. In addition, different compilers would optimize things differently and linking code together from both would be impossible. This simply isn't feasible.

@Andrew Grant 2008-09-22 23:02:13

This has nothing to do with networking or file structures. Indeed, the header of a BMP structure IS tightly packed, with elements falling on non-natural boundaries that are alien to the compiler.

@Cody Brocious 2008-09-22 23:03:06

Err, yes? You've misinterpreted the question. Reread the second paragraph, where he talks about struct ordering. This is entirely different from padding.

@Johannes Schaub - litb 2009-02-18 19:29:49

Your first point is very valid, but I think your second isn't. Compiled code from different compilers is not compatible anyway.

@rubenvb 2013-05-23 07:57:50

@JohannesSchaub-litb that depends; if both compilers adhere to the same ABI there is no reason for them to produce incompatible code. Examples are GCC and Clang, and 32-bit GCC and MSVC for C on Windows.
