By Baruch


2011-03-22 20:29:29 8 Comments

If I have a struct in C++, is there no way to safely read/write it to a file that is cross-platform/compiler compatible?

Because if I understand correctly, every compiler 'pads' differently based on the target platform.

5 comments

@Nawaz 2011-03-22 22:12:34

No. That is not possible. It's because of lack of standardization of C++ at the binary level.

Don Box writes (quoting from his book Essential COM, chapter COM As A Better C++)

C++ and Portability


Once the decision is made to distribute a C++ class as a DLL, one is faced with one of the fundamental weaknesses of C++, that is, lack of standardization at the binary level. Although the ISO/ANSI C++ Draft Working Paper attempts to codify which programs will compile and what the semantic effects of running them will be, it makes no attempt to standardize the binary runtime model of C++. The first time this problem will become evident is when a client tries to link against the FastString DLL's import library from a C++ development environment other than the one used to build the FastString DLL.

Struct padding is done differently by different compilers. Even if you use the same compiler, the packing alignment for structs can be different based on what pragma pack you're using.

Not only that: if you write two structs whose members are exactly the same and differ only in the order in which they are declared, the size of each struct can be (and often is) different.

For example, see this,

#include <iostream>
using std::cout;
using std::endl;

struct A
{
   char c;
   char d;
   int i;
};

struct B
{
   char c;
   int i;
   char d;
};

int main() {
        cout << sizeof(A) << endl;
        cout << sizeof(B) << endl;
}

Compile it with gcc-4.3.4, and you get this output:

8
12

That is, sizes are different even though both structs have the same members!

Code at Ideone: http://ideone.com/HGGVl

The bottom line is that the standard doesn't talk about how padding should be done, and so the compilers are free to make any decision and you cannot assume all compilers make the same decision.

@Pijusn 2015-06-08 07:11:55

There is __attribute__((packed)) which I use for shared-memory structures as well as ones used to map network data. It does affect performance (see digitalvampire.org/blog/index.php/2006/07/31/… ) but it's a useful feature for network-related structs. (It's not a standard as far as I know, so the answer is still true).
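To illustrate the comment above, here is a minimal sketch of what the attribute does. It assumes GCC or Clang, since `__attribute__((packed))` is a compiler extension and not standard C++; the struct names `Plain` and `Packed` are made up for this example.

```cpp
#include <cstddef>   // offsetof
#include <cstdint>   // fixed-width integer types

// Ordinary struct: the compiler is free to pad between tag and value.
struct Plain
{
   std::uint8_t  tag;
   std::uint32_t value;
};

// Same members, but the (non-standard) attribute asks for no padding at all.
struct __attribute__((packed)) Packed
{
   std::uint8_t  tag;
   std::uint32_t value;
};

// With the attribute, value sits directly after tag, possibly misaligned.
static_assert(offsetof(Packed, value) == 1, "no padding after tag");
static_assert(sizeof(Packed) == 5, "1 + 4 bytes, nothing else");
static_assert(sizeof(Plain) >= sizeof(Packed), "padding can only add bytes");
```

Packing removes the padding, but it does nothing about byte order or about the size of types such as int, so it only solves part of the portability problem.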

@Dchris 2017-03-03 08:01:28

I don't understand why struct A size is 8 and not more.

{
   char c; // what about this?
   char d; // size 1 + padding of 3
   int i;  // size 4
};

@hoodaticus 2017-05-25 20:58:22

@Dchris - the compiler is probably being careful to ensure that each field is aligned based on its own natural alignment. c and d are one byte and thus aligned no matter where you put them for the single-byte CPU instructions. The int however needs to be aligned on a 4-byte boundary, which to get there requires two bytes of padding after d. This gets you to 8.
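The explanation above can be checked on your own compiler with offsetof. This is only a sketch: the two helper functions are hypothetical names introduced here, and the actual numbers are whatever your platform chooses.

```cpp
#include <cstddef>   // offsetof, std::size_t

struct A { char c; char d; int i; };
struct B { char c; int i; char d; };

// Padding bytes the compiler inserted between d and i in A.
constexpr std::size_t padding_before_i_in_A() {
    return offsetof(A, i) - (offsetof(A, d) + sizeof(char));
}

// Padding bytes the compiler appended after the last member (d) of B.
constexpr std::size_t tail_padding_in_B() {
    return sizeof(B) - (offsetof(B, d) + sizeof(char));
}

// These hold everywhere: members start on a multiple of their alignment,
// and the struct size is a multiple of the struct alignment.
static_assert(offsetof(A, i) % alignof(int) == 0, "i sits on an int boundary");
static_assert(sizeof(B) % alignof(B) == 0, "size is a multiple of alignment");
```

On a typical platform with a 4-byte, 4-byte-aligned int, padding_before_i_in_A() would be 2 and tail_padding_in_B() would be 3, which accounts for the sizes 8 and 12. Other platforms may give other values.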

@Lindydancer 2011-03-22 20:47:38

If you have the opportunity to design the struct yourself, it should be possible. The basic idea is to design it so that there is no need to insert pad bytes into it. The second trick is that you must handle differences in endianness.

I'll describe how to construct the struct using scalars, but you should be able to use nested structs as well, as long as you apply the same design to each included struct.

First, a basic fact in C and C++ is that the alignment of a type cannot exceed the size of the type. If it did, it would not be possible to allocate an array using malloc(N*sizeof(the_type)).

Lay out the struct, starting with the largest types.

 struct
 {
   uint64_t alpha;
   uint32_t beta;
   uint32_t gamma;
   uint8_t  delta;

Next, pad out the struct manually, so that in the end you will match up the largest type:

   uint8_t  pad8[3];    // Match uint32_t
   uint32_t pad32;      // Even number of uint32_t
 };

The next step is to decide whether the struct should be stored in little- or big-endian format. The best way is to "swap" all the elements in situ before writing or after reading the struct, if the storage format does not match the endianness of the host system.
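Putting the pieces together, the approach might look like the sketch below. The struct name `Record` and the functions `swap32`, `swap64` and `swap_in_place` are hypothetical names chosen for this example, and the static_asserts encode the assumption that the compiler adds no padding of its own. If one of them fails on some platform, the layout is not what you expect there.

```cpp
#include <cstddef>   // offsetof
#include <cstdint>   // fixed-width integer types

struct Record
{
   std::uint64_t alpha;
   std::uint32_t beta;
   std::uint32_t gamma;
   std::uint8_t  delta;
   std::uint8_t  pad8[3];    // Match uint32_t
   std::uint32_t pad32;      // Even number of uint32_t
};

// Fail the build if the compiler inserted padding anywhere.
static_assert(offsetof(Record, beta)  == 8,  "unexpected padding before beta");
static_assert(offsetof(Record, gamma) == 12, "unexpected padding before gamma");
static_assert(offsetof(Record, delta) == 16, "unexpected padding before delta");
static_assert(offsetof(Record, pad32) == 20, "unexpected padding before pad32");
static_assert(sizeof(Record) == 24, "unexpected tail padding");

// Reverse the byte order of a 32-bit value.
constexpr std::uint32_t swap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Reverse the byte order of a 64-bit value by swapping its two halves.
constexpr std::uint64_t swap64(std::uint64_t v) {
    return (static_cast<std::uint64_t>(swap32(static_cast<std::uint32_t>(v))) << 32)
         | swap32(static_cast<std::uint32_t>(v >> 32));
}

// Swap every multi-byte field in place. Call this before writing or after
// reading, but only when host and storage byte order differ (the caller
// is assumed to know that).
inline void swap_in_place(Record& r) {
    r.alpha = swap64(r.alpha);
    r.beta  = swap32(r.beta);
    r.gamma = swap32(r.gamma);
    // delta and the pad bytes are single bytes and need no swapping.
}
```

Detecting the host byte order is deliberately left out here; how you do that depends on your toolchain.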

@Phil 2015-02-21 22:52:15

This sounds interesting. But can you go into more detail: why do you order the members by descending type length, and why did you pad the struct so that you have an even number of uint32_t?

@Lindydancer 2015-02-22 08:34:45

@Phil, A basic type, like uint32_t, can (potentially) have an alignment requirement that matches its size, in this case four bytes. A compiler may insert padding to achieve this. By doing this manually, there will be no need for the compiler to do this, as the alignment always will be correct. The drawback is that on systems with less strict alignment requirements, a manually padded struct will be larger than one padded by the compiler. You can do this in ascending or descending order, but you will need to insert more pads in the middle of the struct if you do it in ascending order...

@Lindydancer 2015-02-22 08:35:13

... Padding in the end of the struct is only needed if you plan to use it in arrays.

@jwg 2015-08-13 06:52:31

I'm not an expert - this seems like a 'heuristic' which might work, but definitely does not guarantee that the same padding would be used. Is that the case? Can you explain why your answer is the complete opposite of the other highly-voted answers here?

@Lindydancer 2015-08-13 09:48:52

@jwg. In the general case (like, when you use a struct someone else has designed), padding can be inserted to ensure that no field ends up at a location the hardware can't read (as explained in the other answers). However, when you design the struct yourself, you can, with some care, ensure that no padding is needed. These two facts do not, in any way, oppose each other! I believe that this heuristic will hold for all possible architectures (given that a type doesn't have an alignment requirement which is greater than its size, which isn't legal in C anyway).

@hoodaticus 2017-05-25 21:02:36

@Lindydancer - padding is needed if you intend to composite them into a contiguous memory block of random stuff, not necessarily just a homogeneous array. Padding can make you self-aligning on arbitrary boundaries such as sizeof(void*) or the size of a SIMD register.

@Lindydancer 2018-02-19 15:45:32

@TimSeguine, The statement "the alignment of a type can not exceed the size of the type" is true. Otherwise, malloc(2*sizeof(a_type)) (or new[]) would not return an array where both elements could be accessed. On a given system, std::max_align_t is a typedef of the highest aligned scalar (like long double). If it were a typedef to a scalar with lower alignment, then the C++ implementation would be broken. However, please prove me wrong: all you need to do is to come up with a single example where `sizeof(type) < alignof(type)` would hold.

@Tim Seguine 2018-02-21 16:32:45

@Lindydancer I was under the impression that the alignas specifier doesn't affect the sizeof. I was also under the impression that sizeof(long double) was 10 on intel architectures with gcc. Both are apparently not true.

@Lindydancer 2018-02-23 15:14:04

The alignment bumps the size of types. For example, a struct with a short (on a machine where it is 2 bytes and 2-byte aligned) and a char (1 byte) has the size 4.
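The point of this exchange can be verified at compile time. A small sketch, where `ShortAndChar`, `Aligned16` and the helper template are names invented for the example:

```cpp
// A short followed by a char: the size is rounded up to the alignment.
struct ShortAndChar { short s; char c; };

// Raising the alignment with alignas also raises the size.
struct alignas(16) Aligned16 { char c; };

// True when sizeof(T) is a whole multiple of alignof(T).
template <typename T>
constexpr bool size_is_multiple_of_alignment() {
    return sizeof(T) % alignof(T) == 0;
}

static_assert(size_is_multiple_of_alignment<ShortAndChar>(),
              "array elements would otherwise be misaligned");
static_assert(size_is_multiple_of_alignment<Aligned16>(),
              "alignas changes sizeof as well");
static_assert(sizeof(Aligned16) >= alignof(Aligned16),
              "size never drops below alignment");
```

Because sizeof is always a multiple of alignof, every element of an array of T lands on a correctly aligned address.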

@Tim Seguine 2018-03-18 21:29:27

Yes, I see now that I was just wrong. By experimentation and just pure logic. I think my mistake was thinking sizeof was an intrinsic constant of the type. Since alignment can be changed arbitrarily, it seemed like the two could not be related.

@John Dibling 2011-03-22 20:43:33

Long story short, no. There is no platform-independent, Standard-conformant way to deal with padding.

Padding is called "alignment" in the Standard, and it begins discussing it in 3.9/5:

Object types have alignment requirements (3.9.1, 3.9.2). The alignment of a complete object type is an implementation-defined integer value representing a number of bytes; an object is allocated at an address that meets the alignment requirements of its object type.

But it goes on from there and winds off to many dark corners of the Standard. Alignment is "implementation-defined", meaning it can be different across different compilers, or even across address models (i.e. 32-bit/64-bit) under the same compiler.

Unless you have truly harsh performance requirements, you might consider storing your data to disk in a different format, such as character strings. Many high-performance protocols send everything using strings when the natural format might be something else. For example, a low-latency exchange feed I recently worked on sends dates as strings formatted like this: "20110321" and times are sent similarly: "141055.200". Even though this exchange feed sends 5 million messages per second all day long, they still use strings for everything because that way they can avoid endianness and other issues.
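As a rough sketch of the string approach, here is a date converted to and from the "YYYYMMDD" form mentioned above. The `Date` struct and the function names are assumptions made for this example and are not taken from any real feed; the parser also assumes well-formed input and does no validation.

```cpp
#include <cstdio>    // std::snprintf
#include <string>

struct Date { int year; int month; int day; };

// Format a date as eight digits, e.g. 2011-03-21 becomes "20110321".
inline std::string format_date(const Date& d) {
    char buf[16];
    std::snprintf(buf, sizeof buf, "%04d%02d%02d", d.year, d.month, d.day);
    return std::string(buf);
}

// Read `count` decimal digits starting at s (no validation).
constexpr int parse_digits(const char* s, int count) {
    return count == 0
        ? 0
        : parse_digits(s, count - 1) * 10 + (s[count - 1] - '0');
}

// Parse the "YYYYMMDD" form back into its three fields.
constexpr Date parse_date(const char* text) {
    return Date{ parse_digits(text, 4),
                 parse_digits(text + 4, 2),
                 parse_digits(text + 6, 2) };
}
```

The text on disk or on the wire has no padding and no byte order, so the same file reads back identically regardless of compiler or platform.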

@Erik 2011-03-22 20:32:41

No, there's no safe way. In addition to padding, you have to deal with different byte ordering, and different sizes of builtin types.

You need to define a file format, and convert your struct to and from that format. Serialization libraries (e.g. boost::serialization, or Google's Protocol Buffers) can help with this.
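If you do not want a library, a hand-rolled format could look something like this. It is only a sketch: the `Sample` struct, the 6-byte little-endian layout and the function names are all invented for the example.

```cpp
#include <cstddef>   // std::size_t
#include <cstdint>   // fixed-width integer types

// In-memory struct; its layout is whatever the compiler chooses.
struct Sample { std::uint32_t id; std::uint16_t flags; };

// File format (an assumption of this sketch): id as 4 bytes followed by
// flags as 2 bytes, both little-endian, with no padding.
constexpr std::size_t kEncodedSize = 6;
struct Encoded { unsigned char bytes[kEncodedSize]; };

// Convert to the file format one byte at a time, so neither padding nor
// host byte order can leak into the output.
constexpr Encoded encode(const Sample& s) {
    return Encoded{{
        static_cast<unsigned char>(s.id & 0xFF),
        static_cast<unsigned char>((s.id >> 8) & 0xFF),
        static_cast<unsigned char>((s.id >> 16) & 0xFF),
        static_cast<unsigned char>((s.id >> 24) & 0xFF),
        static_cast<unsigned char>(s.flags & 0xFF),
        static_cast<unsigned char>((s.flags >> 8) & 0xFF)
    }};
}

// Rebuild the struct from the file format.
constexpr Sample decode(const Encoded& e) {
    return Sample{
        static_cast<std::uint32_t>(e.bytes[0])
            | (static_cast<std::uint32_t>(e.bytes[1]) << 8)
            | (static_cast<std::uint32_t>(e.bytes[2]) << 16)
            | (static_cast<std::uint32_t>(e.bytes[3]) << 24),
        static_cast<std::uint16_t>(e.bytes[4] | (e.bytes[5] << 8))
    };
}
```

The bytes array can then be written with fwrite or std::ostream::write, and the result is the same on every platform because the struct itself never touches the file.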

@Thomas Matthews 2011-03-22 22:21:04

"The size of a structure (or class) may not be equal to the sum of the size of its members."

@Erik 2011-03-22 22:23:05

@Thomas: Exactly. And that's just the start of the fun.

@Yippie-Ki-Yay 2011-03-22 20:31:28

You could use something like boost::serialization.
