By M.M


2015-05-08 01:35:49 8 Comments

In C++, is this code correct?

#include <cstdlib>
#include <cstring>

struct T   // trivially copyable type
{
    int x, y;
};

int main()
{
    void *buf = std::malloc( sizeof(T) );
    if ( !buf ) return 0;

    T a{};
    std::memcpy(buf, &a, sizeof a);
    T *b = static_cast<T *>(buf);

    b->x = b->y;

    free(buf);
}

In other words, is *b an object whose lifetime has begun? (If so, when did it begin exactly?)


@Shafik Yaghmour 2015-05-08 02:41:48

This is unspecified, which is supported by N3751: Object Lifetime, Low-level Programming, and memcpy, which says, among other things:

The C++ standards is currently silent on whether the use of memcpy to copy object representation bytes is conceptually an assignment or an object construction. The difference does matter for semantics-based program analysis and transformation tools, as well as optimizers, tracking object lifetime. This paper suggests that

  1. uses of memcpy to copy the bytes of two distinct objects of two different trivial copyable tables (but otherwise of the same size) be allowed

  2. such uses are recognized as initialization, or more generally as (conceptually) object construction.

Recognition as object construction will support binary IO, while still permitting lifetime-based analyses and optimizers.

I cannot find any meeting minutes in which this paper is discussed, so it seems to still be an open issue.

The C++14 draft standard currently says in 1.8 [intro.object]:

[...]An object is created by a definition (3.1), by a new-expression (5.3.4) or by the implementation (12.2) when needed.[...]

None of these applies to the malloc call, and the cases the standard covers for copying trivially copyable types seem to refer only to already existing objects, in section 3.9 [basic.types]:

For any object (other than a base-class subobject) of trivially copyable type T, whether or not the object holds a valid value of type T, the underlying bytes (1.7) making up the object can be copied into an array of char or unsigned char. If the content of the array of char or unsigned char is copied back into the object, the object shall subsequently hold its original value[...]

and:

For any trivially copyable type T, if two pointers to T point to distinct T objects obj1 and obj2, where neither obj1 nor obj2 is a base-class subobject, if the underlying bytes (1.7) making up obj1 are copied into obj2, obj2 shall subsequently hold the same value as obj1.[...]

which is basically what the proposal says, so that should not be surprising.
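As a minimal sketch of the two [basic.types] guarantees quoted above (the function names are made up for illustration, and T is the type from the question), both copies below involve only objects that already exist:

```cpp
#include <cstring>
#include <type_traits>

struct T   // same trivially copyable type as in the question
{
    int x, y;
};

static_assert(std::is_trivially_copyable<T>::value,
              "the quoted guarantees only cover trivially copyable types");

// Round trip through an unsigned char array: after copying the bytes back,
// the object holds its original value again.
bool round_trip_preserves_value()
{
    T obj{1, 2};
    unsigned char bytes[sizeof(T)];

    std::memcpy(bytes, &obj, sizeof obj);   // object -> byte array
    obj.x = 0;                              // modify the object in between
    obj.y = 0;
    std::memcpy(&obj, bytes, sizeof obj);   // byte array -> same object

    return obj.x == 1 && obj.y == 2;
}

// Copy between two distinct T objects: obj2 ends up with the value of obj1.
bool copy_between_objects_preserves_value()
{
    T obj1{3, 4};
    T obj2{};                               // obj2 exists before the copy

    std::memcpy(&obj2, &obj1, sizeof obj1);

    return obj2.x == 3 && obj2.y == 4;
}
```

Neither function copies into storage where no T has been created, which is exactly the case the question asks about.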

dyp points out a fascinating discussion on this topic from the ub mailing list: [ub] Type punning to avoid copying.

Proposal P0593: Implicit creation of objects for low-level object manipulation

The proposal P0593 attempts to solve these issues but, AFAIK, has not been reviewed yet.

This paper proposes that objects of sufficiently trivial types be created on-demand as necessary within newly-allocated storage to give programs defined behavior.

It has some motivating examples which are similar in nature, including a current std::vector implementation that has undefined behavior.

It proposes the following ways to implicitly create an object:

We propose that at minimum the following operations be specified as implicitly creating objects:

  • Creation of an array of char, unsigned char, or std::byte implicitly creates objects within that array.

  • A call to malloc, calloc, realloc, or any function named operator new or operator new[] implicitly creates objects in its returned storage.

  • std::allocator::allocate likewise implicitly creates objects in its returned storage; the allocator requirements should require other allocator implementations to do the same.

  • A call to memmove behaves as if it

    • copies the source storage to a temporary area

    • implicitly creates objects in the destination storage, and then

    • copies the temporary storage to the destination storage.

    This permits memmove to preserve the types of trivially-copyable objects, or to be used to reinterpret a byte representation of one object as that of another object.

  • A call to memcpy behaves the same as a call to memmove except that it introduces an overlap restriction between the source and destination.

  • A class member access that nominates a union member triggers implicit object creation within the storage occupied by the union member. Note that this is not an entirely new rule: this permission already existed in [P0137R1] for cases where the member access is on the left side of an assignment, but is now generalized as part of this new framework. As explained below, this does not permit type punning through unions; rather, it merely permits the active union member to be changed by a class member access expression.

  • A new barrier operation (distinct from std::launder, which does not create objects) should be introduced to the standard library, with semantics equivalent to a memmove with the same source and destination storage. As a strawman, we suggest:

    // Requires: [start, (char*)start + length) denotes a region of allocated
    // storage that is a subset of the region of storage reachable through start.
    // Effects: implicitly creates objects within the denoted region.
    void std::bless(void *start, size_t length);
    

In addition to the above, an implementation-defined set of non-standard memory allocation and mapping functions, such as mmap on POSIX systems and VirtualAlloc on Windows systems, should be specified as implicitly creating objects.

Note that a pointer reinterpret_cast is not considered sufficient to trigger implicit object creation.
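To make the connection to the question concrete, here is a sketch of how the question's code would read if the rules quoted above applied as written. The function name is hypothetical, and the comments describe the proposed semantics, not anything the C++14 text guarantees:

```cpp
#include <cstdlib>
#include <cstring>

struct T   // trivially copyable type from the question
{
    int x, y;
};

// Returns the sum of the copied members, or -1 if allocation fails.
int copy_into_malloced_storage()
{
    // Under the proposal, malloc implicitly creates objects in the
    // returned storage as needed, so a T can come into existence here.
    void *buf = std::malloc(sizeof(T));
    if (!buf) return -1;

    T a{3, 4};

    // Under the proposal, memcpy behaves like memmove (plus the overlap
    // restriction) and implicitly creates objects in the destination.
    std::memcpy(buf, &a, sizeof a);

    // The cast creates nothing by itself; it only names the object that
    // the steps above are assumed to have created.
    T *b = static_cast<T *>(buf);

    int sum = b->x + b->y;
    std::free(buf);
    return sum;
}
```

On mainstream compilers this already behaves as expected; what the proposal changes is whether the standard actually promises that behavior.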

@dyp 2015-05-10 16:17:49

@Shafik Yaghmour 2015-05-10 17:03:58

@dyp wow, that is an awesome discussion, it is going to take a while to digest it but it is priceless, Thank you for pointing that out.

@dyp 2015-05-10 18:01:47

Unfortunately, it is incomplete as far as I can tell (the beginning is missing and the conclusion is vague at best IMHO).

@M.M 2015-06-10 23:46:28

I think you meant "not specified" rather than "unspecified" (the latter term has a specific meaning in the C++ standard) ?

@M.M 2015-06-10 23:57:18

Also I have a corollary question (not sure if it is worth posting this as a separate question or not); do you feel it would make any difference if T had a non-trivial default constructor? (But is still trivially-copyable).
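For reference, a type of the kind this corollary question describes could look like the sketch below (U is a hypothetical name). A user-provided default constructor makes default construction non-trivial but does not affect trivial copyability:

```cpp
#include <type_traits>

struct U   // hypothetical variant of T for the corollary question
{
    int x, y;
    U() : x(1), y(2) {}   // user-provided, therefore non-trivial
};

// Copy/move operations and the destructor are still implicit and trivial.
static_assert(std::is_trivially_copyable<U>::value,
              "U is still trivially copyable");
static_assert(!std::is_trivially_default_constructible<U>::value,
              "but U's default constructor is not trivial");
```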

@BeeOnRope 2017-10-22 20:42:01

For what it's worth, the N3751 paper is targeting quite a different use-case than @M.M's original code. The code above copies a T object into uninitialized storage (where no object exists) and asks if the storage now contains a T (the consensus based on the current standard seems to be "no"). The N3751 paper talks about copying between two existing differently typed objects from T to U and asks if it is legal to access U afterwards, and what the semantics are with respect to the lifetime of U.

@BeeOnRope 2017-10-22 20:45:51

The proposed remedy in N3751 is simply to allow this T to U copy and to treat it as construction (not assignment). The proposed language wouldn't apply to the case posted by the OP. At their core, I think the two cases have quite different motivations: the N3751 case seems largely motivated by "type punning": allowing cases where it is useful to examine or modify the "bits" of one object through the lens of a different type, which is useful for some low-level operations if you assume a certain representation for types, e.g., floats.
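A sketch of the kind of copy N3751 is concerned with, assuming a 32-bit IEEE-754 float (checked by the static_asserts); float_bits is a made-up name. Note that, unlike the question's code, the destination object already exists before the memcpy:

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559,
              "sketch assumes IEEE-754 floats");

// Examine the bits of a float "through the lens" of an unsigned integer.
std::uint32_t float_bits(float f)
{
    std::uint32_t u = 0;             // u is an existing uint32_t object
    std::memcpy(&u, &f, sizeof u);   // copy f's object representation into u
    return u;
}
```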

@BeeOnRope 2017-10-22 20:50:38

On the other hand, the "does memcpy create an object" question seems more motivated by general purpose manipulation of trivially copyable types. For example, it seems "obvious" that when std::vector needs to expand and copy its underlying storage consisting of trivially copyable T objects, it can simply allocate new uninitialized storage of a larger size, and memcpy the existing objects over (indeed the standard explicitly guarantees that such copies between two T objects are well-defined). It's not allowed though, because there is no T object yet in the uninitialized storage.

@BeeOnRope 2017-10-22 20:54:49

Now every standard library that I've checked simply does it via a direct memcpy anyways (as implementors, they can perhaps simply rely on the define-UB-however-you-want-it rule), but that's of little consolation for people who want to write portable code that uses the same idiom. For trivially constructible types you can probably work around it by calling placement new before copying, which the compiler will usually optimize away to nothing, but for types with non-trivial constructors there is no good general workaround. Maybe one day.
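The placement-new workaround mentioned above might be sketched as follows for a single object. The name copy_to_heap is hypothetical, and the sketch only applies to types that are both trivially default constructible and trivially copyable:

```cpp
#include <cstdlib>
#include <cstring>
#include <new>
#include <type_traits>

struct T
{
    int x, y;
};

static_assert(std::is_trivially_default_constructible<T>::value &&
              std::is_trivially_copyable<T>::value,
              "workaround assumes a trivial default constructor and trivial copy");

// Returns a malloc-allocated copy of src, or nullptr if allocation fails.
// The caller releases it with std::free (T is trivially destructible).
T *copy_to_heap(const T &src)
{
    void *buf = std::malloc(sizeof(T));
    if (!buf) return nullptr;

    // Start the lifetime of a T in the raw storage. The constructor is
    // trivial, so compilers typically emit no code for this line.
    T *dst = ::new (buf) T;

    // Now this is a copy between two existing T objects, which the
    // standard explicitly covers.
    std::memcpy(dst, &src, sizeof src);
    return dst;
}
```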

@Aconcagua 2018-06-18 10:42:58

Considering the vector, isn't such an operation (from the point of view of C++) moving rather than copying the objects to a new location in memory? If it were "copying", wouldn't we be required to call the destructors of the source objects? So the vector example does not seem applicable here to me. On the other hand, it shows that memcpy might legally be used for both moving and copy-initialising objects...

@curiousguy 2018-06-18 13:23:34

"An object is created by a definition (3.1), by a new-expression (5.3.4) or by the implementation (12.2) when needed" Nobody takes that statement seriously. It means that unions cannot be used. Actually the whole C++ object/storage model is a train wreck with obvious holes everywhere.

@Shafik Yaghmour 2018-06-18 15:45:24

@Remy Lebeau 2015-05-08 02:04:13

Is this code correct?

Well, it will usually "work", but only for trivial types.

I know you did not ask for it, but let's use an example with a non-trivial type:

#include <cstdlib>
#include <cstring>
#include <string>

struct T   // trivially copyable type
{
    std::string x, y;
};

int main()
{
    void *buf = std::malloc( sizeof(T) );
    if ( !buf ) return 0;

    T a{};
    a.x = "test";

    std::memcpy(buf, &a, sizeof a);    
    T *b = static_cast<T *>(buf);

    b->x = b->y;

    free(buf);
}

After constructing a, a.x is assigned a value. Let's assume that std::string is not optimized to use a local buffer for small string values, just a data pointer to an external memory block. The memcpy() copies the internal data of a as-is into buf. Now a.x and b->x refer to the same memory address for the string data. When b->x is assigned a new value, that memory block is freed, but a.x still refers to it. When a then goes out of scope at the end of main(), it tries to free the same memory block again. Undefined behavior occurs.

If you want to be "correct", the right way to construct an object in an existing memory block is to use the placement-new operator instead, e.g.:

#include <cstdlib>
#include <new>      // needed for placement new

struct T   // does not have to be trivially copyable
{
    // any members
};

int main()
{
    void *buf = std::malloc( sizeof(T) );
    if ( !buf ) return 0;

    T *b = new(buf) T; // <- placement-new
    // calls the T() constructor, which in turn calls
    // all member constructors...

    // b is a valid self-contained object,
    // use as needed...

    b->~T(); // <-- no placement-delete, must call the destructor explicitly
    std::free(buf);
}
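Applied to the std::string example, a sketch could copy-construct the object in the malloc'ed block instead of using memcpy. The struct is renamed S here to avoid confusion with the trivially copyable T, and the function name is made up:

```cpp
#include <cstdlib>
#include <new>
#include <string>

struct S   // not trivially copyable: it contains std::string members
{
    std::string x, y;
};

// Returns "a.x/b->x" after modifying the copy, or "" if allocation fails.
std::string copy_construct_in_malloced_storage()
{
    void *buf = std::malloc(sizeof(S));
    if (!buf) return std::string();

    S a{};
    a.x = "test";

    // Copy constructor instead of memcpy: b->x owns its own string data.
    // (If this constructor threw, buf would leak; a real version would
    // guard against that.)
    S *b = ::new (buf) S(a);

    b->x += "!";                              // does not touch a.x
    std::string result = a.x + "/" + b->x;

    b->~S();                                  // explicit destructor call
    std::free(buf);
    return result;
}
```

Because each object manages its own string storage, destroying b and later a does not free the same memory twice.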

@M.M 2015-05-08 02:05:18

How is the original code not correct, exactly?

@user1095108 2016-09-01 13:57:28

struct T containing a ::std::string is not trivially copyable in C++14 and onwards

@Ben Voigt 2018-02-07 21:44:56

An object containing a std::string has never been trivially copyable. It looks like a copy+paste mistake, the code in the question has a "trivially copyable" comment, and when the code was edited for the answer, the comment wasn't updated.
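This can be checked at compile time with the standard type trait. In the sketch below, the struct names are hypothetical stand-ins for the two versions of T in this thread:

```cpp
#include <string>
#include <type_traits>

struct WithStrings { std::string x, y; };   // T as written in this answer
struct WithInts    { int x, y; };           // T as written in the question

static_assert(!std::is_trivially_copyable<WithStrings>::value,
              "a struct with std::string members is not trivially copyable");
static_assert(std::is_trivially_copyable<WithInts>::value,
              "a struct of two ints is trivially copyable");
```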

@g3cko 2015-05-08 01:57:41

From a quick search.

"... lifetime begins when the properly-aligned storage for the object is allocated and ends when the storage is deallocated or reused by another object."

So, I would say by this definition, the lifetime begins with the allocation and ends with the free.

@M.M 2015-05-08 02:02:35

It seems a bit fishy to say that void *buf = malloc( sizeof(T) ) has created an object of type T. After all, it could equally well have created an object of any type whose size is sizeof(T); we don't yet know whether this code will go on to point T *b at it, or U *u, for example.

@g3cko 2015-05-08 02:23:47

That's a good point; since this is not based on the scoping rules, that made the most sense to me. The alternative, maybe *b is not an object, and only b is.

@nonsensickle 2015-05-08 02:23:49

Yes but does it matter? If it is a trivial constructor we know it is mostly a noop. So long as their size and alignment constraints are met, malloc can be considered a trivial constructor equivalent can it not?

@M.M 2015-05-08 02:25:43

@nonsensickle I'm hoping for a "language lawyer" quality answer, e.g. text from the C++ standard to support that malloc can be considered a trivial constructor

@Aaron McDaid 2015-07-21 17:22:25

@MattMcNabb, memory from malloc has "no declared type". stackoverflow.com/questions/31483064/… As such, its effective type can change many times through its lifetime; each time it is written to, it takes the type of the written data. In particular, that answer quotes how memcpy copies the effective type of the source data. But I guess that's C, not C++, and maybe it's different

@M.M 2015-07-22 03:39:21

@AaronMcDaid yes that's C and it is different

@curiousguy 2015-08-16 01:42:22

@MattMcNabb A trivial ctor call is trivial, equivalent to nothing. You can pretend it happened.

@curiousguy 2015-08-16 14:58:26

@AaronMcDaid This effective type invention is a mistake of the C committee.

@supercat 2015-09-15 19:05:17

@curiousguy: The Strict Aliasing Rule would be meaningless without the concept of "effective type". On the other hand, I consider the concept of type-based aliasing rules itself to be a mistake, since it simultaneously forces programmers to write inefficient code using memcpy or memmove and hope an optimizer can fix it, while failing to allow compilers to make what should be simple and easy optimizations in cases where a programmer knows (and could tell the compiler) that certain things won't alias.

@curiousguy 2015-09-15 20:49:28

@supercat With the C concept of effective type, even memcpy doesn't solve the issue of aliasing.

@supercat 2015-09-15 21:05:08

@curiousguy: I thought it did (which was the reason char got special treatment)? Though I'll admit I don't understand all the rules of what's legitimate and not, since the rules are horrible compared with what could be achieved by adding a __cache(x) {block} statement which would entitle a compiler to assume that the value of x will not be changed by any means outside the control of the attached block. Any compiler could be compatible with such a statement merely by having __cache(x) be a macro that expands to nothing, but it would allow compilers to do a lot of register...

@supercat 2015-09-15 21:05:58

...optimizations easily that would otherwise require full-program optimization [and in some cases involving indirect function calls can't be done even with FPO].

@curiousguy 2015-09-23 20:10:06

@supercat The issue is that "copying bytes" (which is not a well defined concept in the first place, as I have demonstrated in my C/C++ questions about pointers) is defined as transferring the effective type, which defeats the "use memcpy to avoid strict aliasing violation" concept.

@supercat 2015-09-23 20:18:26

@curiousguy: I think I gotcha. Sounds like the Committee discovered that there were so many cases where requiring memcpy to not invoke the Strict Aliasing Rule would impair useful optimizations that they decided the solution was to make it so memcpy could work but leave the rule in effect, rather than replace the Strict Aliasing Rule with some new qualifiers (which compilers could ignore if they didn't know how to benefit from them) which could offer far greater benefits with fewer dangers.
