By Kerrek SB


2013-02-18 11:55:30 8 Comments

The C++ standard contains a semi-famous example of "surprising" name lookup in 3.3.2, "Point of declaration":

int x = x;

This initializes x with itself. Since x has a primitive type, it is still uninitialized at that point and thus has an indeterminate value (assuming it is an automatic variable).

Is this actually undefined behaviour?

According to 4.1 "Lvalue-to-rvalue conversion", it is undefined behaviour to perform lvalue-to-rvalue conversion on an uninitialized value. Does the right-hand x undergo this conversion? If so, would the example actually have undefined behaviour?
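To separate the name-lookup rule from the question of undefined behaviour, here is a minimal sketch of a self-referencing initializer that reads no value. The function name is hypothetical and only serves the illustration. It relies on the same point-of-declaration rule as int x = x;, but only takes the address of the variable being declared.

```cpp
#include <cassert>

// Hypothetical helper: returns true if a variable's name already refers to
// the variable itself inside its own initializer.
bool self_reference_is_in_scope() {
    // The name p is in scope right after its declarator, so &p names the p
    // being declared. Taking an address does not read the object's value.
    void* p = &p;
    assert(p != nullptr);
    return p == static_cast<void*>(&p);
}
```

Calling self_reference_is_in_scope() should return true. The contrast with int x = x; is that here the right-hand side never needs the value of p.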

4 comments

@Johannes Schaub - litb 2013-07-07 16:52:31

Per 4p3 and 4p6, an implicit conversion sequence of an expression e to type T is defined as being equivalent to the following declaration, using t as the result of the conversion (modulo value category, which is determined by T):

T t = e;

The effect of any implicit conversion is the same as performing the corresponding declaration and initialization and then using the temporary variable as the result of the conversion.

In clause 4, the conversion of an expression to a type always yields expressions with a specific property. For example, conversion of 0 to int* yields a null pointer value, and not just one arbitrary pointer value. The value category, too, is a specific property of an expression, and for the result of a conversion it is defined as follows:

The result is an lvalue if T is an lvalue reference type or an rvalue reference to function type (8.3.2), an xvalue if T is an rvalue reference to object type, and a prvalue otherwise.

Hence we know that in int t = e;, the result of the conversion sequence is a prvalue, because int is a non-reference type. So if we provide a glvalue, a conversion is clearly needed. 3.10p2 states this explicitly, leaving no doubt:

Whenever a glvalue appears in a context where a prvalue is expected, the glvalue is converted to a prvalue; see 4.1, 4.2, and 4.3.
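The value categories involved can be inspected at compile time. The following is only a sketch under C++11; the category template and the variable n are hypothetical names introduced for illustration. It relies on the rule that decltype((e)) yields T& for an lvalue, T&& for an xvalue, and T for a prvalue.

```cpp
#include <utility>

// Hypothetical classifier: 0 = prvalue, 1 = lvalue, 2 = xvalue.
template <class T> struct category      { static constexpr int value = 0; };
template <class T> struct category<T&>  { static constexpr int value = 1; };
template <class T> struct category<T&&> { static constexpr int value = 2; };

int n = 0;

// The name of a variable is an lvalue (a glvalue).
static_assert(category<decltype((n))>::value == 1, "n is an lvalue");
// std::move(n) is an xvalue (also a glvalue).
static_assert(category<decltype((std::move(n)))>::value == 2, "xvalue");
// Converting to the non-reference type int yields a prvalue.
static_assert(category<decltype((static_cast<int>(n)))>::value == 0, "prvalue");
```

If all three assertions compile, the expression n is a glvalue while the result of converting it to int is a prvalue, which is the gap the lvalue-to-rvalue conversion has to bridge.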

@Kerrek SB 2013-07-07 18:04:34

Awesome, thanks!

@Kerrek SB 2013-07-07 18:05:50

(I'd love to give you a reward bounty, but the minimum bounty I can give is 300 -- am I that stingy or cheap? :-))

@Kerrek SB 2013-07-15 22:12:25

Check out this proposal.

@Johannes Schaub - litb 2013-07-15 23:08:35

@kerrek I already know that proposal. It is good that they are crafting clearer rules rather than relying on loose, casual English terms.

@ob1 2013-02-26 23:16:47

The behavior is not undefined. The variable is uninitialized and stays with whatever random value uninitialized variables start out with. One example from clang's test suite:

int test7b(int y) {
  int x = x; // expected-note{{variable 'x' is declared here}}
  if (y)
    x = 1;
  // Warn with "may be uninitialized" here (not "is sometimes uninitialized"),
  // since the self-initialization is intended to suppress a -Wuninitialized
  // warning.
  return x; // expected-warning{{variable 'x' may be uninitialized when used here}}
}

This test, which you can find in clang/test/Sema/uninit-variables.c, checks this case explicitly.

@M.M 2015-04-09 03:52:56

The behaviour is undefined according to the C++ standard. This means that compilers may do what they like, and your example shows what clang has chosen to do.

@Shafik Yaghmour 2015-10-20 19:52:52

"The variable is uninitialized and stays with whatever random value uninitialized values start up with ..." No, the compiler can do anything including optimizing the code away, see an example of clang doing so here.

@Andy Prowl 2013-02-20 23:20:02

UPDATE: Following the discussion in the comments, I added some more evidence at the end of this answer.


Disclaimer: I admit this answer is rather speculative. The current formulation of the C++11 Standard, on the other hand, does not seem to allow for a more formal answer.


In the context of this Q&A, it has emerged that the C++11 Standard fails to formally specify what value categories are expected by each language construct. In the following I will mostly focus on built-in operators, although the question is about initializers. Eventually, I will end up extending the conclusions I drew for the case of operators to the case of initializers.

In the case of built-in operators, despite the lack of a formal specification, (non-normative) evidence is found in the Standard that the intended specification is for prvalues to be expected wherever a value is needed, unless specified otherwise.

For instance, a note in Paragraph 3.10/1 says:

The discussion of each built-in operator in Clause 5 indicates the category of the value it yields and the value categories of the operands it expects. For example, the built-in assignment operators expect that the left operand is an lvalue and that the right operand is a prvalue and yield an lvalue as the result. User-defined operators are functions, and the categories of values they expect and yield are determined by their parameter and return types

Section 5.17 on assignment operators, on the other hand, does not mention this. However, the possibility of performing an lvalue-to-rvalue conversion is mentioned, again in a note (Paragraph 5.17/1):

Therefore, a function call shall not intervene between the lvalue-to-rvalue conversion and the side effect associated with any single compound assignment operator

Of course, if no rvalue were expected, this note would be meaningless.

Further evidence is found in 4/8, as pointed out by Johannes Schaub in the comments to the linked Q&A:

There are some contexts where certain conversions are suppressed. For example, the lvalue-to-rvalue conversion is not done on the operand of the unary & operator. Specific exceptions are given in the descriptions of those operators and contexts.

This seems to imply that lvalue-to-rvalue conversion is performed on all operands of built-in operators, except when specified otherwise. This would mean, in turn, that rvalues are expected as operands of built-in operators unless specified otherwise.
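A small sketch of contexts where the conversion is suppressed, so that an uninitialized object can be named without reading it. The function name is hypothetical; the only assumption is that unary & and sizeof behave as described in the quoted paragraph.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: names an uninitialized int only in contexts that do
// not read its value, then reads it after it has been written.
std::size_t touch_without_reading() {
    int x;                      // indeterminate value, not read yet
    int* p = &x;                // operand of unary &: no lvalue-to-rvalue conversion
    std::size_t n = sizeof(x);  // unevaluated operand: no read either
    assert(p == &x);
    *p = 42;                    // writing through the pointer is fine
    return (x == 42) ? n : 0;   // first read of x happens after the write
}
```

The call touch_without_reading() should return sizeof(int). By contrast, an operand of a built-in operator such as + would need the value of x and hence the conversion.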


CONJECTURE:

Even though initialization is not assignment, and therefore operators do not enter the discussion, my suspicion is that this area of the specification is affected by the very same problem described above.

Traces supporting this belief can be found even in Paragraph 8.5.2/5, about the initialization of references (for which the value of the lvalue initializer expression is not needed):

The usual lvalue-to-rvalue (4.1), array-to-pointer (4.2), and function-to-pointer (4.3) standard conversions are not needed, and therefore are suppressed, when such direct bindings to lvalues are done.

The word "usual" seems to imply that when initializing objects which are not of a reference type, lvalue-to-rvalue conversion is meant to apply.
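For contrast, here is a minimal sketch of the reference case, where no value is needed. The function name is hypothetical. Binding a reference directly to an uninitialized int does not read it, so the code stays well-defined as long as the first access through the reference is a write.

```cpp
#include <cassert>

// Hypothetical helper: binds a reference to an uninitialized int and then
// writes through it before anything reads the object.
int bind_then_write() {
    int x;            // indeterminate value
    int& r = x;       // direct binding: the value of x is not read
    assert(&r == &x); // r is just another name for x
    r = 7;            // first access is a write
    return x;         // now a well-defined read
}
```

Here bind_then_write() should return 7. Replacing int& r = x; with int r = x; would require the value of x, which is exactly the case under discussion.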

Therefore, I believe that although requirements on the expected value category of initializers are ill-specified (if not completely missing), on the grounds of the evidence provided it makes sense to assume that the intended specification is that:

Wherever a value is required by a language construct, a prvalue is expected unless specified otherwise.

Under this assumption, an lvalue-to-rvalue conversion would be required in your example, and that would lead to Undefined Behavior.


ADDITIONAL EVIDENCE:

Just to provide further evidence to support this conjecture, let's assume it is wrong, so that no lvalue-to-rvalue conversion is required for copy-initialization, and consider the following code (thanks to jogojapan for contributing):

int y;
int x = y; // No UB
short t;
int u = t; // UB! (Do not like this non-uniformity, but could accept it)
int z;
z = x; // No UB (x is not uninitialized)
z = y; // UB! (Assuming assignment operators expect a prvalue, see above)
       // This would be very counterintuitive, since x == y

This non-uniform behavior does not make a lot of sense to me. What makes more sense IMO is that wherever a value is required, a prvalue is expected.

Moreover, as Jesse Good correctly points out in his answer, the key Paragraph of the C++ Standard is 8.5/16:

— Otherwise, the initial value of the object being initialized is the (possibly converted) value of the initializer expression. Standard conversions (Clause 4) will be used, if necessary, to convert the initializer expression to the cv-unqualified version of the destination type; no user-defined conversions are considered. If the conversion cannot be done, the initialization is ill-formed. [ Note: An expression of type “cv1 T” can initialize an object of type “cv2 T” independently of the cv-qualifiers cv1 and cv2.

However, while Jesse mainly focuses on the "if necessary" bit, I would also like to stress the word "type". The paragraph above mentions that standard conversions will be used "if necessary" to convert to the destination type, but does not say anything about category conversions:

  1. Will category conversions be performed if needed?
  2. Are they needed?

As for the second question, as discussed in the original part of the answer, the C++11 Standard currently does not specify whether category conversions are needed, because nowhere does it mention whether copy-initialization expects a prvalue as an initializer. Thus, a clear-cut answer is impossible to give. However, I believe I provided enough evidence to assume this to be the intended specification, so that the answer would be "Yes".

As for the first question, it seems reasonable to me that the answer is "Yes" as well. If it were "No", obviously correct programs would be ill-formed:

int y = 0;
int x = y; // y is lvalue, prvalue expected (assuming the conjecture is correct)

To sum it up (A1 = "Answer to question 1", A2 = "Answer to question 2"):

          | A2 = Yes   | A2 = No |
 ---------|------------|---------|
 A1 = Yes |     UB     |  No UB  | 
 A1 = No  | ill-formed |  No UB  |
 ---------------------------------

If A2 is "No", A1 does not matter: there's no UB, but the bizarre situations of the first example (e.g. z = y giving UB, but not z = x even though x == y) show up. If A2 is "Yes", on the other hand, A1 becomes crucial; yet, enough evidence has been given to prove it would be "Yes".

Therefore, my thesis is that A1 = "Yes" and A2 = "Yes", and we should have Undefined Behavior.


FURTHER EVIDENCE:

This defect report (courtesy of Jesse Good) proposes a change that is aimed at giving Undefined Behavior in this case:

[...] In addition, 4.1 [conv.lval] paragraph 1 says that applying the lvalue-to-rvalue conversion to an “object [that] is uninitialized” results in undefined behavior; this should be rephrased in terms of an object with an indeterminate value.

In particular, the proposed wording for Paragraph 4.1 says:

When an lvalue-to-rvalue conversion occurs in an unevaluated operand or a subexpression thereof (Clause 5 [expr]) the value contained in the referenced object is not accessed. In all other cases, the result of the conversion is determined according to the following rules:

— If T is (possibly cv-qualified) std::nullptr_t, the result is a null pointer constant (4.10 [conv.ptr]).

— Otherwise, if the glvalue T has a class type, the conversion copy-initializes a temporary of type T from the glvalue and the result of the conversion is a prvalue for the temporary.

— Otherwise, if the object to which the glvalue refers contains an invalid pointer value (3.7.4.2 [basic.stc.dynamic.deallocation], 3.7.4.3 [basic.stc.dynamic.safety]), the behavior is implementation-defined.

— Otherwise, if T is a (possibly cv-qualified) unsigned character type (3.9.1 [basic.fundamental]), and the object to which the glvalue refers contains an indeterminate value (5.3.4 [expr.new], 8.5 [dcl.init], 12.6.2 [class.base.init]), and that object does not have automatic storage duration or the glvalue was the operand of a unary & operator or it was bound to a reference, the result is an unspecified value. [Footnote: The value may be different each time the lvalue-to-rvalue conversion is applied to the object. An unsigned char object with indeterminate value allocated to a register might trap. —end footnote]

Otherwise, if the object to which the glvalue refers contains an indeterminate value, the behavior is undefined.

— Otherwise, if the glvalue has (possibly cv-qualified) type std::nullptr_t, the prvalue result is a null pointer constant (4.10 [conv.ptr]). Otherwise, the value contained in the object indicated by the glvalue is the prvalue result.

@Kerrek SB 2013-02-20 23:52:33

Hm, there's a lot of talk about "operators" in your post, but my question has nothing to do with operators...

@Andy Prowl 2013-02-20 23:56:11

@KerrekSB: Yes, I'm aware of this. That's why I marked my answer as a "conjecture". My assumption is that in the same way that value category requirements were left unspecified for operators, they were left unspecified for initializers. And since the intended specification for operators is (EDIT: seems to be) that wherever a value is needed, a prvalue is expected unless specified otherwise, it makes sense IMO to make the same assumption for initializers. A purely formal answer to your question can't be given I'm afraid, because the Standard itself lacks a well-defined specification.

@jogojapan 2013-02-22 14:40:42

+1, clearly useful, even though I don't know whether the conjecture is correct.

@Andy Prowl 2013-02-22 14:42:26

@jogojapan: Thank you. Neither do I, which is why I called it a conjecture of course ;-) However, IMHO it makes much more sense to assume it true than false.

@Jesse Good 2013-02-22 23:46:28

Ok, deleted. Also, slightly related is defect report 616 and its related issues, but AFAICT it doesn't cover the OP's case.

@Andy Prowl 2013-02-23 00:12:32

@JesseGood: Actually that defect report seems to show that the intended behavior is to give UB: "if the object to which the glvalue refers contains an indeterminate value, the behavior is undefined." Or am I too biased by my own viewpoint?

@Jesse Good 2013-02-23 00:23:57

@AndyProwl: That part is true only if x is converted to a prvalue. It is still unclear whether that is the case (although I believe it isn't the case as it is not needed).

@Andy Prowl 2013-02-23 00:29:02

@JesseGood: True, it requires the conversion. I do believe it is (intended to be) needed, but again, this is about opinions.

@tc. 2013-02-23 01:22:08

Consider volatile int x; volatile int y=x;. What happens if x happens to be a trap representation?

@Andy Prowl 2013-02-23 01:29:29

@tc: Not sure what you mean. In the C++11 Standard, the word "trap" appears just a few times, and always in an unrelated context. Do you think this should not be UB? If so, why?

@tc. 2013-02-23 01:49:41

@AndyProwl I'm not intricately familiar with C++11, but in C99, -INT_MAX-1 (two's complement) or negative zero (ones' complement or sign-magnitude) are allowed to be "trap" representations, as are integers with incorrect padding bits. It's unclear why int x=y; would be valid but int x;x=y; would be UB (and if so, are compilers on such platforms required to take steps to ensure that the former doesn't trap?). Other curiosities are int x=x=x; (UB?) or volatile int x=x; (what memory accesses are required?).

@Andy Prowl 2013-02-23 01:58:40

@tc.: OK, I'm not familiar with C99, so this will be hard :-) However, I am advocating that both int x = y and int x; x = y should lead to UB (the latter certainly does, which is why I believe the former should as well). In my view, int x=x=x; would be UB as well of course. It seems to me that you share this view, or don't you? If not, why so?

@tc. 2013-02-23 02:23:52

@AndyProwl In C99, an "indeterminate value" is either "unspecified" (any valid value) or a trap representation, which suggests that they are UB in general but an unspecified value on common architectures (because int has no trap representations). However, in C, x=x=x; is UB because it writes to x twice, which suggests that int x=x=x; should be too (though x is only "modified" once); I think C++ differs, but I don't remember if it's only in the presence of operator overloading.

@jogojapan 2013-02-23 13:26:37

@AndyProwl For the short/int example: The original code I suggested was short x; int y = x;, i.e. it converted from short to int, not vice versa. I think this is better, because it requires an implicit conversion, but it avoids potential overflow situations, which complicate the discussion.

@Andy Prowl 2013-02-23 14:28:12

@jogojapan: Right, I'm going to edit it. Thank you!

@Jesse Good 2013-08-18 22:04:33

+1 because I finally agree that you were right after all this time!

@Andy Prowl 2013-08-19 16:02:30

@JesseGood: Thank you, but after all I think it's Johannes who gave the clear-cut answer (mine is more a collection of evidence) :)

@chenming 2013-02-18 12:46:26

This is not undefined behaviour. You just don't know its specific value, because there is no initialization. If the variable is global and of a built-in type, the compiler will initialize it for you. If the variable is local, the compiler does not initialize it. So initialize all variables yourself; don't rely on the compiler.

@Kerrek SB 2013-02-18 12:50:23

Assume it's an automatic variable - I updated the question.

@Arpit 2013-02-18 12:56:46

In the case of an automatic type it's an error: `variable 'auto x' with 'auto' type used in its own initializer`

@Kerrek SB 2013-02-18 12:58:03

@Arpit: There's no auto in the question (and that's not what "automatic" means!).

@Arpit 2013-02-18 13:00:13

Oh! I just took "automatic variable" to mean the auto type. My mistake.

@Kerrek SB 2013-02-18 13:25:49

@Arpit: auto isn't a type. It's a keyword.

@Arpit 2013-02-18 13:28:52

@KerrekSB Don't be so serious. :) I know it's a type specifier.

@chenming 2013-02-18 14:11:10

It seems that all variables are automatic variables by default; "auto" can be omitted.
