By max1000001


2019-03-14 19:23:08 8 Comments

Given the following program:

#include <stdio.h>
int main(void)
{
    int i = 1, j = 2;
    int val = (++i > ++j) ? ++i : ++j;
    printf("%d\n", val); // prints 4
    return 0;
}

The initialization of val seems like it could be hiding some undefined behavior, but I don't see any point at which an object is either modified more than once or modified and used without a sequence point in between. Could someone either correct or corroborate me on this?

4 comments

@Damon 2019-03-15 12:53:58

I was going to comment on @Doug Currie's answer that signed integer overflow was a bit too far-fetched, although technically correct as an answer. On the contrary!

On second thought, I think Doug's answer is not only correct but, assuming something less trivial than the three-liner in the example (a program with a loop, say), should be extended to a clear, definite "yes". Here's why:

The compiler sees int i = 1, j = 2;, so it knows that ++i will be equal to j and thus cannot possibly be larger than j or even ++j. Modern optimizers see such trivial things.

Unless, of course, one of them overflows. But the optimizer knows that this would be UB, so it assumes overflow never happens and optimizes accordingly.

So the ternary operator's condition is always-false (in this easy example certainly, but even if invoked repeatedly in a loop this would be the case!), and i will only ever be incremented once, whereas j will always be incremented twice. Thus not only is j always larger than i, it even gains at every iteration (until overflow happens, but this never happens per our assumption).

Thus, the optimizer is allowed to turn this into ++i; j += 2; unconditionally, which surely isn't what one would expect.
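As a rough sketch of the rewrite described above, the two functions below are meant to be interchangeable under the stated assumptions. The names step_original and step_simplified are hypothetical, the pointers are assumed not to alias, and the equivalence only holds when *i <= *j on entry and no increment overflows.

```c
/* Original form: evaluates the conditional expression from the question once.
   Assumes i and j point to distinct objects. */
static int step_original(int *i, int *j)
{
    return (++*i > ++*j) ? ++*i : ++*j;
}

/* Hypothetical simplified form an optimizer could substitute.
   Valid only if *i <= *j on entry and no increment overflows:
   the condition is then always false, so i gains 1 and j gains 2. */
static int step_simplified(int *i, int *j)
{
    *i += 1;
    *j += 2;
    return *j;
}
```

Starting both from i = 1, j = 2 and calling them repeatedly should leave the two pairs of variables in the same state after every call.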

The same applies for e.g. a loop with unknown values of i and j, such as user-supplied input. The optimizer might very well recognize that the sequence of operations only depends on the initial values of i and j. Thus, the sequence of increments followed by a conditional move can be optimized by duplicating the loop, once for each case, and switching between the two with a single if(i>j). And then, while we're at it, it might fold the loop of repeated increment-by-twos into something like (j-i)<<1 which it just adds. Or something.
Under the assumption that overflow never happens (which is the assumption the optimizer is allowed to make, and does make), such a modification, which may completely change the entire sense and mode of operation of the program, is perfectly fine.

Try and debug that.

@alinsoar 2019-03-14 20:39:07

Too late, but maybe useful.

(++i > ++j) ? ++i : ++j;

In the document ISO/IEC 9899:201x, Annex C (informative), Sequence points, we find that there is a sequence point

Between the evaluations of the first operand of the conditional ?: operator and whichever of the second and third operands is evaluated

For the behavior to be well defined, the same object must not be modified twice (via side effects) between two sequence points.

In your expression the only conflict that could appear would be between the first and second ++i or ++j.

At every sequence point, the value last stored in the object shall agree with that prescribed by the abstract machine (this is what you would compute on paper, as on a Turing machine).

Quote from 5.1.2.3p3 Program execution

The presence of a sequence point between the evaluation of expressions A and B implies that every value computation and side effect associated with A is sequenced before every value computation and side effect associated with B.

When you have side-effects in your code, they are sequenced by different expressions. The rule says that between 2 sequence points you can permute these expressions as you wish.

For example, i = i++. Because none of the operators involved in this expression introduces a sequence point, the two side effects on i are unsequenced and the behavior is undefined. An implementation may, for instance, behave like any of these sequences:

i = i; i = i+1; or i = i+1; i = i; or tmp = i; i = i+1; i = tmp; or tmp = i; i = tmp; i = i+1; or like anything else. The ISO 9899 standard defines the C language in terms of abstract semantics.

@max1000001 2019-03-14 20:46:07

I think the part where you enumerate the modifications to i and j as "possible conflicts" adds something new and useful to the analysis. I hadn't thought of that, thanks!

@Doug Currie 2019-03-14 20:51:52

There may be no UB in your program, but in the question: Does the statement int val = (++i > ++j) ? ++i : ++j; invoke undefined behavior?

The answer is yes. Either or both of the increment operations may overflow, since i and j are signed, in which case all bets are off.

Of course this doesn't happen in your full example because you've specified the values as small integers.
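For values that are not known in advance, one way to rule out the overflow case is to check the operands before evaluating the expression. The sketch below is only an illustration: checked_select is a hypothetical helper, not something from the question, and it assumes i and j point to distinct objects.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: evaluates (++i > ++j) ? ++i : ++j only when
   none of the increments can overflow. Returns false and leaves
   *i, *j and *out untouched otherwise. */
static bool checked_select(int *i, int *j, int *out)
{
    /* The first operand increments both variables once. */
    if (*i == INT_MAX || *j == INT_MAX)
        return false;

    /* (*i > *j) has the same truth value as (*i + 1 > *j + 1) here,
       so it tells us which variable gets the second increment. */
    if (*i > *j) {
        if (*i > INT_MAX - 2)
            return false;   /* second ++i would overflow */
    } else {
        if (*j > INT_MAX - 2)
            return false;   /* second ++j would overflow */
    }

    *out = (++*i > ++*j) ? ++*i : ++*j;
    return true;
}
```

With i = 1 and j = 2 the helper should succeed and store 4; with either variable at INT_MAX it should refuse and change nothing.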

@machine_1 2019-03-14 21:48:38

I assure you that the question is not about signed integer overflow. It's about whether there is a sequence point between the first operand of the ternary operator and whichever wins from the second and the third operands.

@eckes 2019-03-15 00:15:48

The question was about "some undefined behavior", and reminders about data types being implementation specific are totally appropriate for such an open question. And a signed integer overflow is UB.

@Holger 2019-03-15 07:45:23

@eckes but the question was “Does the …”, so an unconditional “yes” is a wrong answer. If the question was “Can the …” or “May the …”, the answer would be correct.

@Damon 2019-03-15 12:54:30

Was going to complain, but on a second thought +1 :)

@dbush 2019-03-14 19:40:15

The behavior of this code is well defined.

The first expression in a conditional is guaranteed to be evaluated before either the second expression or the third expression, and only one of the second or third will be evaluated. This is described in section 6.5.15p4 of the C standard:

The first operand is evaluated; there is a sequence point between its evaluation and the evaluation of the second or third operand (whichever is evaluated). The second operand is evaluated only if the first compares unequal to 0; the third operand is evaluated only if the first compares equal to 0; the result is the value of the second or third operand (whichever is evaluated), converted to the type described below.

In the case of your expression:

int val = (++i > ++j) ? ++i : ++j;

++i > ++j is evaluated first. The incremented values of i and j are used in the comparison, so it becomes 2 > 3. The result is false, so then ++j is evaluated and ++i is not. So the (again) incremented value of j (i.e. 4) is then assigned to val.
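The evaluation order described above can be spelled out with one statement per step. This is only a sketch: select_explicit is a hypothetical name, the pointers are assumed not to alias, and incrementing i before j in the first operand is just one of the orders the standard permits, which makes no difference here because they are distinct objects.

```c
/* Hypothetical step-by-step equivalent of
   (++i > ++j) ? ++i : ++j, assuming i and j are distinct objects. */
static int select_explicit(int *i, int *j)
{
    /* First operand: both increments, then the comparison. */
    *i += 1;
    *j += 1;
    int cond = (*i > *j);

    /* Sequence point: all side effects of the first operand are complete. */

    if (cond) {
        *i += 1;        /* second operand: ++i */
        return *i;
    }
    *j += 1;            /* third operand: ++j */
    return *j;
}
```

Called with i = 1 and j = 2, this compares 2 > 3, takes the third operand, and returns 4, leaving i at 2 and j at 4.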

@max1000001 2019-03-14 19:46:12

Thank you for your detailed and direct answer!
