By Channel72


2010-10-15 17:16:11 8 Comments

At first glance, this question may seem like a duplicate of How to detect integer overflow?, however it is actually significantly different.

I've found that while detecting an unsigned integer overflow is pretty trivial, detecting a signed overflow in C/C++ is actually more difficult than most people think.

The most obvious, yet naive, way to do it would be something like:

int add(int lhs, int rhs)
{
 int sum = lhs + rhs;
 if ((lhs >= 0 && sum < rhs) || (lhs < 0 && sum > rhs)) {
  /* an overflow has occurred */
  abort();
 }
 return sum; 
}

The problem with this is that according to the C standard, signed integer overflow is undefined behavior. In other words, according to the standard, as soon as you even cause a signed overflow, your program is just as invalid as if you dereferenced a null pointer. So you can't cause undefined behavior, and then try to detect the overflow after the fact, as in the above post-condition check example.

Even though the above check is likely to work on many compilers, you can't count on it. In fact, because the C standard says signed integer overflow is undefined, some compilers (like GCC) will optimize away the above check when optimization flags are set, because the compiler assumes a signed overflow is impossible. This totally breaks the attempt to check for overflow.

So, another possible way to check for overflow would be:

int add(int lhs, int rhs)
{
 if (lhs >= 0 && rhs >= 0) {
  if (INT_MAX - lhs <= rhs) {
   /* overflow has occurred */
   abort();
  }
 }
 else if (lhs < 0 && rhs < 0) {
  if (lhs <= INT_MIN - rhs) {
   /* overflow has occurred */
   abort();
  }
 }

 return lhs + rhs;
}

This seems more promising, since we don't actually add the two integers together until we make sure in advance that performing such an add will not result in overflow. Thus, we don't cause any undefined behavior.

However, this solution is unfortunately a lot less efficient than the initial solution, since you have to perform a subtract operation just to test if your addition operation will work. And even if you don't care about this (small) performance hit, I'm still not entirely convinced this solution is adequate. The expression lhs <= INT_MIN - rhs seems exactly like the sort of expression the compiler might optimize away, thinking that signed overflow is impossible.

So is there a better solution here? Something that is guaranteed to 1) not cause undefined behavior, and 2) not provide the compiler with an opportunity to optimize away overflow checks? I was thinking there might be some way to do it by casting both operands to unsigned, and performing checks by rolling your own two's-complement arithmetic, but I'm not really sure how to do that.

12 comments

@tbodt 2017-06-29 16:35:43

The fastest possible way is to use the GCC builtin:

int add(int lhs, int rhs) {
    int sum;
    if (__builtin_add_overflow(lhs, rhs, &sum))
        abort();
    return sum;
}

On x86, GCC compiles this into:

    mov %edi, %eax
    add %esi, %eax
    jo call_abort 
    ret
call_abort:
    call abort

which uses the processor's built-in overflow detection.

If you're not OK with using GCC builtins, the next fastest way is to use bit operations on the sign bits. Signed overflow in addition occurs when:

  • the two operands have the same sign, and
  • the result has a different sign than the operands.

The sign bit of ~(lhs ^ rhs) is on iff the operands have the same sign, and the sign bit of lhs ^ sum is on iff the result has a different sign than the operands. So you can do the addition in unsigned form to avoid undefined behavior, and then use the sign bit of ~(lhs ^ rhs) & (lhs ^ sum):

int add(int lhs, int rhs) {
    unsigned sum = (unsigned) lhs + (unsigned) rhs;
    if ((~(lhs ^ rhs) & (lhs ^ sum)) & 0x80000000)
        abort();
    return (int) sum;
}

This compiles into:

    lea (%rsi,%rdi), %eax
    xor %edi, %esi
    not %esi
    xor %eax, %edi
    test %edi, %esi
    js call_abort
    ret
call_abort:
    call abort

which is quite a lot faster than casting to a 64-bit type on a 32-bit machine (with gcc):

    push %ebx
    mov 12(%esp), %ecx
    mov 8(%esp), %eax
    mov %ecx, %ebx
    sar $31, %ebx
    clt
    add %ecx, %eax
    adc %ebx, %edx
    mov %eax, %ecx
    add $-2147483648, %ecx
    mov %edx, %ebx
    adc $0, %ebx
    cmp $0, %ebx
    ja call_abort
    pop %ebx
    ret
call_abort:
    call abort

@atomsymbol 2017-06-06 13:18:16

In case of adding two long values, portable code can split the long value into low and high int parts (or into short parts in case long has the same size as int):

static_assert(sizeof(long) == 2*sizeof(int), "");
long a, b;
int ai[2] = {int(a), int(a >> (8*sizeof(int)))};
int bi[2] = {int(b), int(b >> (8*sizeof(int))});
... use the 'long' type to add the elements of 'ai' and 'bi'

Using inline assembly is the fastest way if targeting a particular CPU:

long a, b;
bool overflow;
#ifdef __amd64__
    asm (
        "addq %2, %0; seto %1"
        : "+r" (a), "=ro" (overflow)
        : "ro" (b)
    );
#else
    #error "unsupported CPU"
#endif
if(overflow) ...
// The result is stored in variable 'a'

@Jens Gustedt 2010-10-16 06:41:18

No, your 2nd code isn't correct, but you are close: if you set

int half = INT_MAX/2;
int half1 = half + 1;

the result of an addition is INT_MAX. (INT_MAX is always an odd number). So this is valid input. But in your routine you will have INT_MAX - half == half1 and you would abort. A false positive.

This error can be repaired by putting < instead of <= in both checks.

But then also your code isn't optimal. The following would do:

int add(int lhs, int rhs)
{
 if (lhs >= 0) {
  if (INT_MAX - lhs < rhs) {
   /* would overflow */
   abort();
  }
 }
 else {
  if (rhs < INT_MIN - lhs) {
   /* would overflow */
   abort();
  }
 }
 return lhs + rhs;
}

To see that this is valid, you have to symbolically add lhs on both sides of the inequalities, and this gives you exactly the arithmetical conditions that your result is out of bounds.

@chux - Reinstate Monica 2014-08-17 15:51:22

+1 for the best answer. Minor: suggest /* overflow will occurred */ to emphasize that the whole point is to detect that overflow would have occurred if code did lhs + rhs without doing actually the sum.

@Shafik Yaghmour 2015-08-31 18:18:15

For the gcc case, from gcc 5.0 Release notes we can see it now provides a __builtin_add_overflow for checking overflow in addition:

A new set of built-in functions for arithmetics with overflow checking has been added: __builtin_add_overflow, __builtin_sub_overflow and __builtin_mul_overflow and for compatibility with clang also other variants. These builtins have two integral arguments (which don't need to have the same type), the arguments are extended to infinite precision signed type, +, - or * is performed on those, and the result is stored in an integer variable pointed to by the last argument. If the stored value is equal to the infinite precision result, the built-in functions return false, otherwise true. The type of the integer variable that will hold the result can be different from the types of the first two arguments.

For example:

__builtin_add_overflow( rhs, lhs, &result )

We can see from the gcc document Built-in Functions to Perform Arithmetic with Overflow Checking that:

[...]these built-in functions have fully defined behavior for all argument values.

clang also provides a set of checked arithmetic builtins:

Clang provides a set of builtins that implement checked arithmetic for security critical applications in a manner that is fast and easily expressable in C.

in this case the builtin would be:

__builtin_sadd_overflow( rhs, lhs, &result )

@chux - Reinstate Monica 2015-08-31 18:38:00

This function appears to be very useful except for one thing: int result; __builtin_add_overflow(INT_MAX, 1, &result); does not explicitly say what is stored in result on overflow and unfortunately is quiet on specifying undefined behavior does not occur. Certainly that was the intent - no UB. Better if it specified that.

@Shafik Yaghmour 2015-08-31 18:42:31

@chux good point, it states here the result is always defined, I updated my answer. It would be rather ironic if that was not the case.

@chux - Reinstate Monica 2015-08-31 19:40:58

Interesting your new reference does not have a (unsigned) long long *result for __builtin_(s/u)addll_overflow. Certainly these are in error. Makes one wonder as to the veracity of other aspects. IAC, good to see these __builtin_add/sub/mull_overflow(). Hope they make it to the C spec someday.

@Alex Reinking 2019-05-08 10:41:16

+1 this generates much better assembly than anything you could get in standard C, at least not without relying on your compiler's optimizer to figure out what you're doing. One should detect when such builtins are available and only use a standard solution when the compiler doesn't provide one.

@supercat 2010-10-15 20:47:32

How about:

int sum(int n1, int n2)
{
  int result;
  if (n1 >= 0)
  {
    result = (n1 - INT_MAX)+n2; /* Can't overflow */
    if (result > 0) return INT_MAX; else return (result + INT_MAX);
  }
  else
  {
    result = (n1 - INT_MIN)+n2; /* Can't overflow */
    if (0 > result) return INT_MIN; else return (result + INT_MIN);
  }
}

I think that should work for any legitimate INT_MIN and INT_MAX (symmetrical or not); the function as shown clips, but it should be obvious how to get other behaviors).

@R.. 2010-10-16 07:14:41

+1 for nice alternate approach that's perhaps more intuitive.

@davmac 2013-08-05 14:42:36

I think this - result = (n1 - INT_MAX)+n2; - could overflow, if n1 was small (say 0) and n2 was negative.

@supercat 2016-09-06 20:51:24

@davmac: Hmm... maybe it's necessary to break out three cases: start with one for (n1 ^ n2) < 0, which on a two's-complement machine would imply that the values have opposite sign and may be added directly. If the values have the same sign, then the approach given above would be safe. On the other hand, I'm curious if the authors of the Standard expected that implementations for two's-complement silent-overflow hardware would jump the rails in case of overflow in a way that didn't force an immediate Abnormal Program Termination, but caused unpredictable disruption of other computations.

@Chris Dodd 2010-10-15 22:22:32

The obvious solution is to convert to unsigned, to get the well-defined unsigned overflow behavior:

int add(int lhs, int rhs) 
{ 
   int sum = (unsigned)lhs + (unsigned)rhs; 
   if ((lhs >= 0 && sum < rhs) || (lhs < 0 && sum > rhs)) { 
      /* an overflow has occurred */ 
      abort(); 
   } 
   return sum;  
} 

This replaces the undefined signed overflow behavior with the implementation-defined conversion of out-of-range values between signed and unsigned, so you need to check your compiler's documentation to know exactly what will happen, but it should at least be well defined, and should do the right thing on any twos-complement machine that doesn't raise signals on conversions, which is pretty much every machine and C compiler built in the last 20 years.

@R.. 2010-10-16 04:56:46

You're still storing the result in sum, which is an int. That results in either an implementation-defined result or an implementation-defined signal being raised if the value of (unsigned)lhs + (unsigned)rhs is greater than INT_MAX.

@Chris Dodd 2010-10-16 06:18:48

@R: that's the whole point -- the behavior is implementation defined, rather than undefined, so the implementation must document what it does, and do it consistently. A signal can only be raised if the implementation documents it, in which case it must always be raised and you can use that behavior.

@rook 2010-10-15 17:32:31

If you use inline assembler you can check the overflow flag. Another possibility is taht you can use a safeint datatype. I recommend that read this paper on Integer Security.

@Mike DeSimone 2010-10-15 17:39:15

+1 This is another way of saying "If C won't define it, then you're forced into platform-specific behavior." So many things that are easily taken care of in assembly are undefined in C, creating mountains out of molehills in the name of portability.

@rook 2010-10-15 17:51:15

@Mike DeSimone eloquent.

@R.. 2010-10-15 18:00:54

Signed overflow if perfectly detectable and avoidable in plain C. -1 for suggesting asm instead of suggesting C code that will generate the correct asm on a good compiler.

@rook 2010-10-15 18:02:21

@R.. but if it is a valid solution then why would down vote?

@wheaties 2010-10-15 18:17:23

@R I have to agree with Rook.

@Matthieu M. 2010-10-15 18:26:00

Thanks for the paper :)

@Matthieu M. 2010-10-15 18:28:13

@R. perfectly detectable yes, however for integers operations one needs to care about performance. Many libraries are quite intensive in their integer usage (compression / decompression, encoding / decoding), it would be great to be able to use safe integer there too.

@R.. 2010-10-15 19:22:02

I gave a downvote for an asm answer to a C question. As I have said, there are correct, portable ways to write the check in C which will generate the exact same asm that you would write by hand. Naturally if you use these, the performance impact will be the same, and it will be much less of an impact than the C++ safeint stuff you also recommended.

@David Thornley 2010-10-15 20:25:59

@Matthieu: If you are writing code that will only be used on one implementation, and that implementation guarantees that something will work, and you need good integer performance, you can certainly use implementation-specific tricks. That isn't what the OP was asking for, though.

@Amir Zadeh 2010-10-15 20:29:12

Agreed with rook. I guess overflow flags are present in many platforms.

@R.. 2010-10-15 21:04:21

C distinguishes implementation-defined behavior and undefined behavior for good reasons, and even if something with UB "works" in the current version of your implementation, that doesn't mean it will continue to work in future versions. Consider gcc and signed overflow behavior...

@R.. 2010-10-16 00:29:35

Since I based my -1 on a claim that we could get C code to generate the identical asm, I guess it's only fair to retract it when all the major compilers turn out to be junk in this regard..

@supercat 2017-06-06 21:34:23

Defining "did last operation overflow" flag behavior can be very expensive, since optimal code generation often requires rearranging computations. If the expression x+y is evaluated within a loop but neither x nor y changes within that loop, it may be helpful for a compiler to evaluate that expression just once--before the loop starts. Such reordering could totally disrupt any code that tries to examine the overflow flag, however.

@ruslik 2010-10-15 20:46:35

By me, the simpliest check would be checking the signs of the operands and of the results.

Let's examine sum: the overflow could occur in both directions, + or -, only when both operands have the same sign. And, obviosly, the overflow will be when the sign of the result won't be the same as the sign of the operands.

So, a check like this will be enough:

int a, b, sum;
sum = a + b;
if  (((a ^ ~b) & (a ^ sum)) & 0x80000000)
    detect_oveflow();

Edit: as Nils suggested, this is the correct if condition:

((((unsigned int)a ^ ~(unsigned int)b) & ((unsigned int)a ^ (unsigned int)sum)) & 0x80000000)

And since when the instruction

add eax, ebx 

leads to undefined behavior? There is no such thing in the Intel x86 instruction set refference..

@Channel72 2010-10-15 21:12:35

You're missing the point here. Your second line of code sum = a + b could yield undefined behavior.

@Nils Pipenbrinck 2010-10-15 21:27:45

if you cast sum, a and b to unsigned during your test-addition your code will work btw..

@ruslik 2010-10-15 21:30:24

It's undefined not because the program will crash or will behave differently. It's the exact thing the processor is doing to compute the OF flag. The standard is just trying to protect itself from non-standard cases, but it doesn't mean that you are not allowed to do this.

@ruslik 2010-10-15 21:31:20

@Nils yeah, i wanted to do that, but I thought that four (usngined int)s will make it much more unreadable. (you know, you first read it, and try it only if you liked it).

@R.. 2010-10-16 04:55:16

Well you could make it more readable by taking out the useless int keywords. unsigned int is just an overly verbose way of writing unsigned.

@davmac 2013-08-05 14:39:03

@ruslik, it's undefined which specifically means the compiler can assume it won't happen. In this case the compiler can assume that the addition won't overflow, and a really smart compiler could then recognise that the condition in the 'if' statement must then always be false and so completely remove the check; you're back to the original problem.

@phuclv 2015-07-02 16:39:18

the undefined behavior is in C, not after compiling to assembly

@R.. 2010-10-15 17:57:06

Your approach with subtraction is correct and well-defined. A compiler cannot optimize it away.

Another correct approach, if you have a larger integer type available, is to perform the arithmetic in the larger type and then check that the result fits in the smaller type when converting it back

int sum(int a, int b)
{
    long long c;
    assert(LLONG_MAX>INT_MAX);
    c = (long long)a + b;
    if (c < INT_MIN || c > INT_MAX) abort();
    return c;
}

A good compiler should convert the entire addition and if statement into an int-sized addition and a single conditional jump-on-overflow and never actually perform the larger addition.

Edit: As Stephen pointed out, I'm having trouble getting a (not-so-good) compiler, gcc, to generate the sane asm. The code it generates is not terribly slow, but certainly suboptimal. If anyone knows variants on this code that will get gcc to do the right thing, I'd love to see them.

@R.. 2010-10-15 18:11:59

To anyone wanting to use this, be sure you're looking at my edited version. In the original I stupidly omitted the cast to long long before the addition.

@Mike DeSimone 2010-10-15 18:51:22

Then, by your definition gcc 4.4.4 is not a good compiler. I just tried this at -O2 and -O3 and got the same code, 12 lines of assembly (not counting assembler commands) where the actual addition is performed by leaq (%rax,%rcx), %rcx, or addl %eax, %ecx; adcl %edx, %ebx if the -m32 flag is given.

@Mike DeSimone 2010-10-15 18:52:58

Also, gcc didn't remove the if statement from the generated code.

@Channel72 2010-10-15 19:04:36

@R, are you certain that the approach with subtraction is correct? Since signed-overflow is undefined, isn't it possible the compiler could see the expression (INT_MAX - lhs <= rhs) and optimize it away, thinking that such an expression must always be true?

@R.. 2010-10-15 19:12:56

Yes I am certain. There is no overflow in the expression INT_MAX - lhs <= rhs (assuming lhs is positive). Depending on the values of lhs and rhs, this expression can evaluate to 0 or 1. A compiler cannot optimize away the computation of expressions with nonconstant values. It could, of course, optimize this to an addition followed by an overflow check, if the machine supports such a thing.

@Stephen Canon 2010-10-15 19:40:15

Out of curiosity, have you been successful in getting a compiler to do this optimization? A quick test against a few compilers didn't turn up any that could do it.

@David Thornley 2010-10-15 20:23:32

@Channel72: Also importantly, none of the subexpressions can overflow. If lhs is an int and nonnegative, INT_MAX - lhs yields a value between 0 and INT_MAX, inclusive, and that's perfectly legal, so nothing about this triggers undefined behavior.

@ruslik 2010-10-15 20:37:18

What I can't understand is why someone who is worried about int32 overflows on the 64bit machine will continue to use int32? And using this in 32bit mode will be quite inneficient.

@R.. 2010-10-15 20:57:05

On x86_64, there's nothing inefficient about using 32-bit integers. Performance is identical to 64-bit ones. One motivation for using smaller-than-native-word-size types is that it's extremely efficient to handle overflow conditions or carry (for arbitrary-precision arithmetic) since the overflow/carry happens in a directly-accessible location.

@R.. 2010-10-15 21:00:57

@Stephen: Oddly I can't get it to work at the moment either, but I've seen plenty of cases where gcc optimizes out unnecessary 64-bit arithmetic like this. I wonder if it's too stupid to figure out how to do it for signed values; I normally only use unsigned arithmetic for things like this.

@Jens Gustedt 2010-10-16 06:46:11

@R., @Steven: no the subtraction code that the OP gave is not correct, please see my answer. I also give a code there that just does it with at most two comparisons. Perhaps the compilers will do better with that.

@Jens Gustedt 2010-10-16 08:24:09

@R.: after looking into it more closely it is also not wonder that gcc doesn't get away with just one jump on overflow. In fact, the conditions for overflow and underflow are in general not symmetric: on nowadays architectures usually INT_MIN == (-INT_MAX) - 1. So the two bound checks must be different.

@Jens Gustedt 2010-10-16 08:36:32

@R.: after thinking it over once more, I am not even convinced that your proposal of blowing it up to a wider type is worth it. Just use the corresponding unsigned type.

@R.. 2010-10-16 15:47:52

@Jens: Which unsigned approach? There are lots and I'm not sure which is most efficient for handling signed inputs.

@R.. 2010-10-16 15:51:20

@Jens: As for gcc, x86 has a perfectly good jo instruction (jump on overflow) it should be using.

@Pacerier 2013-09-22 18:00:38

@R.. so with two choices 1) using subtration and 2) converting to a wider type, which would be more performant?

@chux - Reinstate Monica 2014-08-17 17:53:35

This approach does not work in the uncommon platform where sizeof(long long) == sizeof(int). C specifies only that sizeof(long long) >= sizeof(int).

@nategoose 2010-10-15 19:39:06

I think that this works:

int add(int lhs, int rhs) {
   volatile int sum = lhs + rhs;
   if (lhs != (sum - rhs) ) {
       /* overflow */
       //errno = ERANGE;
       abort();
   }
   return sum;
}

Using volatile keeps the compiler from optimizing away the test because it thinks that sum may have changed between the addition and the subtraction.

Using gcc 4.4.3 for x86_64 the assembly for this code does do the addition, the subtraction, and the test, though it stores everything on the stack and of unneeded stack operations. I even tried register volatile int sum = but the assembly was the same.

For a version with only int sum = (no volatile or register) the function did not do the test and did the addition using only one lea instruction (lea is Load Effective Address and is often used to do addition without touching the flags register).

Your version is larger code and has a lot more jumps, but I don't know which would be better.

@R.. 2010-10-16 00:30:58

-1 for misuse of volatile to mask undefined behavior. If it "works", you're still just "getting lucky".

@nategoose 2010-10-18 14:39:54

@R: If it doesn't work the compiler isn't implementing volatile correctly. All I was trying for was a simpler solution to a very common problem on an already answered question.

@nategoose 2010-10-18 14:49:38

Where it might fail, though, would be a system whose numeric representation wrap to lower values upon overflow for integers.

@nategoose 2010-10-18 21:04:31

That last comment should have a "didn't" or "doesn't" in it.

@davmac 2013-08-05 13:27:46

@nategoose, your assertion that "if it doesn't work the compiler isn't implementing volatile correctly" is wrong. For one thing, in two's complement arithmetic, it will always be true that lhs = sum - rhs even if overflow occurred. Even if that were not the case, and although this particular example is a bit contrived, the compiler might for instance generate code which performs the addition, stores the result value, reads the value back into another register, compares the stored value with the read value and notices that they are the same and therefore assume no overflow has occurred.

@davmac 2013-08-05 13:55:11

(you're also assuming that causing the overflow won't cause the comparison afterwards to go wrong or even be skipped, which "undefined behavior" allows).

@Jonathan 2010-10-15 17:31:11

You may have better luck converting to 64-bit integers and testing similar conditions like that. For example:

#include <stdint.h>

...

int64_t sum = (int64_t)lhs + (int64_t)rhs;
if (sum < INT_MIN || sum > INT_MAX) {
    // Overflow occurred!
}
else {
    return sum;
}

You may want to take a closer look at how sign extension will work here, but I think it is correct.

@R.. 2010-10-15 17:59:40

Remove the bitwise-and and cast from the return statement. They're incorrect as written. Conversion from larger signed integer types to smaller ones is perfectly well-defined as long as the value fits in the smaller type, and it does not need an explicit cast. Any compiler that gives a warning and suggests that you add a cast when you're just checked that the value does not overflow is a broken compiler.

@Jonathan 2010-10-15 18:41:46

@R You are correct, I just like being explicit about my casts. I'll change it for correctness, though. For future readers, the return line read return (int32_t)(sum & 0xffffffff);.

@R.. 2010-10-15 19:17:26

Note that if you write sum & 0xffffffff, sum is implicitly converted to type unsigned int (assuming 32-bit int) because 0xffffffff has type unsigned int. Then the result of the bitwise and is an unsigned int, and if sum was negative, it will be outside the range of values supported by int32_t. The conversion to int32_t then has implementation-defined behavior.

@JaredPar 2010-10-15 17:32:48

IMHO, the eastiest way to deal with overflow sentsitive C++ code is to use SafeInt<T>. This is a cross platform C++ template hosted on code plex which provides the safety guarantees that you desire here.

I find it very intuitive to use as it provides the many of the same usage patterns as normal numerical opertations and expresses over and under flows via exceptions.

Related Questions

Sponsored Content

23 Answered Questions

[SOLVED] What is the "-->" operator in C++?

37 Answered Questions

31 Answered Questions

[SOLVED] How do I detect unsigned integer multiply overflow?

13 Answered Questions

[SOLVED] What is the effect of extern "C" in C++?

1 Answered Questions

[SOLVED] The Definitive C++ Book Guide and List

  • 2008-12-23 05:23:56
  • grepsedawk
  • 2247683 View
  • 4246 Score
  • 1 Answer
  • Tags:   c++ c++-faq

3 Answered Questions

6 Answered Questions

[SOLVED] Is using an unsigned rather than signed int more likely to cause bugs? Why?

5 Answered Questions

6 Answered Questions

[SOLVED] Why does integer overflow on x86 with GCC cause an infinite loop?

Sponsored Content