By fips197


2018-09-14 14:30:33 8 Comments

While checking the return value of strcmp function, I found some strange behavior in gcc. Here's my code:

#include <stdio.h>
#include <string.h>

char str0[] = "hello world!";
char str1[] = "Hello world!";

int main() {
    printf("%d\n", strcmp("hello world!", "Hello world!"));
    printf("%d\n", strcmp(str0, str1));
}

When I compile this with clang, both calls to strcmp return 32. However, when compiling with gcc, the first call returns 1, and the second call returns 32. I don't understand why the first and second calls to strcmp return different values when compiled using gcc.

Below is my test environment.

  • Ubuntu 18.04 64bit
  • gcc 7.3.0
  • clang 6.0.0

4 comments

@Benjamin Maurer 2018-09-14 14:36:40

The standard defines the result of strcmp to be negative if lhs appears before rhs in lexicographical order, zero if they are equal, and positive if lhs appears after rhs.

It's up to the implementation how to implement that and what exactly to return. You must not depend on a specific value in your programs, or they won't be portable. Simply check with comparisons (<, >, ==).

See https://en.cppreference.com/w/c/string/byte/strcmp
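To illustrate checking only the sign, here is a minimal sketch. The helper name describe_order is made up for this example and is not part of any library; it relies on nothing beyond the guarantee that strcmp returns a negative, zero, or positive value.

```c
#include <string.h>

/* Hypothetical helper: classifies a strcmp result by its sign only,
 * never by its exact magnitude. */
const char *describe_order(const char *a, const char *b) {
    int r = strcmp(a, b);
    if (r < 0) {
        return "before";   /* a sorts before b */
    }
    if (r > 0) {
        return "after";    /* a sorts after b */
    }
    return "equal";        /* the strings match */
}
```

With the strings from the question, describe_order("hello world!", "Hello world!") gives "after" on an ASCII system, whether the underlying strcmp returned 1 or 32.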

Background

One simple implementation computes the difference c1 - c2 for each pair of corresponding characters and stops when the result is nonzero or one of the strings ends. The result is then the numeric difference between the first pair of characters at which the two strings differ.

For example, this GLibC implementation: https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=string/strcmp.c;hb=HEAD
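A rough sketch of such a difference-based comparison might look like the following. The name naive_strcmp is hypothetical, and this is not the glibc source, only an illustration of the idea described above.

```c
/* Hypothetical illustration of a difference-based comparison.
 * Characters are compared as unsigned char, as the C standard
 * requires for strcmp. */
int naive_strcmp(const char *s1, const char *s2) {
    const unsigned char *p1 = (const unsigned char *)s1;
    const unsigned char *p2 = (const unsigned char *)s2;

    /* Advance while the characters match and s1 has not ended. */
    while (*p1 != '\0' && *p1 == *p2) {
        p1++;
        p2++;
    }

    /* Difference of the first mismatching pair, or 0 if both ended. */
    return (int)*p1 - (int)*p2;
}
```

On an ASCII system, naive_strcmp("hello world!", "Hello world!") yields 'h' - 'H' = 104 - 72 = 32, which matches the value the question reports for the library call.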

@melpomene 2018-09-14 14:41:55

It looks like you didn't enable optimizations (e.g. -O2).

From my tests it looks like gcc always recognizes strcmp with constant arguments and optimizes it, even with -O0 (no optimizations). Clang needs at least -O1 to do so.

That's where the difference comes from: The code produced by clang calls strcmp twice, but the code produced by gcc just does printf("%d\n", 1) in the first case because it knows that 'h' > 'H' (ASCIIbetically, that is). It's just constant folding, really.

Live example: https://godbolt.org/z/8Hg-gI

As the other answers explain, any positive value will do to indicate that the first string is greater than the second, so the compiler optimizer simply chooses 1. The strcmp library function apparently uses a different value.

@John Bollinger 2018-09-14 14:51:13

Although it's interesting that Clang and GCC can be induced to compile the program either such that their respective results produce the same output or such that they don't, I don't like interpreting that as optimization or lack thereof being the reason for the output to differ. It would be better to generalize that to "implementation details", as optimization is only one reason why the results might differ, whether in this specific case or (even more so) in the general case.

@Shafik Yaghmour 2018-09-14 15:33:38

@dbush 2018-09-14 14:34:46

The exact values returned by strcmp in the case of the strings not being equal are not specified. From the man page:

#include <string.h>
int strcmp(const char *s1, const char *s2);
int strncmp(const char *s1, const char *s2, size_t n);

The strcmp() and strncmp() functions return an integer less than, equal to, or greater than zero if s1 (or the first n bytes thereof) is found, respectively, to be less than, to match, or be greater than s2.

Since str0 compares greater than str1, the value must be positive, which it is in both cases.

As for the difference between the two compilers, it appears that clang is returning the difference between the ASCII values for the corresponding characters that mismatched, while gcc is opting for a simple -1, 0, or 1. Both are valid, so your code should only need to check if the value is 0, greater than 0, or less than 0.
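If code really needs a canonical -1, 0, or 1 regardless of compiler or library, one option is to normalize the result yourself. This is a small sketch; strcmp_sign is a made-up name, not a standard function.

```c
#include <string.h>

/* Hypothetical wrapper: collapses any strcmp result to -1, 0, or 1. */
int strcmp_sign(const char *a, const char *b) {
    int r = strcmp(a, b);
    /* (r > 0) and (r < 0) each evaluate to 0 or 1 in C. */
    return (r > 0) - (r < 0);
}
```

The subtraction form avoids branching and stays correct whatever positive or negative magnitude the implementation happens to return.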

@Christian Gibbons 2018-09-14 14:42:31

The interesting thing is that gcc only gave 1 when passing in the string literals. I suspect it may have been an optimization knowing that the result would always be the same.

@Some programmer dude 2018-09-14 14:33:45

The strcmp function is only specified to return a value greater than zero, zero, or less than zero. Nothing specifies what those positive and negative values have to be.
