By David Cary


2014-06-22 16:54:25 8 Comments

In the responses to the question Reading In A String and comparing it C, more than one person discouraged the use of strcmp(), saying things like

I also strongly, strongly advise you to get used to using strncmp() now, ... to avoid many problems down the road.

or (in Why does my string comparison fail? )

Make certain you use strncmp and not strcmp. strcmp is profoundly unsafe.

What problems are they alluding to?

scanf() with string specifiers and gets() are strongly discouraged because they almost inevitably lead to buffer overflow vulnerabilities. However, it's not possible to overflow a buffer with strcmp(), right?

"A buffer overflow, or buffer overrun, is an anomaly where a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory."

( -- Wikipedia: buffer overflow).

Since the strcmp() function never writes to any buffer, the strcmp() function cannot cause a buffer overflow, right?

What is the reason people discourage the use of strcmp(), and recommend strncmp() instead?

3 comments

@Jonathon Reinhart 2014-06-22 17:40:18

While strncmp can prevent you from overrunning a buffer, its primary purpose isn't safety. Rather, it exists for the case where one wants to compare only the first N characters of a (possibly NUL-terminated) string.

From the man page:

The strcmp() function compares the two strings s1 and s2. It returns an integer less than, equal to, or greater than zero if s1 is found, respectively, to be less than, to match, or be greater than s2.

The strncmp() function is similar, except it compares only the first (at most) n bytes of s1 and s2.

Note that strncmp in this case cannot be replaced with a simple memcmp, because you still need to take advantage of its stop-on-NUL behavior, in case one of the strings is shorter than n.
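A minimal sketch of that difference, not taken from the original answer: the two arrays and both function names below are hypothetical. Both arrays hold the string "ab" but differ in the bytes after the terminator, so strncmp and memcmp disagree about them.

```c
#include <string.h>

/* Two hypothetical 8-byte arrays: both hold the string "ab", but the
   bytes after the terminating NUL differ. */
static const char field_a[8] = { 'a', 'b', '\0', 'x', 'x', 'x', 'x', 'x' };
static const char field_b[8] = { 'a', 'b', '\0', 'y', 'y', 'y', 'y', 'y' };

/* strncmp stops at the first NUL, so the trailing bytes are never compared. */
int compare_as_strings(void)
{
    return strncmp(field_a, field_b, sizeof field_a);
}

/* memcmp compares all 8 bytes, including the ones after the NUL. */
int compare_as_bytes(void)
{
    return memcmp(field_a, field_b, sizeof field_a);
}
```

Here compare_as_strings() reports the arrays as equal, while compare_as_bytes() reports a difference at index 3.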

If strcmp causes a buffer overrun, then one of two things is true:

  1. Your data isn't expected to be NUL-terminated, and you should be using memcmp instead.
  2. Your data is expected to be NUL-terminated, but you've already screwed up when you populated the buffer, by somehow not NUL-terminating it.
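One common way to end up in the second situation is strncpy, which writes no terminator when the source is at least as long as the destination. The helper below is only a sketch; copy_terminated is a made-up name, not a standard function.

```c
#include <string.h>

/* Hypothetical helper: copies src into dst (capacity dst_size bytes) and
   always NUL-terminates the result.
   Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_terminated(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return 0;
    /* strncpy alone leaves dst unterminated when strlen(src) >= the count
       it is given, so copy one byte less and terminate by hand. */
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
    return strlen(src) < dst_size;
}
```

A buffer filled this way always holds a string, so passing it to strcmp later cannot run off its end.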

Note that reading past the end of a buffer is still considered a buffer overrun. While it may seem harmless, it can be just as dangerous as writing past the end.

Reading, writing, executing... it doesn't matter. Any memory reference to an unintended address is undefined behavior. In the most apparent scenario, you attempt to access a page that isn't mapped into your process's address space, causing a page fault and a subsequent SIGSEGV. In the worst case, you sometimes run into a \0 byte, but other times you run into some other buffer, causing inconsistent program behavior.
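If a buffer's size is known but its contents are not trusted, one option is to check for a terminator inside the buffer before treating it as a string. This is a sketch under that assumption; is_terminated and checked_equals are hypothetical names.

```c
#include <string.h>

/* Returns 1 if buf[0..size-1] contains a NUL byte, 0 otherwise.
   memchr never looks beyond the size it is given. */
int is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* Compares buf with a known string only if buf really holds a string.
   Returns 1 if equal, 0 if different, -1 if buf has no terminator. */
int checked_equals(const char *buf, size_t size, const char *expected)
{
    if (!is_terminated(buf, size))
        return -1;
    return strcmp(buf, expected) == 0;
}
```

The check makes the missing terminator visible as an error instead of letting strcmp read past the end of the buffer.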

@David Cary 2014-06-23 03:02:04

I don't see how strcmp() can cause a buffer overrun, even if both those things are true. Could you say a few more words about what exactly goes wrong?

@Jonathon Reinhart 2014-06-23 03:04:36

Imagine you have char buf[100] in which every character is 'a' (it is not NUL-terminated). If you pass this buffer to strcmp (assuming the other parameter is a longer string), then strcmp will continue comparing at buf[100] and so on, overrunning the buffer.

@David Cary 2014-06-23 03:12:58

I can see how writing past the end of a buffer causes problems. But strcmp() doesn't do that, right? Would you mind adding a few words to your answer about what exactly goes wrong if strcmp() goes on reading past the end of a buffer?

@Jonathon Reinhart 2014-06-23 03:15:37

Reading, writing, executing... it doesn't matter. Any memory reference to an unintended address is undefined behavior. In the most apparent scenario, you attempt to access a page that isn't mapped into your process's address space, causing a page fault and a subsequent SIGSEGV. In the worst case, you sometimes run into a \0 byte, but other times you run into some other buffer, causing inconsistent program behavior.

@David Cary 2014-06-23 03:25:58

Good point. Please hit the above "edit" button and add it to your answer. I've done a lot of programming on machines that never page fault, so that particular "page fault" problem never happens on those machines, but this is exactly the sort of thing I want to know about so my C is portable to machines where this sort of thing can and does happen.

@Jonathon Reinhart 2014-06-23 03:37:29

If you want to ensure portability, you should ensure correctness. No matter what the machine, reading past the end of a buffer will invoke undefined behavior. The CPU may happily read zeros, or the memory controller might catch fire.

@chux - Reinstate Monica 2014-06-23 19:30:46

"it exists for the case where one wants to compare only the first N characters of a (properly NUL-terminated) string." is not correct. From the C spec "The strncmp function returns an integer ... accordingly as the possibly null-terminated array pointed to by s1". Neither s1 nor s2 of int strncmp(const char *s1, const char *s2, size_t n); need to be C strings. Independently, they can simply be strings or they can be arrays of char without being "properly NUL-terminated".

@Jonathon Reinhart 2014-06-23 19:48:39

@chux Hmm, it seems I will have to re-visit my answer. Thanks for that.

@Keith Thompson 2014-06-23 06:01:31

A string is by definition "a contiguous sequence of characters terminated by and including the first null character".

The only case where strncmp() would be safer than strcmp() is when you're comparing two character arrays as strings, you're certain that both arrays are at least n bytes long (the 3rd argument passed to strncmp()), and you're not certain that both arrays contain strings (i.e., contain a '\0' null character terminator).
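A sketch of that one case, assuming a hypothetical record layout with a fixed-width name field that is NUL-padded but not terminated when the name fills the whole field. NAME_LEN, struct record and same_name are invented for illustration.

```c
#include <string.h>

#define NAME_LEN 8  /* hypothetical field width */

struct record {
    /* NUL-padded; holds no terminator when the name is exactly
       NAME_LEN characters long. */
    char name[NAME_LEN];
};

/* Both arrays are known to be NAME_LEN bytes long, so strncmp can never
   read past either one, whether or not a NUL is present. */
int same_name(const struct record *a, const struct record *b)
{
    return strncmp(a->name, b->name, NAME_LEN) == 0;
}
```

Calling strcmp on these fields would be wrong, because a full-width name gives it no terminator to stop at.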

In most cases, your code (if it's correct) will guarantee that any arrays that are supposed to contain null-terminated strings actually do contain null-terminated strings.

That added n in strncmp() is not a magic wand that makes unsafe code safe. It doesn't guard against null pointers, uninitialized pointers, uninitialized arrays, an incorrect value of n, or just passing incorrect data. You can shoot yourself in the foot with either function.
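As an illustration of "an incorrect value of n", here is a sketch of a mistake that strncmp makes possible and strcmp does not. The function names and the idea of checking input against a secret are hypothetical.

```c
#include <string.h>

/* BUG: n is taken from the input, so any prefix of the secret, including
   the empty string, compares equal. */
int buggy_matches(const char *input, const char *secret)
{
    return strncmp(input, secret, strlen(input)) == 0;
}

/* Both arguments are real strings, so plain strcmp is the right tool. */
int matches(const char *input, const char *secret)
{
    return strcmp(input, secret) == 0;
}
```

With an empty input, buggy_matches reports a match because strncmp with n equal to 0 compares nothing and returns 0, while matches correctly reports a mismatch.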

And if you're trying to call strcmp or strncmp with an array that you thought contained a null-terminated string but actually doesn't, then your code already has a bug. Using strncmp() might help you avoid the immediate symptom of that bug, but it won't fix it.

@Pablo Francisco Pérez Hidalgo 2014-06-22 16:59:08

strcmp compares two strings character by character until a difference is detected or a \0 is found in one of them.

On the other hand, strncmp provides a way to limit the number of characters to be compared so if the strings do not end with \0 the function won't continue checking after the size limit has been reached.

Imagine what would happen if you were comparing two strings at these two memory regions:

0x40, 0x41, 0x42,... 0x40, 0x41, 0x42,...

And you are only interested in the first two characters. Somehow the \0 has been removed from the end of the strings, and the third byte happens to be the same in both regions. strncmp would avoid comparing this third byte if the num parameter is 2.

EDIT: As the comments below indicate, this situation arises only from an incorrect or very specific use of the language.

@Kerrek SB 2014-06-22 17:07:00

If you want to compare memory regions, use memcmp. In C, a "string" is a null-terminated character sequence. If you have strings, use strcmp. If you don't, don't.

@Oliver Charlesworth 2014-06-22 17:07:43

Sure, strncmp would then make the comparison "safe", but that's really just deferring the problem. Your non-null-terminated string would then cause undefined behaviour later on in your program.

@Pablo Francisco Pérez Hidalgo 2014-06-22 17:10:25

@OliCharlesworth I do agree, I just wanted to point out a case where strncmp is safer to use than strcmp.

@Jonathon Reinhart 2014-06-22 17:31:29

As I understand it, strncmp doesn't exist for "safety", but rather "I want to compare the first N characters of these strings".

@Oliver Charlesworth 2014-06-22 17:37:20

@JonathonReinhart: You should make that an answer.

@M.M 2014-06-23 04:19:48

Things that don't end in \0 are not strings
