By hdn


2009-11-09 22:34:21 8 Comments

In C, one can use a string literal in a declaration like this:

char s[] = "hello";

or like this:

char *s = "hello";

So what is the difference? I want to know what actually happens in terms of storage duration, both at compile and run time.

12 comments

@Ciro Santilli 新疆改造中心 六四事件 法轮功 2015-06-05 07:32:20

C99 N1256 draft

There are two different uses of character string literals:

  1. Initialize char[]:

    char c[] = "abc";      
    

    This is "more magic", and described at 6.7.8/14 "Initialization":

    An array of character type may be initialized by a character string literal, optionally enclosed in braces. Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.

    So this is just a shortcut for:

    char c[] = {'a', 'b', 'c', '\0'};
    

    Like any other regular array, c can be modified.

  2. Everywhere else: it generates an unnamed array of char with static storage duration, and modifying that array is undefined behavior.

    So when you write:

    char *c = "abc";
    

    This is similar to:

    /* __unnamed is magic because modifying it gives UB. */
    static char __unnamed[] = "abc";
    char *c = __unnamed;
    

    Note the implicit conversion from char[] to char *, which is always legal.

    Then if you modify c[0], you also modify __unnamed, which is UB.

    This is documented at 6.4.5 "String literals":

    5 In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence [...]

    6 It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
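The two uses above can be put side by side in a short sketch. The function names array_case and pointer_case are made up for illustration and are not part of the standard's text; the pointer is declared const here only to document the "do not modify" rule.

```c
#include <string.h>

/* Case 1: the literal initializes an ordinary array, which may be modified.
   Returns 1 if the write landed and the array has the expected size. */
int array_case(void) {
    char c[] = "abc";                 /* same as {'a', 'b', 'c', '\0'} */
    c[0] = 'x';                       /* fine: c is a regular array */
    return strcmp(c, "xbc") == 0 && sizeof c == 4;
}

/* Case 2: the pointer refers to the unnamed static array, so only read it.
   Returns 1 if the expected characters are visible through the pointer. */
int pointer_case(void) {
    const char *p = "abc";            /* const makes accidental writes a compile error */
    return p[0] == 'a' && strlen(p) == 3;
}
```

Both functions only read through the pointer, so neither touches undefined behavior.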

6.7.8/32 "Initialization" gives a direct example:

EXAMPLE 8: The declaration

char s[] = "abc", t[3] = "abc";

defines "plain" char array objects s and t whose elements are initialized with character string literals.

This declaration is identical to

char s[] = { 'a', 'b', 'c', '\0' },
t[] = { 'a', 'b', 'c' };

The contents of the arrays are modifiable. On the other hand, the declaration

char *p = "abc";

defines p with type "pointer to char" and initializes it to point to an object with type "array of char" with length 4 whose elements are initialized with a character string literal. If an attempt is made to use p to modify the contents of the array, the behavior is undefined.
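As a rough check of EXAMPLE 8, the sketch below reports the sizes of s and t. The helper names are hypothetical. Note that t[3] has no room for the terminating null, so it is not a C string and should be compared with memcmp rather than strcmp.

```c
#include <stddef.h>
#include <string.h>

/* Sizes of the two arrays from EXAMPLE 8. */
size_t size_of_s(void) { char s[] = "abc";  return sizeof s; }  /* 'a','b','c','\0' */
size_t size_of_t(void) { char t[3] = "abc"; return sizeof t; }  /* 'a','b','c' only */

/* t is not null-terminated, so compare exactly sizeof t bytes. */
int t_matches_abc(void) {
    char t[3] = "abc";
    return memcmp(t, "abc", sizeof t) == 0;
}
```

This form of initialization is valid C; a C++ compiler would reject the t[3] declaration.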

GCC 4.8 x86-64 ELF implementation

Program:

#include <stdio.h>

int main(void) {
    char *s = "abc";
    printf("%s\n", s);
    return 0;
}

Compile and decompile:

gcc -ggdb -std=c99 -c main.c
objdump -Sr main.o

Output contains:

 char *s = "abc";
8:  48 c7 45 f8 00 00 00    movq   $0x0,-0x8(%rbp)
f:  00 
        c: R_X86_64_32S .rodata

Conclusion: GCC stores the string literal that the char* points to in the .rodata section, not in .text.

If we do the same for char[]:

 char s[] = "abc";

we obtain:

17:   c7 45 f0 61 62 63 00    movl   $0x636261,-0x10(%rbp)

so it gets stored on the stack (relative to %rbp).

Note however that the default linker script puts .rodata and .text in the same segment, which has execute but no write permission. This can be observed with:

readelf -l a.out

which contains:

 Section to Segment mapping:
  Segment Sections...
   02     .text .rodata

@brice 2016-09-21 01:03:38

Thank you for the great answer :)

@bdonlan 2009-11-09 22:45:01

First off, in function arguments, they are exactly equivalent:

void foo(char *x);
void foo(char x[]); // exactly the same in all respects

In other contexts, char * allocates a pointer, while char [] allocates an array. Where does the string go in the former case, you ask? The compiler secretly allocates a static anonymous array to hold the string literal. So:

char *x = "Foo";
// is approximately equivalent to:
static const char __secret_anonymous_array[] = "Foo";
char *x = (char *) __secret_anonymous_array;

Note that you must not ever attempt to modify the contents of this anonymous array via this pointer; the effects are undefined (often meaning a crash):

x[1] = 'O'; // BAD. DON'T DO THIS.

Using the array syntax directly allocates it into new memory. Thus modification is safe:

char x[] = "Foo";
x[1] = 'O'; // No problem.

However, the array only lives as long as its containing scope, so if you do this in a function, don't return or leak a pointer to this array - make a copy instead with strdup() or similar. If the array is allocated in global scope, of course, no problem.
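A minimal sketch of the "make a copy" advice, assuming a hypothetical helper named make_greeting. strdup() is POSIX rather than C99, so this version uses malloc() and memcpy() instead, which should behave the same for this purpose.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: edits a local array, then returns a heap copy.
   The caller owns the result and must free() it.
   Returns NULL if the allocation fails. */
char *make_greeting(void) {
    char local[] = "Foo";             /* automatic storage: gone after return */
    local[1] = 'O';                   /* modifying the array is fine */

    char *copy = malloc(sizeof local);
    if (copy != NULL) {
        memcpy(copy, local, sizeof local);   /* copies the '\0' as well */
    }
    return copy;                      /* never "return local;" here */
}
```

A caller would check the result for NULL, use it, and then call free() on it.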

@Muzab 2012-11-21 10:39:42

Just to add: you also get different values for their sizes.

printf("sizeof s[] = %zu\n", sizeof(s));  //6
printf("sizeof *s  = %zu\n", sizeof(s));  //4 or 8

As mentioned above, for an array '\0' will be allocated as the final element.
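The size difference can be shown with two separate declarations. The function names are made up for illustration.

```c
#include <stddef.h>

/* sizeof applied to the array counts every element, including '\0'. */
size_t array_size(void) {
    char s[] = "hello";
    return sizeof s;        /* 6: five characters plus the terminator */
}

/* sizeof applied to the pointer gives the size of the pointer itself. */
size_t pointer_size(void) {
    const char *s = "hello";
    return sizeof s;        /* typically 4 or 8, depending on the platform */
}
```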

@Mohit 2016-11-11 11:46:04

char *str = "Hello";

The above sets str to point to the literal "Hello", which is hard-coded in the program's binary image and typically mapped read-only in memory. Any change to this string literal is undefined behavior and commonly causes a segmentation fault.

char str[] = "Hello";

copies the string into newly allocated memory on the stack, so changing it is allowed.

For example, str[0] = 'M';

will change the str to "Mello".

For more details, please go through the similar question:

Why do I get a segmentation fault when writing to a string initialized with "char *s" but not "char s[]"?

@Rickard 2009-11-09 22:38:26

The difference here is that

char *s = "Hello world";

will place "Hello world" in the read-only parts of the memory, and making s a pointer to that makes any writing operation on this memory illegal.

While doing:

char s[] = "Hello world";

puts the literal string in read-only memory and copies the string to newly allocated memory on the stack. Thus making

s[0] = 'J';

legal.

@pmg 2009-11-09 22:42:33

The literal string "Hello world" is in "read-only parts of the memory" in both examples. The example with the pointer points there; the example with the array copies the characters to the array elements.

@caf 2009-11-09 22:46:00

pmg: In the second case the literal string does not necessarily exist in memory as a single contiguous object at all - it's just an initialiser, the compiler could quite reasonably emit a series of "load immediate byte" instructions that contain the character values embedded within them.

@CB Bailey 2009-11-09 22:46:56

@pmg: to be fair it depends on the context. For a global variable, the compiler can just put "Hello world" directly in a writeable section loadable on startup, for an automatic variable the array does need to be reinitialized every time.

@caf 2009-11-09 22:47:44

The char array example does not necessarily place the string on the stack - if it appears at file level, it will probably be in some kind of initialised data segment instead.

@pmg 2009-11-09 23:00:34

@caf, @Charles: The Standard (n1401.pdf) says @ 6.4.5 String literals /5 "... The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. ..." It doesn't "speak" of 'plain' character sequences, but I think the same applies. I also think an implementation can ignore this particular bit of the Standard for performance reasons :)

@paxdiablo 2009-11-10 12:35:35

I'd like to point out that char *s = "xx" doesn't have to be in read-only memory (some implementations have no MMUs, for example). The n1362 c1x draft simply states that modifying such an array causes undefined behavior. But +1 anyway, since relying on that behavior is a silly thing to do.

@gcbenison 2012-04-27 15:44:14

I get a clean compile on a file containing just char msg[] = "hello, world!"; the string ends up in the initialized data section. When declared char * const, it ends up in the read-only data section. gcc-4.5.3

@Nishant Kumar 2012-05-16 14:42:07

@Rickard: Will "Hello world" be stored twice in read-only memory, or only once for both declarations? Another question: is the memory for *p = "Hello" allocated at compile time or at run time?

@Keith Thompson 2014-02-26 21:29:51

The most current C99 draft is N1256. The most current C11 draft is N1570.

@Pacerier 2015-05-13 03:44:18

@paxdiablo, So do you mean that the program may actually not get segmentation fault by writing to strings?

@paxdiablo 2015-05-13 03:50:20

@Pacerier: correct. Undefined means exactly that, anything can happen. It could silently ignore what you're trying to do, it could make your computer explode in your face, it could erase your wedding photos. It could even, and this is the most insidious case, actually work. Right up to the point where you try your code elsewhere, or change the compiler, or try to build during a blue moon, at which point you'll suffer much angst.

@Pacerier 2015-05-24 14:31:33

@gcbenison, Regarding "the string ends up in the initialized data section", How did you verify that?

@Ciro Santilli 新疆改造中心 六四事件 法轮功 2015-06-05 08:27:48

@Pacerier I have verified it in my answer.

@Ciro Santilli 新疆改造中心 六四事件 法轮功 2015-06-05 08:28:25

Both are "legal" C programs: it's just that one is undefined behavior :-)

@MycrofD 2017-05-11 15:04:47

did you mean s[0] = 'H'; instead of s[0] = 'J';

@Rickard 2017-05-12 11:32:57

@MycrofD no the s[0] = 'J'; was what I intended to type. "Jello world" is underestimated.

@MycrofD 2017-05-12 13:16:08

@Rickard if you are making fun, you should say "Bazinga" (TBBT) at the end of your above comment and edit your answer. :) else.. well, pardon me, I quite didn't get you, care to explain in details in the answer itself, again, by editing it?

@Dean P 2017-08-24 08:59:55

"makes any writing operation on this memory illegal" Illegal or undefined behaviour? My GCC compiler allows writing to the read only part of memory but the programme acts randomly. The compiler doesnt flag it as an error

@Nick Louloudakis 2015-11-30 10:22:03

As an addition, note that for read-only purposes the two are used identically: you can access a char by indexing with either the [] or the *(<var> + <index>) form:

printf("%c", x[1]);     //Prints r

And:

printf("%c", *(x + 1)); //Prints r

Obviously, if you attempt to do

*(x + 1) = 'a';

You will probably get a Segmentation Fault, as you are trying to write to read-only memory.

@glglgl 2016-07-15 10:07:40

This is in no way different from x[1] = 'a'; which will segfault as well (depending on the platform, of course).

@CB Bailey 2009-11-09 22:40:09

char s[] = "hello";

declares s to be an array of char which is long enough to hold the initializer (5 + 1 chars) and initializes the array by copying the members of the given string literal into the array.

char *s = "hello";

declares s to be a pointer to one or more (in this case more) chars and points it directly at a fixed (read-only) location containing the literal "hello".

@psihodelia 2011-11-08 13:26:20

What method is preferable to use in functions if s will not be changed, f(const char s[]) or f(const char *s) ?

@CB Bailey 2011-11-08 14:20:30

@psihodelia: In a function declaration there is no difference. In both cases s is a pointer to const char.
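One way to see that the two spellings declare the same parameter type is to declare a function with one and define it with the other. The function count_chars below is a hypothetical example, not something from the thread.

```c
#include <stddef.h>

/* Declared with array syntax ... */
size_t count_chars(const char s[]);

/* ... and defined with pointer syntax. The compiler accepts the pair
   because both spellings give the parameter the type const char *. */
size_t count_chars(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}
```

If the two declarations had different types, the compiler would report conflicting types for count_chars.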

@John Bode 2009-11-09 23:03:50

Given the declarations

char *s0 = "hello world";
char s1[] = "hello world";

assume the following hypothetical memory map:

                    0x01  0x02  0x03  0x04
        0x00008000: 'h'   'e'   'l'   'l'
        0x00008004: 'o'   ' '   'w'   'o'
        0x00008008: 'r'   'l'   'd'   0x00
        ...
s0:     0x00010000: 0x00  0x00  0x80  0x00
s1:     0x00010004: 'h'   'e'   'l'   'l'
        0x00010008: 'o'   ' '   'w'   'o'
        0x0001000C: 'r'   'l'   'd'   0x00

The string literal "hello world" is a 12-element array of char (const char in C++) with static storage duration, meaning that the memory for it is allocated when the program starts up and remains allocated until the program terminates. Attempting to modify the contents of a string literal invokes undefined behavior.

The line

char *s0 = "hello world";

defines s0 as a pointer to char with auto storage duration (meaning the variable s0 only exists for the scope in which it is declared) and copies the address of the string literal (0x00008000 in this example) to it. Note that since s0 points to a string literal, it should not be used as an argument to any function that would try to modify it (e.g., strtok(), strcat(), strcpy(), etc.).

The line

char s1[] = "hello world";

defines s1 as a 12-element array of char (length is taken from the string literal) with auto storage duration and copies the contents of the literal to the array. As you can see from the memory map, we have two copies of the string "hello world"; the difference is that you can modify the string contained in s1.

s0 and s1 are interchangeable in most contexts; here are the exceptions:

sizeof s0 == sizeof (char*)
sizeof s1 == 12

type of &s0 == char **
type of &s1 == char (*)[12] // pointer to a 12-element array of char

You can reassign the variable s0 to point to a different string literal or to another variable. You cannot reassign the variable s1 to point to a different array.
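The exceptions listed above can be collected into one sketch. The function name check_exceptions is made up, and the line that would not compile is left as a comment.

```c
#include <stddef.h>

/* Returns 1 if every claim below holds on the current platform. */
int check_exceptions(void) {
    char *s0 = "hello world";
    char s1[] = "hello world";

    char **pp = &s0;            /* &s0 has type char ** */
    char (*pa)[12] = &s1;       /* &s1 has type char (*)[12] */

    s0 = "another literal";     /* a pointer can be re-pointed */
    /* s1 = "another literal";     would not compile: arrays are not assignable */

    return sizeof s0 == sizeof(char *)
        && sizeof s1 == 12
        && *pp == s0            /* pp still refers to the variable s0 */
        && (*pa)[0] == 'h';     /* pa refers to the whole array s1 */
}
```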

@midnightBlue 2016-11-09 21:41:57

I think the hypothetical memory map makes it easy to understand!

@Sailaja 2009-11-09 22:55:19

char s[] = "Hello world";

Here, s is an array of characters, which can be overwritten if we wish.

char *s = "hello";

Here a string literal creates a block of characters somewhere in memory, and the pointer s points to it. We can reassign s to point to something else, but as long as it points to a string literal, the block of characters it points to can't be changed.

@Pankaj Mahato 2014-01-29 18:28:52

@bo Persson Why can't the block of characters be changed in the second case?

@user182669 2009-11-09 23:20:09

In light of the comments here, it should be obvious that char *s = "hello"; is a bad idea and should be used only in a very narrow scope.

This might be a good opportunity to point out that "const correctness" is a "good thing". Whenever and wherever You can, use the "const" keyword to protect your code, from "relaxed" callers or programmers, which are usually most "relaxed" when pointers come into play.

Enough melodrama, here is what one can achieve when adorning pointers with "const". (Note: One has to read pointer declarations right-to-left.) Here are the 3 different ways to protect yourself when playing with pointers :

const DBJ* p means "p points to a DBJ that is const" 

— that is, the DBJ object can't be changed via p.

DBJ* const p means "p is a const pointer to a DBJ" 

— that is, you can change the DBJ object via p, but you can't change the pointer p itself.

const DBJ* const p means "p is a const pointer to a const DBJ" 

— that is, you can't change the pointer p itself, nor can you change the DBJ object via p.

The errors related to attempted const-ant mutations are caught at compile time. There is no runtime space or speed penalty for const.
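The same three forms exist in C. The sketch below uses char in place of the DBJ type and a made-up function name, const_demo; the lines that a compiler would reject are kept as comments.

```c
/* Returns 1 if the reads through each pointer see the expected characters. */
int const_demo(void) {
    char buf[] = "hello";
    char other[] = "world";

    const char *p1 = buf;         /* the pointee is const when accessed via p1 */
    p1 = other;                   /* OK: the pointer itself can change */
    /* p1[0] = 'W';                  rejected: write through pointer-to-const */

    char *const p2 = buf;         /* the pointer is const, the pointee is not */
    p2[0] = 'H';                  /* OK: buf now holds "Hello" */
    /* p2 = other;                   rejected: p2 itself is const */

    const char *const p3 = buf;   /* neither the pointer nor the pointee can change via p3 */

    return p1[0] == 'w' && p2[0] == 'H' && p3[0] == 'H';
}
```

Declaring a pointer to a string literal as const char * turns an accidental write into a compile-time error instead of undefined behavior at run time.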

(Assumption is you are using C++ compiler, of course ?)

--DBJ

@Fabio Turati 2015-11-05 17:29:52

This is all correct, but it has nothing to do with the question. And as far as your assumption about a C++ compiler, the question is tagged as C, not as C++.

@Paul Smith 2017-09-07 11:31:25

There is nothing bad about char *s = "const string";

@Lee-Man 2009-11-09 22:57:04

In the case of:

char *x = "fred";

x is an lvalue -- it can be assigned to. But in the case of:

char x[] = "fred";

x is not an lvalue, it is an rvalue -- you cannot assign to it.

@caf 2009-11-09 23:02:40

Technically, x is a non-modifiable lvalue. In almost all contexts though, it will evaluate to a pointer to its first element, and that value is an rvalue.

@caf 2009-11-09 22:42:18

This declaration:

char s[] = "hello";

Creates one object - a char array of size 6, called s, initialised with the values 'h', 'e', 'l', 'l', 'o', '\0'. Where this array is allocated in memory, and how long it lives for, depends on where the declaration appears. If the declaration is within a function, it will live until the end of the block that it is declared in, and almost certainly be allocated on the stack; if it's outside a function, it will probably be stored within an "initialised data segment" that is loaded from the executable file into writeable memory when the program is run.

On the other hand, this declaration:

char *s ="hello";

Creates two objects:

  • a read-only array of 6 chars containing the values 'h', 'e', 'l', 'l', 'o', '\0', which has no name and has static storage duration (meaning that it lives for the entire life of the program); and
  • a variable of type pointer-to-char, called s, which is initialised with the location of the first character in that unnamed, read-only array.

The unnamed read-only array is typically located in the "text" segment of the program, which means it is loaded from disk into read-only memory, along with the code itself. The location of the s pointer variable in memory depends on where the declaration appears (just like in the first example).
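A practical consequence of the storage durations described above, sketched with two hypothetical functions: a pointer to the unnamed array can safely outlive the function that produced it, while the contents of a local array have to be copied out before the function returns.

```c
#include <string.h>

/* Fine: the unnamed array behind the literal has static storage duration,
   so the returned pointer stays valid for the life of the program. */
const char *literal_greeting(void) {
    const char *s = "hello";
    return s;
}

/* The local array s disappears at the end of the function, so its contents
   are copied into storage supplied by the caller.
   Returns 1 on success, 0 if the caller's buffer is too small. */
int array_greeting(char *out, size_t out_size) {
    char s[] = "hello";
    if (out_size < sizeof s) {
        return 0;
    }
    memcpy(out, s, sizeof s);     /* six bytes, including '\0' */
    return 1;
}
```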

@Nishant Kumar 2012-05-16 15:00:03

Is memory for "hello" allocated at compile time in both declarations? Also, for char *p = "hello", "hello" is stored in the text segment as you stated in your answer. What about char s[] = "hello"? Will it also be stored first in the text segment and then copied to the stack at run time, as Rickard stated in his answer? Please clarify this point.

@caf 2012-05-17 01:28:50

@Nishant: In the char s[] = "hello" case, the "hello" is just an initialiser telling the compiler how the array should be initialised. It may or may not result in a corresponding string in the text segment - for example, if s has static storage duration then it is likely that the only instance of "hello" will be in the initialised data segment - the object s itself. Even if s has automatic storage duration, it can be initialised by a sequence of literal stores rather than a copy (eg. movl $1819043176, -6(%ebp); movw $111, -2(%ebp)).

@Nishant Kumar 2012-05-17 04:55:45

Thanks caf for your clarification.

@Ciro Santilli 新疆改造中心 六四事件 法轮功 2015-06-05 07:33:27

More precisely, GCC 4.8 puts it into .rodata, which the linker script then dumps into the same segment as .text. See my answer.

@jay 2019-01-07 10:11:21

@caf In the first answer by Rickard, It's written that char s[] = "Hello world"; puts the literal string in read-only memory and copies the string to newly allocated memory on the stack. But, your answer only speaks about the literal string put in read-only memory and skips the second part of the sentence which says: copies the string to newly allocated memory on the stack. So, is your answer incomplete for not specifying the second part?

@caf 2019-01-08 09:39:51

@AjaySinghNegi: As I've stated in other comments (to this answer, and Rickard's answer), the string in char s[] = "Hello world"; is only an initializer and is not necessarily stored as a separate read-only copy at all. If s has static storage duration then the only copy of the string is likely to be in a read-write segment at the location of s, and even if not then the compiler may choose to initialize the array with load-immediate instructions or similar rather than copying from a read-only string. The point is that in this case, the initializer string itself has no runtime presence.
