By Zan Lynx


2009-06-12 18:08:50 8 Comments

I have seen it asserted several times now that the following code is not allowed by the C++ Standard:

int array[5];
int *array_begin = &array[0];
int *array_end = &array[5];

Is &array[5] legal C++ code in this context?

I would like an answer with a reference to the Standard if possible.

It would also be interesting to know if it meets the C standard. And if it isn't standard C++, why was the decision made to treat it differently from array + 5 or &array[4] + 1?

13 comments

@jalf 2009-06-12 18:23:30

Your example is legal, but only because you're not actually using an out of bounds pointer.

Let's deal with out of bounds pointers first (because that's how I originally interpreted your question, before I noticed that the example uses a one-past-the-end pointer instead):

In general, you're not even allowed to create an out-of-bounds pointer. A pointer must point to an element within the array, or one past the end. Nowhere else.

The pointer is not even allowed to exist, which means you're obviously not allowed to dereference it either.

Here's what the standard has to say on the subject:

5.7:5:

When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integral expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

(emphasis mine)
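As a rough illustration of the quoted paragraph, here is a minimal sketch of the pointer arithmetic it permits. The function name and the array contents are made up for the example; nothing in it dereferences the one-past-the-end pointer.

```cpp
#include <cstddef>

// Hypothetical helper (not from the original post): returns true when the
// pointer identities described in the quoted paragraph hold for a
// 5-element array.
bool pointer_arithmetic_holds() {
    int array[5] = {10, 20, 30, 40, 50};

    int* first = array;        // points to element 0
    int* last = first + 4;     // points to element 4, the last element
    int* past_end = last + 1;  // one past the last element: valid to form

    // (Q)-1 points back to the last element of the array.
    bool back_to_last = (past_end - 1 == last);

    // The difference of the two pointers equals the difference of subscripts.
    std::ptrdiff_t distance = past_end - first;

    // Only 'last', an in-bounds pointer, is dereferenced here.
    return back_to_last && distance == 5 && *last == 50;
}
```

Forming first + 6 would fall under the "otherwise, the behavior is undefined" clause, so the sketch deliberately stops at one past the end.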

Of course, this is for operator+. So just to be sure, here's what the standard says about array subscripting:

5.2.1:1:

The expression E1[E2] is identical (by definition) to *((E1)+(E2))
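To make that identity concrete, here is a small sketch with an in-bounds index. The helper name and values are invented for illustration.

```cpp
// Hypothetical helper (not from the original post): checks that the
// subscript form and the pointer form name the same element.
bool subscript_matches_pointer_form() {
    int array[5] = {10, 20, 30, 40, 50};

    int via_subscript = array[2];    // E1[E2]
    int via_pointer = *(array + 2);  // *((E1)+(E2))
    int via_swapped = 2[array];      // *((2)+(array)): legal, if unusual

    return via_subscript == 30 && via_pointer == 30 && via_swapped == 30;
}
```

The swapped form compiles only because built-in subscripting is defined through the commutative addition; it is shown to underline the definition, not as a recommended style.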

Of course, there's an obvious caveat: your example doesn't actually show an out-of-bounds pointer. It uses a "one past the end" pointer, which is different. The pointer is allowed to exist (as the above says), but the standard, as far as I can see, says nothing about dereferencing it. The closest I can find is 3.9.2:3:

[Note: for instance, the address one past the end of an array (5.7) would be considered to point to an unrelated object of the array’s element type that might be located at that address. —end note ]

Which seems to me to imply that yes, you can legally dereference it, but the result of reading or writing to the location is unspecified.

Thanks to ilproxyil for correcting the last bit here, answering the last part of your question:

  • array + 5 doesn't actually dereference anything, it simply creates a pointer to one past the end of array.
  • &array[4] + 1 dereferences array+4 (which is perfectly safe), takes the address of that lvalue, and adds one to that address, which results in a one-past-the-end pointer (but that pointer never gets dereferenced).
  • &array[5] dereferences array+5 (which as far as I can see is legal, and results in "an unrelated object of the array’s element type", as the above said), and then takes the address of that element, which also seems legal enough.

So they don't do quite the same thing, although in this case, the end result is the same.
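For anyone who would rather sidestep the question entirely, here is a sketch of three ways to get the end pointer without ever applying unary * to the past-the-end address. The function name is hypothetical, and std::end for built-in arrays assumes C++11 or later, which postdates this thread.

```cpp
#include <iterator>

// Hypothetical helper (not from the original post): forms the
// one-past-the-end pointer three ways and checks that they agree.
bool end_pointers_agree() {
    int array[5] = {};

    int* by_addition = array + 5;           // pure pointer arithmetic
    int* by_last_plus_one = &array[4] + 1;  // address of a real element, then +1
    int* by_std_end = std::end(array);      // library helper, C++11 and later

    return by_addition == by_last_plus_one && by_addition == by_std_end;
}
```

None of these spellings depends on how the disputed &array[5] case is resolved.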

@Evan Teran 2009-06-12 18:24:47

this pointer he is trying to create is one past the end...

@Matthew Flaschen 2009-06-12 18:27:05

&array[5] is pointing to one past. However, this is not a legal way to get that address.

@Evan Teran 2009-06-12 18:30:29

agreed, I would say &array[5] is UB. (even though it may work as expected in practice).

@jalf 2009-06-12 18:42:13

@Evan: Yeah, I realized that too, and edited my post. Note that the question title asks about out-of-bounds though. The answer should describe both cases now.

@jalf 2009-06-12 18:49:02

Oh, I should probably mention that this is based on draft n1905 (from 2005). I don't have access to the "real" standard here, and this was the first one Google turned up.

@user83255 2009-06-12 18:52:23

the last sentence is incorrect. "array + 5" and "&array[4] + 1" do NOT dereference one past the end, while array[5] DOES. (i also assume you meant &array[5], but the comment still stands). The first two simply point one past the end.

@jalf 2009-06-12 18:59:42

@ilproxyil: You're right. Fixed it. Hopefully, that's all. SO is starting to throw CAPTCHAs at me now for repeatedly editing this post... ;)

@jalf 2009-06-12 19:03:58

@Martin: Don't you start, I'm tired of editing this thing. ;) Anyway, the standard says that array[5] is equivalent by definition to *(array + 5), so surely it does dereference the address. Or am I missing something (again)? ;)

@Martin York 2009-06-12 19:04:01

@ilproxy: array[5] does not de-reference the address. You can consider it as an expression that is a 'reference to value'. It only de-references the address if it is used to retrieve the value or write to the value. Here we are taking the address. This is explicitly allowed by the standard.

@Martin York 2009-06-12 19:04:53

@jalf: Yes, you are missing something. It is a reference to an rvalue.

@jalf 2009-06-12 19:12:29

Which is? Where do you get the rvalue from?

@David Thornley 2009-06-12 21:59:15

array[5] is the same thing as *(array + 5). Note the *: that's the dereference operator. "array + 5" is perfectly legal, since it's one past the end of the array. array[5], by itself, is not legal, since it accesses memory past the end of the array. &array[5] is probably legal, since &i does not touch the actual value of i, but it would take somebody more skilled with standard-fu than me to prove it.

@Johannes Schaub - litb 2009-06-12 22:20:27

@Martin, array[5] surely dereferences (look at 3.8/5 and 8.3.2/4, for example) It just does not read the stored value located there.

@Johannes Schaub - litb 2009-06-12 22:21:18

@jalf that note - i think it merely wants to say that "a + sizeof a" is equally valid to "&b" if b is directly allocated after the array "a", and that the resulting addresses equally "point to" the same object. Not more. Remember that all notes are informative (non-normative): If it would state such fundamental important facts like that there are objects after an array object that are located at the past-the-end, then such rule would need to be made normative

@Zan Lynx 2009-06-12 23:52:16

I'm accepting this answer even though I also like several others. This one references the standards. I'd also accept Adam Rosenfield's if I could.

@Adam Rosenfield 2009-06-13 01:56:04

As far as ANSI C (C89/C90) is concerned, this is the correct answer. If you follow the standard to the letter, &array[5] is technically invalid, whereas array+5 is valid, even though pretty much every compiler will produce the same code for both expressions. C99 updates the standard to explicitly allow &array[5]. See my answer for full details.

@Johannes Schaub - litb 2009-06-13 13:06:26

@jalf, also the whole text that note is in starts with "If an object of type T is located at an address A..." <- That says "The following text assumes there is an object at address A." So your quote doesn't (and can't, under this condition) say that there is always an object at address A.

@jalf 2009-06-13 14:09:51

True, on both points. I guess I should have read the full text before that note. ;) But yeah, I'm not sure either. Even if dereferencing it is well-defined, then obviously the state of the object you access is not. So you might be able to take the address of it, but nothing else really.

@Martin York 2009-06-13 19:09:37

Section 5.3.1.1 Unary operator '*': 'the result is an lvalue referring to the object or function'. Section 5.2.1 Subscripting: The expression E1[E2] is identical (by definition) to *((E1)+(E2)). By my reading of the standard here, there is no de-referencing of the resulting pointer.

@CB Bailey 2009-06-13 20:04:37

@litb: so are we saying that for a T* which points at one past the end of an array, that there is an object pointed to by the pointer - even if it's only a byte (which is an object) and not actually a T - and therefore unary * is well defined, returning an lvalue of type T but which may not actually be a complete T? Sounds like a plausible interpretation.

@Johannes Schaub - litb 2009-06-13 23:17:57

@Charles, yes that's what i think is going on. It would be all fine, as long as you don't try to read a value (lvalue->rvalue). If you would try, you would fall into 3.10/15 and 4.1/1. Thus, this would be well defined always: unsigned char c[1]; unsigned char c1 = c[1]; But this not always, because you don't know what might be located there besides that byte: float s[1]; float s1 = s[1]; But contrary, this is always fine, i think: s[1]; (no read happening).

@Crazy Eddie 2010-12-20 22:36:29

The selection you are citing in order to justify your answer is from a paragraph that is clearly explaining the TYPE of the pointer (array+size) and is NOT claiming that there is a valid object at that location that is legal to dereference.

@John Dibling 2012-11-07 21:01:59

@jalf: I know it's been a long time, but could you please reconsider this answer? I believe that this actually is Undefined Behavior -- see my new answer below.

@Martin York 2009-06-13 19:15:29

This is legal:

int array[5];
int *array_begin = &array[0];
int *array_end = &array[5];

Section 5.2.1 Subscripting The expression E1[E2] is identical (by definition) to *((E1)+(E2))

So by this we can say that array_end is equivalent to:

int *array_end = &(*((array) + 5)); // or &(*(array + 5))

Section 5.3.1.1 Unary operator '*': The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points. If the type of the expression is “pointer to T,” the type of the result is “T.” [ Note: a pointer to an incomplete type (other than cv void) can be dereferenced. The lvalue thus obtained can be used in limited ways (to initialize a reference, for example); this lvalue must not be converted to an rvalue, see 4.1. — end note ]

The important part of the above:

'the result is an lvalue referring to the object or function'.

The unary operator '*' is returning an lvalue referring to the int (no de-reference). The unary operator '&' then gets the address of the lvalue.
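Here is a minimal sketch of what "an lvalue referring to the object" means in practice. It uses only an in-bounds pointer, so it does not settle whether the same reasoning carries over to one past the end; the helper name and values are made up for the example.

```cpp
// Hypothetical helper (not from the original post): shows that unary *
// yields an lvalue that can be bound, written through, and re-addressed.
bool indirection_yields_lvalue() {
    int array[5] = {1, 2, 3, 4, 5};
    int* p = array + 2;  // in bounds: points to array[2]

    int& alias = *p;  // binds a reference to the lvalue; no value is read
    alias = 42;       // writes through the lvalue into array[2]

    int* round_trip = &*p;  // address of the lvalue is the original pointer

    return array[2] == 42 && round_trip == p;
}
```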

As long as there is no de-referencing of an out of bounds pointer then the operation is fully covered by the standard and all behavior is defined. So by my reading the above is completely legal.

The fact that a lot of the STL algorithms depend on the behavior being well defined is a sort of hint that the standards committee has already thought of this, and I am sure there is something that covers this explicitly.
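As a sketch of that dependence, here is how standard algorithms consume a half-open range whose end is the one-past-the-end pointer. The function names are hypothetical, and the end pointer is formed with array + 5 so the example does not rely on the disputed expression.

```cpp
#include <algorithm>
#include <numeric>

// Hypothetical helper (not from the original post): sums a 5-element array
// through the half-open range [array_begin, array_end).
int sum_array() {
    int array[5] = {1, 2, 3, 4, 5};
    int* array_begin = array;
    int* array_end = array + 5;  // compared against, never dereferenced
    return std::accumulate(array_begin, array_end, 0);
}

// Hypothetical helper: std::find returns the end pointer to signal "not found".
bool contains_value(int value) {
    int array[5] = {1, 2, 3, 4, 5};
    int* array_end = array + 5;
    return std::find(array, array_end, value) != array_end;
}
```

Both algorithms only compare against the end pointer, which is why the pointer has to be valid to form and compare even though nothing is ever stored there.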

The comment section below presents two arguments:

(please read: but it is long and both of us end up trollish)

Argument 1

this is illegal because of section 5.7 paragraph 5

When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integral expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i + n-th and i − n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

And though the section is relevant, it does not show undefined behavior. All the pointers we are talking about point either within the array or one past the end (which is well defined by the above paragraph).

Argument 2:

The second argument presented below is: * is the de-reference operator.
And though this is a common term used to describe the '*' operator; this term is deliberately avoided in the standard as the term 'de-reference' is not well defined in terms of the language and what that means to the underlying hardware.

Accessing the memory one beyond the end of the array is definitely undefined behavior, but I am not convinced the unary * operator accesses the memory (reads/writes to memory) in this context (not in a way the standard defines). In this context (as defined by the standard (see 5.3.1.1)) the unary * operator returns an lvalue referring to the object. In my understanding of the language this is not access to the underlying memory. The result of this expression is then immediately used by the unary & operator, which returns the address of the object referred to by the lvalue.

Many other references to Wikipedia and non canonical sources are presented. All of which I find irrelevant. C++ is defined by the standard.

Conclusion:

I am willing to concede there are many parts of the standard that I may not have considered and that may prove my above arguments wrong. None are provided below. If you show me a standard reference that shows this is UB, I will:

  1. Leave the answer.
  2. Put in all caps this is stupid and I am wrong for all to read.

This is not an argument:

Not everything in the entire world is defined by the C++ standard. Open your mind.

@Lightness Races in Orbit 2013-08-06 11:23:09

Why do you assert that there is no dereference? * is the dereference operator.

@Martin York 2013-08-06 11:38:21

@LightnessRacesinOrbit: the result is an lvalue **referring** to the object or function. Unless you actually read from the lvalue there is no de-referencing and therefore no undefined behavior. If you read from the lvalue (of one past the end) then you have undefined behavior, but if all you do is take its address then you are fine. The standards committee has asserted that taking the address of one past the end of a memory block does not lead to undefined behavior (as long as you don't look at the value).

@Martin York 2013-08-06 11:43:52

@LightnessRacesinOrbit: &array[5] => &*(array + 5) => (array + 5). No de-referencing here. There is only a de-reference if you actually read the value from the resulting reference.

@Lightness Races in Orbit 2013-08-06 11:58:47

According to whom? According to which passage? The * performs a dereference. It is the dereference operator. This is what it does. Arguably the fact that you then obtain a new pointer to the resulting value (using &) is irrelevant. You can't just present a sequence of evaluation, present the final expression semantics and pretend that the intermediate steps didn't happen (or that the language's rules did not apply to each).

@Martin York 2013-08-06 11:59:38

@LightnessRacesinOrbit: I quote the standard above: Section 5.3.1.1 Unary operator '*': The important part is the result is an lvalue **referring** to the object or function. The result of the * operator is an alias.

@Lightness Races in Orbit 2013-08-06 12:00:20

I quote from the same passage: "the result is an lvalue referring to the object or function to which the expression points". It is clear that if no such object exists, there is no behaviour defined for this operator. Your subsequent statement, "is returning a lvalue referring to the int (no de-reference)", is what makes no sense to me. Why do you think that this is not a dereference?

@Martin York 2013-08-06 12:02:48

@LightnessRacesinOrbit: Note '*' is not the dereference operator. It is the unary operator *. Calling it the de-reference operator does not change what it is actually doing. It returns a reference to what is being pointed at. Unless you read the value via the reference there is no de-reference. It is the act of reading (or writing) the value that will cause a de-reference.

@Lightness Races in Orbit 2013-08-06 12:04:38

"It returns a reference to what is being pointed at." What is this, if not a dereference? The passage says that * performs indirection, and indirection from pointer to pointee is called dereferencing. Your argument essentially asserts that pointers and references are the same thing or, at least, implicitly linked, which is simply not true. int x = 0; int* ptr = &x; int& y = *ptr; Here I dereference ptr. I don't need to use y for that to be true.

@Martin York 2013-08-06 12:05:46

@LightnessRacesinOrbit: I disagree that it is a de-reference. It is an alternative name (just like a reference variable). This int& y = *ptr is not a de-reference. You are returning a reference. Not a de-reference. If you had done int z = *ptr then that is a de-reference. The result of *ptr is a reference; the operator = then causes a read of the reference, which is a de-reference.

@Martin York 2013-08-06 12:10:19

@LightnessRacesinOrbit: I am happy to agree to disagree. Obviously few people agree with my opinion (hence only one vote).

@Martin York 2013-08-06 12:21:09

@LightnessRacesinOrbit: int& y = *ptr is covered by 8.5.3 References paragraph 5: "then the reference is bound directly to the initializer expression lvalue in the first case, and the reference is bound to the lvalue result of the conversion in the second case". This is not a de-reference; it is a binding of a reference. Which strengthens my argument that * is not a de-reference but returns a reference. I.e. it does not return the value, it returns a reference to the value. To get the value you must de-reference the reference.

@Lightness Races in Orbit 2013-08-06 12:50:15

"You are returning a reference. Not a de-reference." That just makes no sense. You can't "return a de-reference". Do you think a "de-reference" is a thing? It's not... a dereference operation takes a pointer and gives you the thing it points to. Whether you then use that object directly or initialise a reference from it is completely irrelevant. There is a reference binding, yes: the reference is bound to the result of your dereference operation. And there is no such thing as "de-referencing a reference". I don't see anyone else claiming the same mystical bizarrity to which you subscribe.

@Lightness Races in Orbit 2013-08-06 12:51:48

"Which strengthens my argument that * is not a de-reference but returns a reference. I.e. it does not return the value, it returns a reference to the value." No, * yields an lvalue that is the original object, not a reference. Read the passage that you cited in bold in your answer.

@Lightness Races in Orbit 2013-08-06 13:01:12

For the record, the standard has never explicitly defined "dereference" (instead relying on a wider understanding of the term, as in to follow a path of indirection from pointer to pointee), though it is non-normatively mentioned in the definition for unary *. In the most recent C++14 draft, all utterances of "dereference" are removed, because there was this standard bug in that no text was sufficiently clear for me to prove to you how horribly wrong you are.

@Martin York 2013-08-06 13:04:21

@LightnessRacesinOrbit: Let's not get personal. Sorry if you don't like arguing the point (I tend to like the discussion). I will stop if you want to (I am not that vested in this answer). It is hard to have this discussion in comments and I am not expressing myself well (as you seem to think I think a de-reference is a thing). I think if we got in a room together we could talk this through and you could probably convince me of your arguments. But currently I am not convinced. PS. I can't find the place in the standard that says accessing beyond the end is undefined behavior. Do you know?

@Martin York 2013-08-06 13:05:44

@LightnessRacesinOrbit: Do you know where it is?

@Lightness Races in Orbit 2013-08-06 13:06:43

It's within the definition for binary + (which is all-encompassing, since subscripting is defined in terms of binary +): [C++11: 5.7/5]: If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

@Martin York 2013-08-06 13:11:04

I was searching for "undefined behavior" I wish they would be consistent to make searching easier. Thanks. PS. I don't want to be an argumentative git (troll). I truly want to either work through the difference (I don't mind being proven wrong and have been many times (each time I learn something new)). I like my views being challenged. But I don't want to start a war over it. As I said I am not that vested in my argument (as it only has one vote).

@Lightness Races in Orbit 2013-08-06 13:12:28

If you're looking for instances of UB in the language, grepping the standard for "undefined behaviour" is a fool's errand in the first place, even if you were to use every permutation of the phrase. There are a gazillion things that invoke undefined behaviour in C++ merely by being unmentioned in the standard.

@Martin York 2013-08-06 13:16:37

@Lightness Races in Orbit: So I have found (searching for undefined behavior). I thought it would be easy to spot. The reference you gave me is not the correct one. This is about doing pointer maths, not about reading the element one past the end.

@Lightness Races in Orbit 2013-08-06 13:19:22

Access to array elements is defined in terms of pointer maths. You already quoted 5.2.1 in your answer which states this.

@Martin York 2013-08-06 13:35:08

@LightnessRacesinOrbit: Yes, I agree. But this does not indicate that accessing one past the end is illegal. This is about maths on the pointers and overflow, not about access. Maths of one past the end does not invoke overflow according to this: "If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined." So I don't think this is what we are looking for.

@Martin York 2013-08-06 13:43:30

@LightnessRacesinOrbit: So, in summary, we get arr[5] => *(arr + 5). We both agree that (arr+5) is legal (I hope). The sticking point is the *. You argue this is a de-reference operator. I argue this is the unary * operator that returns an lvalue referring to the object (which is why I don't like the term de-reference operator: it implies an action that does not exist). So the argument is really about whether returning an lvalue referring to an object causes a de-reference, correct? Or are we further apart than that?

@Lightness Races in Orbit 2013-08-06 13:49:06

Yes, that's the argument we've been having. I find it disingenuous to suggest that applying unary * to a pointer could be called "not a dereference", but because the standard never bothered to explicitly define "dereference" (instead relying on common sense that apparently is not as prevalent in the C++ community as one might hope) I can't prove this completely normal and widely understood term to you. That's also why they removed it from C++14 entirely. The point remains that dereferencing past-the-end is invalid, and I maintain that that is what you're doing with arr[5].

@Martin York 2013-08-06 14:00:35

@LightnessRacesinOrbit: So your argument is that * returns a de-referenced object (I hope I am not putting words in your mouth), which would make it illegal (I agree that if your interpretation is correct then it is an illegal operation). I on the other hand believe that * returns a lvalue referencing to an object (as said by all versions of the standard going back to n1804). I interpret this as a reference (this may be shaky). To me having a reference to an object is not the same as de-referencing it (and thus if my interpretation is correct the action is legal).

@Lightness Races in Orbit 2013-08-06 14:02:56

"I on the other hand believe that * returns a lvalue referencing to an object": Yes, correct. It got that by dereferencing its pointer operand. | "I interpret this as a reference": No, reference types are a completely different language feature. | "To me having a reference to an object is not the same as de-referencing it": That is correct.

@Martin York 2013-08-06 14:02:59

"I find it disingenuous to suggest that applying unary * to a pointer could be called 'not a dereference'." It has never been called de-referencing. It may be common slang. But the unary operator has always said that it returns an lvalue referring to the object. I find it pointless arguing about common slang as it is not in the standard.

@Lightness Races in Orbit 2013-08-06 14:03:22

"It has never been called de-referencing": Yes, it has been called de-referencing since time began, and is still called that now. Using & you perform indirection from an object to a pointer-to-that-object; using * you do the opposite, performing indirection from a pointer to its pointee, or dereferencing that pointer.

@Martin York 2013-08-06 14:05:07

OK. So our argument is around the phrase lvalue referring to an object. OK, I agree that using the term reference is not correct as it is overloaded and has meaning in a language context as well as a general computer science context. So let's not go down that road.

@Lightness Races in Orbit 2013-08-06 14:06:01

Yes, I think you've seen referring and assumed that means "references" are involved, which is not true. By the way, see C99 footnote 83: "Among the invalid values for dereferencing a pointer by the unary * operator are a null pointer, an address inappropriately aligned for the type of object pointed to, and the address of an object after the end of its lifetime." I'm not just making this stuff up!

@Martin York 2013-08-06 14:06:06

"* you do the opposite, performing indirection from a pointer to its pointee, or dereferencing that pointer": you are interpreting without reference to the standard.

@Lightness Races in Orbit 2013-08-06 14:07:08

The standard assumes a baseline understanding of computer terminology. It also does not define what "arithmetic" or "subtraction" means, or what a logarithm is. Or what "container" means in English. You are expected to know basic terms first. Wikipedia is not normative in proving a language behaviour but to demonstrate again that this is a commonly understood term, see en.wikipedia.org/wiki/Dereference_operator. I also invite you to read this comprehensive answer on the old question "What does 'dereferencing' a pointer mean?"

@Martin York 2013-08-06 14:08:47

@LightnessRacesinOrbit: Let's not get off topic with Wikipedia or arguing about * being a de-reference operator. Those are just side issues. The real disagreement we have is on the term lvalue referring to an object. So let me go read a bit more.

@Martin York 2013-08-06 14:09:47

Sorry. What is the disagreement? We were having two threads at the same time; I may have got confused.

@Lightness Races in Orbit 2013-08-06 14:09:59

Okay you may read. The phrase means that the result of dereferencing a pointer is an lvalue that aliases the original object. This is the same as in C. Conventionally I would have expected to get a reference instead, but this is not the case here.

@Lightness Races in Orbit 2013-08-06 14:10:16

No, there is only one thread. And I know you are confused. :)

@Martin York 2013-08-06 14:10:57

I would agree: lvalue that aliases the original object. You don't agree that this does not require a de-reference.

@Martin York 2013-08-06 14:12:47

@LightnessRacesinOrbit: I am still trying to be polite. If we want to be trolls and sarcastic that is easy to do. Can we agree that the argument is over: lvalue referring to an object? Or please state the argument in your terms.

@Martin York 2013-08-06 14:18:42

@LightnessRacesinOrbit: If your argument is that lvalue referring to an object is lvalue that aliases the original object, I agree. I would also point out that this is my whole point. It is an alias to the object, not the object, and thus no de-referencing. If you don't agree then I will go read. But now I have to go to work. I will pick up afterwards.

@Lightness Races in Orbit 2013-08-06 14:28:19

"and thus no-dereferencing" is a non-sequitur. It doesn't matter that you get an alias -- how else could it work? You do get "the original object". You simply get a fresh lvalue for it. This has nothing to do with the fact that you just dereferenced a pointer to get there. Look. Step 1: You have a pointer. Step 2: You dereference the pointer. Step 3: You now have an lvalue that refers (NB. similarity in words notwithstanding, this is not a C++ "reference") to the pointee object. Simples!

@Martin York 2013-08-06 15:44:43

@LightnessRacesinOrbit: Show me the standard for step 2. You are just making stuff up because you think * is a de-reference operator. That's your fundamental mistake. It's the unary * operator that returns an lvalue referring to an object. All quotes from the standard. There is nothing here that says de-reference. I think I have proved my point with your own words. It's just an alias, and the subsequent & operator takes the address of the alias and returns it as the result of the expression. If you don't agree then I think we are fundamentally going to have to agree to disagree.

@Lightness Races in Orbit 2013-08-06 17:02:33

You will have to refer to my previous comments, because I'm not going to repeat them all. It's not mentioned in the standard, I've explained this, and I've explained why that doesn't matter. Taking a pointer and retrieving its pointee — in any form — fundamentally involves a dereference. I provided several citations. End of story! Please read my comments.

@Mooing Duck 2013-08-06 17:25:25

@LokiAstari: I have a question, what do you think "dereferencing" means if not "calling the unary * operator that returns an lvalue referring to the object to which the expression points"? (Note that the subsequent sentence of the standard does refer to this process as dereferencing in the C++11 spec)

@Martin York 2013-08-09 12:55:56

@LightnessRacesinOrbit: You provide no standard citations. End of story.

@Martin York 2013-08-09 13:35:09

@MooingDuck: The "Undefined Behavior" that you are alluding to (attached to the ill-defined term de-referencing) is reading/writing to the actual memory. That is not required in &arr[5], which is why the standard uses the term lvalue referring to. This is why the operator * is not called the de-reference operator; it is called the unary * operator.

@Lightness Races in Orbit 2013-08-09 14:03:21

@Loki: I already explained why standard citations are not required. The C++ standard does not define the entirety of maths and technology, nor does it have to. I'm sorry to see that you still refuse to engage in this discussion in a professional manner.

@Martin York 2013-08-09 22:20:00

@LightnessRacesinOrbit: Seriously. It seems I am the only one quoting an authoritative source (the standard). Your argument is based on your opinion that the '*' is called the de-reference operator (even that does not explain why you think it is illegal) and thus has some magical properties that cause undefined behavior. Please quote the standard that shows undefined behavior.

@Lightness Races in Orbit 2013-08-09 23:44:46

@LokiAstari: I already explained that the standard doesn't list undefined behaviour. Rather, everything unstated is undefined. That's what undefined means. And I've never said those things you ascribe to me; when did I claim that operator* has "some magical properties that cause undefined behaviour"? I quoted the standard passage that causes out-of-bounds array access to be undefined (whether through dereferencing, subscripting, or whatever). Read my comments: I shan't respond to this thread any further until you have. You are the most frustrating person I've ever dealt with here.

@Lightness Races in Orbit 2013-08-09 23:46:28

The standard is a complex web of subtly linked rules, and you can't rationalise about it in the simple way you seem to be trying to.

@Martin York 2013-08-10 01:10:09

@LightnessRacesinOrbit: The standard is complex (yes I agree). But the question is well defined and very simple. It is easy to answer because you can look up every operator (all two of them). There is no undefined behavior. Your inability to show me a single reference that shows it is undefined basically shows you are simply trolling (I am willing to accept I could be wrong but there is no follow through on a reference). I have been polite and well behaved throughout this conversation (unlike you). So I think we can easily see who is being professional here.

@Lightness Races in Orbit 2013-08-10 01:11:21

@LokiAstari: I showed you references days and days ago; you simply refuse to acknowledge that they exist, for some reason. How you can justify this behaviour is beyond me, but you must be the one trolling.

@Martin York 2013-08-10 01:12:49

@LightnessRacesinOrbit: Quote it now then. Just one. I will delete this answer if it shows UB. Yes, you showed one reference that does not even apply to this situation. I am glad you provided the link again to just show your skill stackoverflow.com/questions/988158/… If you look it up, that reference is about pointer arithmetic. And in this case it shows the code above is valid.

@Lightness Races in Orbit 2013-08-10 01:13:29

@LokiAstari: I just linked you to the one I already gave you. Why didn't you follow that link, and read? Then you need to read all my other comments to find out why it's relevant. This is not a topic that you can prove with a ten-word soundbite. To summarise: your entire answer on why &*array[N] is okay hinges on the fantasy that * does not perform a "de-reference", which is a nonsense. We've fully covered why that is so, now, and I have provided all required supporting evidence.

@Lightness Races in Orbit 2013-08-10 01:17:42

@LokiAstari: [replying to your edit] Yes, I know the passage is about pointer arithmetic. As I already explained several comments up, that is wholly relevant since array subscripting is defined in terms of pointer arithmetic. We're going around in circles, as long as you're not listening to anything I say.

@Martin York 2013-08-10 01:18:48

@LightnessRacesinOrbit: I see and am so glad. Proving yourself wrong. <quote>if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined</quote>

@Martin York 2013-08-10 01:20:07

@LightnessRacesinOrbit: By that definition there is no UB. You are quoting the wrong part of the standard, as this has nothing to do with de-referencing here.

@Lightness Races in Orbit 2013-08-10 01:20:37

@LokiAstari: What on earth are you talking about? This entire question is about &array[N]. This is equivalent to &(*(array+N)). How is dereferencing not relevant?

@Martin York 2013-08-10 01:21:10

@LightnessRacesinOrbit: What are you talking about as you obviously did not read it.

@Lightness Races in Orbit 2013-08-10 01:21:41

@Loki: I can only assume at this point that you are just deliberately trolling me, as no-one could possibly be this obtuse.

@Martin York 2013-08-10 01:22:33

@LightnessRacesinOrbit: You obviously either do not understand the quote or are basing your opinion on a misunderstanding of what the name of the operator is without actually understanding what it does. Please for the love of god read the standard.

@Lightness Races in Orbit 2013-08-10 01:22:55

@LokiAstari: I read the standard almost every day. Why don't you take a moment to read the comments in this thread where the standard has been analysed in detail. You keep repeating "your reference proves my point" over and over, when it does no such thing. It's impossible to discuss anything with you. I'm done here now. I wish you luck in coming to understand what "dereference" means.

@Martin York 2013-08-10 01:23:11

@LightnessRacesinOrbit: The only person here that is trolling is you. As the only standard reference you provided proves my point.

@Martin York 2013-08-10 01:24:54

You keep saying that it's about de-referencing, but the only standard reference you bring out is about pointer arithmetic. Which has nothing to do with the point you are trying to make. And there is no UB in terms of pointer arithmetic.

@Lightness Races in Orbit 2013-08-10 01:26:15

@Loki: Just so that you're aware, since you clearly have no clue, we were talking about your bizarre assertion that * does not perform dereferencing. (Read the first comment.) If you think that pointer arithmetic has nothing to do with dereferencing array positions, then you're out of your mind and I can't do anything to help you.

@Martin York 2013-08-10 01:28:20

@LightnessRacesinOrbit: Show me a standard reference that says '*' performs de-referencing. That is the only point you have to make and I will believe you. But that is not the part of the standard you are quoting. You have to keep your arguments coherent; currently you are rambling between two different points. Pointer arithmetic is valid (as proved by your reference to the standard). But no standard quote about de-referencing.

@Lightness Races in Orbit 2013-08-10 01:28:22

I have requested that all these comments be moved to chat, mainly so that I am not repeatedly tempted by the SO notifications system to continue to be drawn into this utter insanity.

@Lightness Races in Orbit 2013-08-10 01:29:26

There is no standard quote about dereferencing, and it does not matter. I have explained this in great detail in my comments above. Read. Them. Please. Do it now, before replying again. Everybody else in the entire world knows that * performs dereferencing. It's just you.

@Martin York 2013-08-10 01:30:07

No point. Show me a reference in the standard that '*' actually is a de-reference. Otherwise there is nothing to talk about.

@Lightness Races in Orbit 2013-08-10 01:30:27

@Loki: No point? No point in reading the answer to your question? That demonstrates categorically that you have no interest in a proper discussion, and have just been trolling me all along.

@Martin York 2013-08-10 01:32:34

@LightnessRacesinOrbit: Yes. Your continuous refusal to show a relevant quote from the standard. Your constant trolling just shows you have nothing to talk about and are doing this solely to stoke controversy. Show me a quote!!!!! from the standard (I have shown you that section 5.7 (5) is not relevant to this argument).

@Lightness Races in Orbit 2013-08-10 01:33:47

@Loki: I told you, again, again, again, again, again, that there is none. I told you why this does not matter. I told you why the term is clear regardless. I gave another example that "arithmetic" is not defined in the standard either, yet you accept that without question. Why not this? There is 100% universal acceptance on what dereferencing means in C and C++, everybody but you. I proved that with several links. You ignored all of it.

@Martin York 2013-08-10 01:35:03

@LightnessRacesinOrbit: So this argument boils down to you told me your reason (without quoting a relevant point in the standard). While I keep quoting you the standard. Makes you totally correct. Yes

@Martin York 2013-08-10 01:35:20

@LightnessRacesinOrbit: Good Night.

@Lightness Races in Orbit 2013-08-10 01:35:21

@Loki: Not everything in the entire world is defined by the C++ standard. Open your mind. Good night.

@Martin York 2013-08-10 01:36:27

@LightnessRacesinOrbit: We are arguing about C++ not philosophy. The standard defines the language. If your argument is "It's obvious" you have no argument. Show me a standard quote.

@Mooing Duck 2013-08-10 02:37:12

@LokiAstari: You misspelled my name so I missed the ping. I made no mention of undefined behavior whatsoever. What I asked was what do you think "dereferencing" means if not "calling the unary * operator that returns an lvalue referring to the object to which the expression points"?. That was meant as an actual question (which has been asked by several people now) which you have not answered.

@Martin York 2013-08-10 06:46:28

@MooingDuck: What I think does not matter. What does the standard say "dereferencing" means? <quote>The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type and the result is an lvalue referring to the object to which the expression points. If the type of the expression is “pointer to T,” the type of the result is “T.” [this lvalue must not be converted to a prvalue, see 4.1. —end note ]</quote>

@Martin York 2013-08-10 06:59:05

@MooingDuck: I believe UB (in this context) is accessing memory (read/write) beyond the end of the array. I see nothing that indicates memory accesses (read/write) in the above statement. I don't particularly want to argue further unless you have a clause from the standard that we can discuss (about either (a) why this is UB or (b) a definition of "dereferencing").

@Martin York 2013-08-10 07:08:20

@MooingDuck: PS. I apologize for misspelling your name.

@Mooing Duck 2013-08-10 15:45:48

@LokiAstari: I think it isn't UB, I agree with you 100%, other than the definition of "dereference". My point is (A) The standard uses but does not define the word dereference. From this we must assume some definition exists that is not quoted from the standard. (B) A tentative definition has been put forth as matching the colloquial definition by Lightness. (C) No one has suggested an alternative definition. (D) You must agree there is a definition, but refuse to agree on any suggestion??? Hopefully you can see why Lightness is upset here :P

@milleniumbug 2013-08-10 18:06:09

From the quote Loki posted "[ Note: a pointer to an incomplete type (other than cv void) can be dereferenced. The lvalue thus obtained can be used in limited ways (to initialize a reference, for example); this lvalue must not be converted to an rvalue, see 4.1. — end note ]" (emphasis mine)

@Mooing Duck 2013-08-11 02:44:20

@LokiAstari: The term "dereference" is used 77 times in the C++ standard. As for Section 5.3.1.1 that you refer to, that section contains: "A pointer to an incomplete type (other than cv void) can be dereferenced". This is still in the paragraph detailing Unary operator *.

@Martin York 2013-08-11 04:37:03

If you must have a definition of "de-reference", why not the one provided by the standard for the unary * operator? It defines the 'unary * operator' (which is apparently de-referencing) as "returning an lvalue referring (i.e. an alias) to an object". So by extension a de-reference expression is an expression that "returns an lvalue referring (i.e. an alias) to an object".

@David Stone 2013-12-07 16:54:16

I think I'm going to search around a bit and see if there is any standard language around operator* and volatile. volatile states that reads from memory are visible, so if using operator* on a pointer to volatile is considered visible (regardless of what you do with the result), then it seems that this answer is wrong on a consistency basis.

@Lightness Races in Orbit 2014-01-01 22:13:28

The latest working draft has replaced utterances of "dereference" with "performs indirection". So not only are the two obviously intended to be synonymous, but this is now going to be "fixed" in the standard such that people like Loki can finally understand. :)

@Lightness Races in Orbit 2014-01-01 22:54:38

@Martin York 2014-01-01 23:22:54

@LightnessRacesinOrbit: Your point being. Is there anything here that changes your or my arguments?

@Lightness Races in Orbit 2014-01-01 23:26:17

It backs mine up. The committee is demonstrating that the term "dereference" means what I said it means. They have helpfully disambiguated it for you. I just thought you might be interested. No need to get defensive.

@Martin York 2014-01-01 23:31:22

@LightnessRacesinOrbit: You are going to have to be more specific. So which sentence in the standard changes so that your argument is prevalent. Has this changed? unary * operator returns a lvalue referring to the object if not then nothing relevant to my argument changed.

@Lightness Races in Orbit 2014-01-01 23:38:13

You could read the comments again and see that I never disputed what unary operator* does. Yes, it returns an lvalue referring to the object. It does that by dereferencing, and that has been proven. I'm not going to get into this again with you, though. I just thought you would find the link interesting.

@Lightness Races in Orbit 2014-01-05 19:22:54

@curiousguy 2016-06-09 03:15:17

A clarification is needed here: *p is not a reference; no expression has reference type and no operator returns a reference. *p is an lvalue of non-reference type; a function declared with a reference return type gives you an lvalue of non-reference type when called.

@JohnB 2013-12-16 05:36:57

It should be undefined behaviour, for the following reasons:

  1. Trying to access out-of-bounds elements results in undefined behaviour. Hence the standard does not forbid an implementation throwing an exception in that case (i.e. an implementation checking bounds before an element is accessed). If & (array[size]) were defined to be begin (array) + size, an implementation throwing an exception in case of out-of-bound access would not conform to the standard anymore.

  2. It's impossible to make this yield end (array) if array is not an array but rather an arbitrary collection type.
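Both points can be illustrated with a user-defined collection standing in for a bounds-checking implementation. This is only a sketch under stated assumptions: CheckedArray, address_of_element_throws and demo_checked_array are hypothetical names invented for the illustration, and a user-defined operator[] is an analogy for, not a model of, what the standard says about built-in arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical bounds-checked array: operator[] refuses any index >= N,
// including the one-past-the-end index N.
template <typename T, std::size_t N>
struct CheckedArray {
    T data[N];

    T& operator[](std::size_t i) {
        if (i >= N) throw std::out_of_range("index out of bounds");
        return data[i];
    }

    T* begin() { return data; }
    T* end() { return data + N; }  // formed by pointer arithmetic only
};

// Hypothetical helper: reports whether forming &a[i] throws.
template <typename T, std::size_t N>
bool address_of_element_throws(CheckedArray<T, N>& a, std::size_t i) {
    try {
        T* p = &a[i];  // operator[] runs first, then & is applied to its result
        (void)p;
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Hypothetical demo: the asserts state what this sketch expects.
inline void demo_checked_array() {
    CheckedArray<int, 5> a{};
    assert(a.end() - a.begin() == 5);          // end() is fine
    assert(address_of_element_throws(a, 5));   // &a[5] goes through the check
    assert(!address_of_element_throws(a, 4));  // &a[4] is an ordinary element
}
```

For such a type, &a[size] cannot quietly become begin + size, because the subscript is evaluated before the address is taken.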

@rlbond 2009-06-12 18:14:31

Even if it is legal, why depart from convention? array + 5 is shorter anyway, and in my opinion, more readable.

Edit: If you want it to be symmetric you can write

int* array_begin = array; 
int* array_end = array + 5;

@Zan Lynx 2009-06-12 23:45:49

I think that the style I use in the question looks more symmetrical: the array declaration and the begin/end pointers, or sometimes I pass those directly to an STL function. That is why I use it instead of the shorter version.

@Zan Lynx 2010-03-16 18:44:12

To be symmetrical I think it'd need to be array_begin = array + 0; array_end = array + 5; How's that for a long delayed comment response?

@rlbond 2010-03-16 20:56:26

It might be a world record :)

@codymanix 2009-06-15 15:12:03

It is perfectly legal.

The vector<> template class from the stl does exactly this when you call myVec.end(): it gets you a pointer (here as an iterator) which points one element past the end of the array.

@Ben Voigt 2014-10-10 01:54:44

But it does so via pointer arithmetic, not by forming a reference to past-the-end and then applying the address-of operator to that lvalue.
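The distinction drawn in this comment can be sketched for std::vector as follows. The helper names past_the_end, count_by_walking and demo_vector_end are hypothetical, and the sketch does not claim to show how any particular standard library implements end(); it only shows a past-the-end pointer formed by arithmetic on data(), with no subscript involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: one-past-the-end pointer formed purely by
// pointer arithmetic on data(); v[v.size()] is never evaluated.
inline const int* past_the_end(const std::vector<int>& v) {
    return v.data() + v.size();
}

// Hypothetical helper: counts elements by walking the half-open
// range [data(), past_the_end). The end pointer is compared, never read.
inline std::size_t count_by_walking(const std::vector<int>& v) {
    std::size_t n = 0;
    for (const int* p = v.data(); p != past_the_end(v); ++p) {
        ++n;
    }
    return n;
}

// Hypothetical demo: the asserts state what this sketch expects.
inline void demo_vector_end() {
    std::vector<int> v{1, 2, 3};
    assert(past_the_end(v) - v.data() == 3);
    assert(count_by_walking(v) == 3);
}
```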

@David Thornley 2009-06-12 22:11:08

C++ standard, 5.19, paragraph 4:

An address constant expression is a pointer to an lvalue....The pointer shall be created explicitly, using the unary & operator...or using an expression of array (4.2)...type. The subscripting operator []...can be used in the creation of an address constant expression, but the value of an object shall not be accessed by the use of these operators. If the subscripting operator is used, one of its operands shall be an integral constant expression.

Looks to me like &array[5] is legal C++, being an address constant expression.

@CB Bailey 2009-06-12 22:28:41

I'm not sure that the original question is necessarily talking about an array with static storage. Even if it is I wonder if &array[5] isn't an address constant expression precisely because it doesn't point to an lvalue designating an object?

@David Thornley 2009-06-13 15:40:08

I don't think it matters whether the array is static or stack-allocated.

@CB Bailey 2009-06-13 19:52:30

It does if you're referencing 5.19. The part that you elided with ... says "... designating an object of static storage duration, a string literal or a function. ...". This means that if your expression involves a stack allocated array you can't use 5.19 to reason about the validity of those expressions.

@M.M 2016-02-13 00:06:11

Your quote is saying that if &array[5] is legal, and referred to a static storage duration array, then that would be an address constant. (Compare with &array[99] for example, no text in this paragraph distinguishes between those two cases).

@Matthew Flaschen 2009-06-12 18:22:33

Working draft (n2798):

"The result of the unary & operator is a pointer to its operand. The operand shall be an lvalue or a qualified-id. In the first case, if the type of the expression is “T,” the type of the result is “pointer to T.”" (p. 103)

array[5] is not a qualified-id as best I can tell (the list is on p. 87); the closest would seem to be identifier, but while array is an identifier array[5] is not. It is not an lvalue because "An lvalue refers to an object or function. " (p. 76). array[5] is obviously not a function, and is not guaranteed to refer to a valid object (because array + 5 is after the last allocated array element).

Obviously, it may work in certain cases, but it's not valid C++ or safe.

Note: It is legal to add to get one past the array (p. 113):

"if the expression P [a pointer] points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow"

But it is not legal to do so using &.

@Matthew Flaschen 2009-06-12 18:40:37

Care to explain the down-vote?

@Johannes Schaub - litb 2009-06-12 19:16:10

I upvoted you, because you are correct. There is no object guaranteed to be located at the past-the-end location. The person that downvoted you probably misunderstood you (you sound like you say any array-index-op refers to no object at all). I think here is an interesting thing: It is an lvalue, but it also does not refer to an object. And so here is a contradiction to what the standard says. And so, this yields undefined behavior :) This is also related to this one: open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#232

@jalf 2009-06-12 19:25:44

@litb: According to 3.9.2:3, there is "an unrelated object of the array's element type" at the past-the-end location. Doesn't that mean that the result of array[5] is an lvalue?

@Johannes Schaub - litb 2009-06-12 19:28:32

@jalf, the note says "that might be located at that address". It's not guaranteed that there is one located :)

@Matthew Flaschen 2009-06-12 21:44:49

litb, thanks. However, I still think it is not an lvalue. Because there is no object guaranteed to be at array[5], array[5] can not legally /refer/ to an object. Thus (see def. quoted in my answer), it is not an lvalue, and &array[5] is illegal. I also see that Bill Gibbons says on the Active Issues page, "dereferencing a pointer to the end of an array should be allowed as long as the value is not used", but they do not claim it /is/ allowed (and I'm not sure I agree it /should/ be).

@CB Bailey 2009-06-12 21:49:11

The standard says that the result of op* must be an lvalue, but it only says what that lvalue is if the operand is a pointer which actually points to an object. That would imply (bizarrely) that if one past the end didn't happen to point at a suitable object, that the implementation would have to find a suitable lvalue from somewhere else and use that. That really would mess up &array[sizeof array]!

@Johannes Schaub - litb 2009-06-12 21:58:04

"However, I still think it is not an lvalue. Because there is no object guaranteed to be at array[5], array[5] can not legally /refer/ to an object." <- That is exactly why I think it is undefined behavior: It relies on some behavior not explicitly specified by the standard, and thus falls within 1.3.12[defns.undefined]

@Matthew Flaschen 2009-06-12 22:48:54

litb, fair enough. Let's say it's /not definitely/ an lvalue, and thus /definitely not/ 100% safe.

@CB Bailey 2009-06-12 18:31:58

I don't believe that it is illegal, but I do believe that the behaviour of &array[5] is undefined.

  • 5.2.1 [expr.sub] E1[E2] is identical (by definition) to *((E1)+(E2))

  • 5.3.1 [expr.unary.op] unary * operator ... the result is an lvalue referring to the object or function to which the expression points.

At this point you have undefined behaviour because the expression ((E1)+(E2)) didn't actually point to an object, and the standard only says what the result should be when it does.

  • 1.3.12 [defns.undefined] Undefined behaviour may also be expected when this International Standard omits the description of any explicit definition of behaviour.

As noted elsewhere, array + 5 and &array[0] + 5 are valid and well defined ways of obtaining a pointer one beyond the end of array.
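A minimal sketch of the uncontroversial spellings mentioned here, all of which form the past-the-end pointer by pointer arithmetic and never apply unary * to it. The function name end_spellings_agree is hypothetical, and std::end assumes C++11 or later.

```cpp
#include <cassert>
#include <iterator>

// Hypothetical helper: forms the one-past-the-end pointer in several ways
// that only use pointer arithmetic, and reports whether they all agree.
inline bool end_spellings_agree() {
    int array[5] = {0, 1, 2, 3, 4};

    int* a = array + 5;        // array-to-pointer conversion, then + 5
    int* b = &array[0] + 5;    // address of a real element, then + 5
    int* c = &array[4] + 1;    // address of the last element, then + 1
    int* d = std::end(array);  // library spelling (C++11)

    assert(a - array == 5);    // distance from the first element
    return a == b && b == c && c == d;
}
```

None of these lines contains the disputed &array[5] expression, so they sidestep the question of whether unary * is well defined on a past-the-end pointer.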

@Richard Corden 2009-06-12 18:41:52

The key point is: "the result of '*' is an lvalue". From what I can tell, it only becomes UB iff you have an lvalue to rvalue conversion on that result.

@CB Bailey 2009-06-12 18:52:55

I would contend that, as the result of '*' is only defined in terms of the object to which its operand expression points, it is undefined - by omission - what the result is if the expression didn't have a value which actually referred to an object. It's far from clear, though.

@Aditya Sehgal 2009-06-12 18:26:54

If your example is NOT a general case but a specific one, then it is allowed. You can legally, AFAIK, move one past the allocated block of memory. It does not work for a generic case though, i.e. where you are trying to access elements farther than one past the end of an array.

Just searched C-Faq : link text

@Aditya Sehgal 2009-06-12 18:40:32

the top answer says "its legal" and I also say the same thing. Why the down vote then :). Is something wrong with my answer?

@Adam Rosenfield 2009-06-12 18:57:57

Yes, it's legal. From the C99 draft standard:

§6.5.2.1, paragraph 2:

A postfix expression followed by an expression in square brackets [] is a subscripted designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th element of E1 (counting from zero).

§6.5.3.2, paragraph 3 (emphasis mine):

The unary & operator yields the address of its operand. If the operand has type ‘‘type’’, the result has type ‘‘pointer to type’’. If the operand is the result of a unary * operator, neither that operator nor the & operator is evaluated and the result is as if both were omitted, except that the constraints on the operators still apply and the result is not an lvalue. Similarly, if the operand is the result of a [] operator, neither the & operator nor the unary * that is implied by the [] is evaluated and the result is as if the & operator were removed and the [] operator were changed to a + operator. Otherwise, the result is a pointer to the object or function designated by its operand.

§6.5.6, paragraph 8:

When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

Note that the standard explicitly allows pointers to point one element past the end of the array, provided that they are not dereferenced. By 6.5.2.1 and 6.5.3.2, the expression &array[5] is equivalent to &*(array + 5), which is equivalent to (array+5), which points one past the end of the array. This does not result in a dereference (by 6.5.3.2), so it is legal.

@CB Bailey 2009-06-12 19:32:57

Interesting, so it's legal and explicitly well defined in C which may be different from C++ (see other discussions!).

@Matthew Flaschen 2009-06-12 21:34:18

He explicitly asked about C++. This is the kind of subtle difference that cannot be relied on when porting between the two.

@CB Bailey 2009-06-12 21:42:14

He asked about both: "It would also be interesting to know if it meets the C standard."

@Adam Rosenfield 2009-06-12 22:00:31

@Matthew Flaschen: The C++ standard incorporates the C standard by reference. Annex C.2 contains a list of changes (incompatibilities between ISO C and ISO C++), and none of the changes relate to these clauses. Hence, &array[5] is legal in C and C++.

@CB Bailey 2009-06-12 22:20:54

The C standard is a normative reference in the C++ standard. That means that provisions in the C standard that are referenced by the C++ standard are part of the C++ standard. It does not mean that everything in the C standard applies. In particular Annex C is informative, not normative, so just because a difference isn't highlighted in this section doesn't mean that the C 'version' applies to C++.

@Johannes Schaub - litb 2009-06-13 13:01:24

I wonder whether it's worth submitting an issue report, asking them to support the &*-is-noop semantics. If the operand is an incomplete class type that has operator& overloaded, using operator& is already undefined, so they don't even have to change anything, I think. They just have to introduce this no-op rule, as a syntactical transformation. I think this will greatly reduce the current problems.

@Richard Corden 2009-06-16 16:05:23

@Charles Bailey: There's a difference between the C standard (which is probably C89, or C90?) and C99, which was standardised after the first C++ standard (i.e. C++98). IMHO, the C++ committee has tried to incorporate C99 fixes and additions where possible, but sometimes it just seems that C99 has solved problems in ways that make compatibility difficult at best. Either way, what you say does not apply to C99, only to the earlier standard.

@Adam Rosenfield 2009-06-16 17:36:56

C89 was the C standard published by ANSI in 1989; C90 was the C standard published by ISO in 1990. They are essentially identical; I don't know if they are 100% identical. In any case, though, you're right -- the current C++ standard, C++03, refers to C90, not to C99. I don't know if the next C++ standard, C++0x, will refer to C90 or C99.

@Richard Corden 2009-06-17 07:49:13

@Adam: Thanks for pointing that out. I have never quite been sure of C's history. Re C++0x referring to C99: from my little knowledge of the changes in C99, I'm pretty sure that C++ will continue to refer to C89/90 and will cherry pick the "desirable" changes from C99 on a case by case basis. This question/answer is a good example of this. I'd say that C++ will continue to use the "no lvalue-to-rvalue conversion, therefore no undefined behaviour" approach, rather than integrating the "&* == no-op" wording.

@Richard Corden 2009-06-17 07:51:56

@litb: It's already undefined behaviour to use unary-& on an incomplete class type that later declares a member "operator&". 5.3.1/4 says: "The address of an object of incomplete type can be taken, but if the complete type of that object is a class type that declares operator&() as a member function, then the behavior is undefined (and no diagnostic is required)."

@John Dibling 2012-11-07 21:03:56

Adam, please reconsider this. I believe this is Undefined Behavior because, according to the passage you quote, the out-of-bounds pointer has already been dereferenced (by definition). See my answer below.

@PoweredByRice 2017-04-21 02:05:11

"This does not result in a dereference". YES it does. &*(array + 5) it's the * operator. You cannot count on the compiler to optimize &* out. &*(array + 5) is certainly not equivalent to (array+5)

@Adam Rosenfield 2017-04-21 20:41:34

@PoweredByRice: It's legal in C99, please read the quoted passage of the standard above where it explicitly says that neither the & nor the * operators are evaluated. C++ is different. C++11 (which was still being drafted at the time this answer was originally written) does not have a similar clause, from what I can find.

@Todd Gardner 2009-06-12 18:42:36

In addition to the above answers, I'll point out that operator& can be overloaded for classes. So even if it was valid for PODs, it probably isn't a good idea to do for an object you know isn't valid (much like overloading operator&() in the first place).
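A small sketch of why an overloaded operator& matters for element types. The class Odd and the helper names are hypothetical, and std::addressof assumes C++11 or later; this only shows that &x need not mean "take the address" for a class type.

```cpp
#include <cassert>
#include <memory>

// Hypothetical class whose overloaded operator& does not return its address.
struct Odd {
    int value;
    Odd* operator&() { return nullptr; }  // deliberately unhelpful
};

// Hypothetical helper: true when &o goes through the overload
// instead of yielding the object's real address.
inline bool builtin_address_is_bypassed(Odd& o) {
    return &o == nullptr;
}

// Hypothetical helper: std::addressof ignores the overload.
inline Odd* real_address(Odd& o) {
    return std::addressof(o);
}

// Hypothetical demo: the asserts state what this sketch expects.
inline void demo_overloaded_address_of() {
    Odd o{42};
    assert(builtin_address_is_bypassed(o));
    assert(real_address(o) != nullptr);
    assert(real_address(o)->value == 42);
}
```

With an element type like Odd, &array[i] calls the overload even for a valid index, which is one more reason to prefer array + i when only a pointer is wanted.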

@David Rodríguez - dribeas 2009-06-12 19:28:32

+1 on bringing operator& into discussion, even if experts recommend never overloading it as some STL containers depend on it returning a pointer to the element. It is one of those things that got into the standard before they knew better.

@Richard Corden 2009-06-12 18:40:31

I believe that this is legal, and it depends on the 'lvalue to rvalue' conversion taking place. The last line of core issue 232 has the following:

We agreed that the approach in the standard seems okay: p = 0; *p; is not inherently an error. An lvalue-to-rvalue conversion would give it undefined behavior

Although this is a slightly different example, what it does show is that the '*' does not result in an lvalue-to-rvalue conversion, and so, given that the expression is the immediate operand of '&', which expects an lvalue, the behaviour is defined.

@CB Bailey 2009-06-12 19:25:17

+1 for the interesting link. I'm still not sure that I agree that p=0;*p; is well defined as I'm not convinced that '*' is well defined for an expression whose value is not a pointer to an actual object.

@David Thornley 2009-06-12 22:01:58

A statement that's an expression is legal, and means to evaluate that expression. *p is an expression that invokes undefined behavior, so anything the implementation does is according to the standard (including emailing your boss, or downloading baseball statistics).

@musiphil 2015-07-02 23:32:34

Note that the status of that issue is still "drafting" and it hasn't made it into the standard (yet), at least those draft versions of C++11 and C++14 I could find.

@Tyler McHenry 2009-06-12 18:19:36

It is legal.

According to the gcc documentation for C++, &array[5] is legal. In both C++ and in C you may safely address the element one past the end of an array - you will get a valid pointer. So &array[5] as an expression is legal.

However, it is still undefined behavior to attempt to dereference pointers to unallocated memory, even if the pointer points to a valid address. So attempting to dereference the pointer generated by that expression is still undefined behavior (i.e. illegal) even though the pointer itself is valid.

In practice, I imagine it would usually not cause a crash, though.

Edit: By the way, this is generally how the end() iterator for STL containers is implemented (as a pointer to one-past-the-end), so that's a pretty good testament to the practice being legal.

Edit: Oh, now I see you're not really asking if holding a pointer to that address is legal, but if that exact way of obtaining the pointer is legal. I'll defer to the other answerers on that.
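For comparison, the one-past-the-end pointer discussed in this answer can also be formed without subscripting at all. The following is a small sketch; the function names are made up for illustration, and std::end assumes C++11 or later. It uses array + 5, which is the form nobody in this thread disputes, and never dereferences the end pointer.

```cpp
#include <iterator>  // std::end
#include <numeric>   // std::accumulate

// Sums the array using a [first, last) pointer range.
int sum_with_end_pointer() {
    int array[5] = {1, 2, 3, 4, 5};
    int* first = array;     // array decays to a pointer to its first element
    int* last = array + 5;  // one past the end: formed by arithmetic, never dereferenced
    return std::accumulate(first, last, 0);
}

// Checks that std::end on a built-in array yields the same pointer.
bool std_end_matches_arithmetic() {
    int array[5] = {};
    return std::end(array) == array + 5;
}
```

Writing array + 5 (or std::end(array)) sidesteps the question of whether &array[5] is well defined, because no '*' is applied to the end pointer.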

@Tyler McHenry 2009-06-12 18:33:50

I'd say you're correct, if and only if the C++ spec does not say that &* must be treated as a no-op. I'd imagine it probably does not say that.

@Evan Teran 2009-06-12 18:34:16

The page you reference (correctly) says that it is legal to point one past the end. &array[5] technically first dereferences (array + 5), then takes the address of the result again. So it is technically like this: (&*(array + 5)). Fortunately, compilers are smart enough to know that &* can be factored to nothing. However, they don't have to do that; therefore, I'd say it is UB.

@Richard Corden 2009-06-12 18:44:29

@Evan: There's more to this. Check out the last line of core issue 232: std.dkuug.dk/JTC1/SC22/WG21/docs/cwg_active.html#232. The last example there just looks wrong - but they clearly explain that the distinction is on the "lvalue-to-rvalue" conversion, which in this case doesn't take place.

@Evan Teran 2009-06-12 18:46:55

@Richard: interesting, seems there is some debate on the subject. I'd even agree that it should be allowed :-P.

@Martin York 2009-06-12 19:13:10

@Evan Teran: No, it does not de-reference the member unless you try to read or write to the area. Think of it as a reference to the member: it will not be de-referenced unless you try to obtain the value or change the value. Taking the address does not cause a read or write and thus does not de-reference the value.

@Johannes Schaub - litb 2009-06-12 19:21:44

It's the same kind of undefined behavior as the "reference-to-NULL" thing people kept discussing, where seemingly everyone voted up the answer saying "it is undefined behavior".

@Johannes Schaub - litb 2009-06-12 20:07:58

@Richard, note also that they agree so far that the difference should be an lvalue to rvalue conversion. But they find that this is not well reflected in the Standard. The same issue report can be found here which has the other points they noted included (including the concept of an "empty lvalue"): open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#232

@Richard Corden 2009-06-16 08:15:53

@litb: Agreed. So the conclusion is that the standard performs badly here, but when you take into account the behaviour of C99 (&* == no-op), and the general comments from the committee it seems clear that the behaviour here is supposed to be well defined. Probably the only remaining question to ask is: are there any compilers that, given '&*p', first attempt to read the value in '*p'?

@Steve Jessop 2014-02-18 10:29:16

@RichardCorden: there are other possible problems if it is UB. Are there any compilers that see int array[5]; &array[5];, apply a compile-time bounds check to the sub-expression array[5] and refuse to compile it? If it does have UB they are entitled to do this, although given the legality in C and the fact that people rely on it, it would probably not be the most popular error that compiler implemented ;-)

@Richard Corden 2014-02-23 16:56:22

@SteveJessop: The C++ committee is made up of a lot of compiler vendors (eg. Microsoft, Clang, GCC, EDG and more), which is why I feel the note against 232 is important. Also 5.3.1/1 of a recent draft has a note on incomplete types. The result is that the following is legal: void foo (class A * pA) { A & a (*pA); }. The key is that the '*pA' in other contexts would be illegal, but as there isn't a conversion to a 'prvalue' it's OK here. I really do believe the intent is that it's the same for &array[N].
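The incomplete-type example in this comment can be fleshed out into something compilable. This is only a sketch: the function names as_reference and round_trips and the member x are invented here, and the pointer passed in is assumed to point to a real object, so it shows the incomplete-type case only, not the one-past-the-end case.

```cpp
class A;  // A is incomplete at this point

// *pA is an lvalue of incomplete type. Binding a reference to it does not
// read the object, so this compiles even though A's definition is unknown.
A& as_reference(A* pA) {
    A& a(*pA);
    return a;
}

// A is completed only after as_reference has been defined.
class A {
public:
    int x = 7;
};

// Confirms the reference refers to the very object that was passed in.
bool round_trips() {
    A obj;
    return &as_reference(&obj) == &obj;
}
```

An expression such as `int y = (*pA).x;` placed inside as_reference would not compile there, because it needs the complete type, which is the contrast the comment is drawing.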

@Spikatrix 2015-09-12 09:25:20

Your first link is dead.

@M.M 2016-02-13 00:09:27

The GCC documentation only documents the behaviour of gcc.
