By janojlicz

2019-05-15 17:03:57 8 Comments

Going through some C interview questions, I've found a question stating "How to find the size of an array in C without using the sizeof operator?", with the following solution. It works, but I cannot understand why.

#include <stdio.h>

int main() {
    int a[] = {100, 200, 300, 400, 500};
    int size = 0;

    size = *(&a + 1) - a;
    printf("%d\n", size);

    return 0;
}

As expected, it returns 5.

edit: people pointed out this answer, but the syntax does differ a bit, i.e. the indexing method

size = (&arr)[1] - arr;

so I believe both questions are valid and have a slightly different approach to the problem. Thank you all for the immense help and thorough explanation!


@John Bode 2019-05-15 19:09:57

When you add 1 to a pointer, the result is the location of the next object in a sequence of objects of the pointed-to type (i.e., an array). If p points to an int object, then p + 1 will point to the next int in a sequence. If p points to a 5-element array of int (in this case, the expression &a), then p + 1 will point to the next 5-element array of int in a sequence.

Subtracting two pointers (provided they both point into the same array object, or one is pointing one past the last element of the array) yields the number of objects (array elements) between those two pointers.

The expression &a yields the address of a, and has the type int (*)[5] (pointer to 5-element array of int). The expression &a + 1 yields the address of the next 5-element array of int following a, and also has the type int (*)[5]. The expression *(&a + 1) dereferences the result of &a + 1, such that it yields the address of the first int following the last element of a, and has type int [5], which in this context "decays" to an expression of type int *.

Similarly, the expression a "decays" to a pointer to the first element of the array and has type int *.

A picture may help:

int [5]  int (*)[5]     int      int *

+---+                   +---+
|   | <- &a             |   | <- a
| - |                   +---+
|   |                   |   | <- a + 1
| - |                   +---+
|   |                   |   |
| - |                   +---+
|   |                   |   |
| - |                   +---+
|   |                   |   |
+---+                   +---+
|   | <- &a + 1         |   | <- *(&a + 1)
| - |                   +---+
|   |                   |   |
| - |                   +---+
|   |                   |   |
| - |                   +---+
|   |                   |   |
| - |                   +---+
|   |                   |   |
+---+                   +---+

This is two views of the same storage - on the left, we're viewing it as a sequence of 5-element arrays of int, while on the right, we're viewing it as a sequence of int. I also show the various expressions and their types.

Be aware, the expression *(&a + 1) results in undefined behavior:

If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

C 2011 Online Draft, 6.5.6/8

@Eric Postpischil 2019-05-15 20:25:11

That “shall not be used” text is official: C 2018 6.5.6 8.

@John Bode 2019-05-15 20:36:04

@EricPostpischil: Do you have a link to the 2018 pre-pub draft (similar to N1570.pdf)?

@Eric Postpischil 2019-05-15 20:46:40

@JohnBode: This answer has a link to the Wayback Machine. I checked the official standard in my purchased copy.

@Gizmo 2019-05-16 11:18:08

So if one wrote size = (int*)(&a + 1) - a; this code would be completely valid? :o

@Leushenko 2019-05-17 10:08:52

@Gizmo they probably didn't write it that way originally because then you have to specify the element type; the original was probably written as a macro for type-generic use on different element types.

@Gizmo 2019-05-17 10:32:40

@Leushenko yeah that much I did figure, as C doesn't have templates/different prototypes based on parameter types.

@JL2210 2019-05-15 17:09:40

This line is of most importance:

size = *(&a + 1) - a;

As you can see, it first takes the address of a and adds one to it. Then, it dereferences that pointer and subtracts the original value of a from it.

Pointer arithmetic in C causes this to yield the number of elements in the array, here 5. Adding one to &a gives a pointer to the first byte after a. The code then dereferences that pointer and subtracts a (an array that has decayed to a pointer) from it, giving the number of elements.

Details on how pointer arithmetic works:

Say you have a pointer xyz that points to an int and holds the address 160. When you subtract a number from xyz, C scales that number by the size of the pointed-to type before adjusting the address. For example, xyz - 5 refers to the address xyz - sizeof(*xyz) * 5, where the right-hand side is computed in plain byte arithmetic rather than pointer arithmetic.

As a is an array of 5 int types, the resulting value will be 5. However, this will not work with a pointer, only with an array. If you try this with a pointer, the result will always be 1.

Note that this is undefined behavior, and should not be used under any circumstances. Do not expect the behavior of this to be consistent across all platforms, and do not use it in production programs.

@Eric Postpischil 2019-05-15 20:19:34

The behavior of the expression is not defined by the C standard because *(&a + 1) attempts to dereference an object that does not exist.

@Peter Cordes 2019-05-16 09:50:45

@MartinBonner: furthermore, the difference is critical in this case! This method of course fails if you only have int *a in the first place, not an array. (sizeof doesn't work either in that case, it just gives you the width of a pointer.)

@Gem Taylor 2019-05-15 17:19:41

Hmm, I suspect this is something that would not have worked back in the early days of C. It is clever though.

Taking the steps one at a time:

  • &a gets a pointer to an object of type int[5]
  • +1 gets the next such object assuming there is an array of those
  • * effectively converts that address into type pointer to int
  • -a subtracts the two int pointers, returning the count of int instances between them.

I'm not sure it is completely legal (in this I mean language-lawyer legal - not will it work in practice), given some of the type operations going on. For example you are only "allowed" to subtract two pointers when they point to elements in the same array. *(&a+1) was synthesised by accessing another array, albeit a parent array, so is not actually a pointer into the same array as a. Also, while you are allowed to synthesise a pointer past the last element of an array, and you can treat any object as an array of 1 element, the operation of dereferencing (*) is not "allowed" on this synthesised pointer, even though it has no behaviour in this case!

I suspect that in the early days of C (K&R syntax, anyone?), an array decayed into a pointer much more quickly, so the *(&a+1) might only return the address of the next pointer of type int**. The more rigorous definitions of modern C++ definitely allow the pointer to array type to exist and know the array size, and probably the C standards have followed suit. All C function code only takes pointers as arguments, so the technical visible difference is minimal. But I am only guessing here.

This sort of detailed legality question usually applies to a C interpreter, or a lint type tool, rather than the compiled code. An interpreter might implement a 2D array as an array of pointers to arrays, because there is one less runtime feature to implement, in which case dereferencing the +1 would be fatal, and even if it worked would give the wrong answer.

Another possible weakness may be that the C compiler might align the outer array. Imagine if this was an array of 5 chars (char arr[5]), when the program performs &a+1 it is invoking "array of array" behaviour. The compiler might decide that an array of array of 5 chars (char arr[][5]) is actually generated as an array of array of 8 chars (char arr[][8]), so that the outer array aligns nicely. The code we are discussing would now report the array size as 8, not 5. I'm not saying a particular compiler would definitely do this, but it might.

@JL2210 2019-05-15 17:25:55

This was still valid even before the days of ANSI C.

@Gem Taylor 2019-05-15 17:26:45

Fair enough. However for reasons hard to explain, everyone uses sizeof()/sizeof() ?

@JL2210 2019-05-15 17:28:08

Most people do. For example, sizeof(array)/sizeof(array[0]) gives the number of elements in an array.

@John Bollinger 2019-05-15 18:14:30

It is not valid, @JL2210, because undefined behavior results from evaluating the unary * expression (and only for that reason). The case would be less clear if that operation were replaced with a cast to type int *.

@Kevin 2019-05-15 18:15:16

The C compiler is allowed to align the array, but I'm unconvinced it's allowed to change the type of the array after doing so. Alignment would be more realistically implemented by inserting padding bytes.

@Gem Taylor 2019-05-15 18:19:35

@Kevin I'm not saying it would change the type, but by dereferencing the array, the programmer has invoked the "array of array" behaviour, and the alignment of the elements of that outer array is what I question.

@wrtlprnft 2019-05-15 19:07:23

Can you circumvent the dereferencing issue by using ((char *)(&a + 1) - (char *)a) / ((char *)(a + 1) - (char *)a) or does this just invoke some other kind of undefined behavior?

@Gem Taylor 2019-05-15 19:12:10

@wrtlprnft whrrrrr?????? OK, I can see where you are coming from, but I think you have just written sizeof() :-)

@Eric Postpischil 2019-05-15 20:10:51

Subtracting of pointers is not limited to just two pointers into the same array—the pointers are also allowed to be one past the end of the array. &a+1 is defined. As John Bollinger notes, *(&a+1) is not, since it attempts to dereference an object that does not exist.

@Eric Postpischil 2019-05-15 20:15:20

A compiler cannot implement a char [][5] as char arr[][8]. An array is just the repeated objects in it; there is no padding. Additionally, this would break the (non-normative) example 2 in C 2018 7, which tells us we can compute the number of elements in an array with sizeof array / sizeof array[0].

@JL2210 2019-05-16 11:12:56

@JohnBollinger Correction: This would have still worked even before the days of ANSI C. It's not valid, as it's undefined behavior.

@Gem Taylor 2019-05-16 12:46:59

@EricPostpischil, but does that tell us that sizeof(array[n]) is the same beast as array[][n]? I do know that sizeof()/sizeof() is the traditional way to calculate an array size, and it is interesting that that is written into the standard. It is also interesting that I can create odd-sized structs in C (by setting alignment flags sometimes), and you are saying that arrays of them would not be realigned?
