#### [SOLVED] Why are Standard iterator ranges [begin, end) instead of [begin, end]?

By Puppy

Why does the Standard define `end()` as one past the end, instead of at the actual end?

#### @Kerrek SB 2012-04-01 09:45:23

The best argument easily is the one made by Dijkstra himself:

• You want the size of the range to be a simple difference end − begin;

• including the lower bound is more "natural" when sequences degenerate to empty ones, and also because the alternative (excluding the lower bound) would require the existence of a "one-before-the-beginning" sentinel value.

You still need to justify why you start counting at zero rather than one, but that wasn't part of your question.

The wisdom behind the [begin, end) convention pays off time and again when you have any sort of algorithm that deals with multiple nested or iterated calls to range-based constructions, which chain naturally. By contrast, using a doubly-closed range would incur off-by-ones and extremely unpleasant and noisy code. For example, consider a partition [n0, n1)[n1, n2)[n2,n3). Another example is the standard iteration loop `for (it = begin; it != end; ++it)`, which runs `end - begin` times. The corresponding code would be much less readable if both ends were inclusive – and imagine how you'd handle empty ranges.

Finally, we can also make a nice argument why counting should start at zero: With the half-open convention for ranges that we just established, if you are given a range of N elements (say to enumerate the members of an array), then 0 is the natural "beginning" so that you can write the range as [0, N), without any awkward offsets or corrections.

In a nutshell: the fact that we don't see the number `1` everywhere in range-based algorithms is a direct consequence of, and motivation for, the [begin, end) convention.

#### @Krazy Glew 2012-04-05 16:31:31

The typical C for loop iterating over an array of size N is "for(i=0;i<N;i++) a[i]=0;". Now, you can't express that directly with iterators - many folks wasted time trying to make < meaningful. But it is almost equally obvious to say "for(i=0;i!=N;i++)..." Mapping 0 to begin and N to end is therefore convenient.

#### @Kerrek SB 2012-04-05 20:20:38

@KrazyGlew: I didn't put types in my loop example deliberately. If you think of `begin` and `end` as `int`s with values `0` and `N`, respectively, it fits perfectly. Arguably, it's the `!=` condition that's more natural than the traditional `<`, but we never discovered that until we started thinking about more general collections.

#### @Krazy Glew 2012-04-06 21:39:35

@KerrekSB: I agree that "we never dscovered that [!= is better] until we started thinking about more general collections." IMHO that is one of the things Stepanov deserves credit for - speaking as someone who tried to write such template libraries before the STL. However, I'll argue about "!=" being more natural - or, rather, I'll argue that != has probably introduced bugs, that < would catch. Think for(i=0;i!=100;i+=3)...

#### @Kerrek SB 2012-04-08 11:00:09

@KrazyGlew: Your last point is somewhat off-topic, since the sequence {0, 3, 6, ..., 99} is not of the form that the OP asked about. If you wanted it to be thus, you should write a `++`-incrementable iterator template `step_by<3>`, which would then have the originally advertised semantics.

#### @Phil1970 2017-05-12 03:55:59

@KrazyGlew Even if < would sometime hide a bug, it is a bug anyway. If someone use `!=` when he should use `<`, then it is a bug. By the way, that king of error is easy to find with unit testing or assertions.

#### @celtschk 2012-04-02 09:18:56

Actually, a lot of iterator related stuff suddenly makes much more sense if you consider the iterators not pointing at the elements of the sequence but in between, with dereferencing accessing the next element right to it. Then the "one past end" iterator suddenly makes immediate sense:

``````   +---+---+---+---+
| A | B | C | D |
+---+---+---+---+
^               ^
|               |
begin            end
``````

Obviously `begin` points to the beginning of the sequence, and `end` points to the end of the same sequence. Dereferencing `begin` accesses the element `A`, and dereferencing `end` makes no sense because there's no element right to it. Also, adding an iterator `i` in the middle gives

``````   +---+---+---+---+
| A | B | C | D |
+---+---+---+---+
^       ^       ^
|       |       |
begin     i      end
``````

and you immediately see that the range of elements from `begin` to `i` contains the elements `A` and `B` while the range of elements from `i` to `end` contains the elements `C` and `D`. Dereferencing `i` gives the element right of it, that is the first element of the second sequence.

Even the "off-by-one" for reverse iterators suddenly becomes obvious that way: Reversing that sequence gives:

``````   +---+---+---+---+
| D | C | B | A |
+---+---+---+---+
^       ^       ^
|       |       |
rbegin     ri     rend
(end)    (i)   (begin)
``````

I've written the corresponding non-reverse (base) iterators in parentheses below. You see, the reverse iterator belonging to `i` (which I've named `ri`) still points in between elements `B` and `C`. However due to reversing the sequence, now element `B` is on the right to it.

#### @supercat 2015-05-22 18:01:55

This is IMHO the best answer, though I think it might be better illustrated if the iterators pointed at numbers, and the elements were between the numbers (the syntax `foo[i]`) is a shorthand for the item immediately after position `i`). Thinking about it, I wonder if it might be useful for a language to have separate operators for "item immediately after position i" and "item immediately before position i", since a lot of algorithms work with pairs of adjacent items, and saying "The items on either side of position i" may be cleaner than "The items at positions i and i+1".

#### @celtschk 2015-09-17 17:00:57

@supercat: The numbers were not supposed to indicate iterator positions/indices, but to indicate the elements themselves. I'll replace the numbers with letters to make that clearer. Indeed, with the numbers as given, `begin[0]` (assuming a random access iterator) would access element `1`, since there's no element `0` in my example sequence.

#### @user1741137 2019-10-01 11:42:23

Why is the word "begin" used rather than "start"? After all, "begin" is a verb.

#### @Fareanor 2020-06-24 12:50:47

@user1741137 I think "begin" is meant to be the abbreviation of "beginning" (which now makes sense). "beginning" being too long, "begin" sounds like a nice fit. "start" would be conflicting with the verb "start" (for example when you have to define a function `start()` in your class for starting a specific process or whatever, it would be annoying if it conflicts with an already existing one).

#### @Andreas DM 2014-11-27 15:34:48

1. If a container is empty, `begin() == end()`.
2. C++ Programmers tend to use `!=` instead of `<` (less than) in loop conditions, therefore having `end()` pointing to a position one off-the-end is convenient.

#### @Anders Abel 2012-04-01 09:47:12

With the `end()` pointing one past the end, it is easy to iterate a collection with a for loop:

``````for (iterator it = collection.begin(); it != collection.end(); it++)
{
DoStuff(*it);
}
``````

With `end()` pointing to the last element, a loop would be more complex:

``````iterator it = collection.begin();
while (!collection.empty())
{
DoStuff(*it);

if (it == collection.end())
break;

it++;
}
``````

#### @user541686 2012-04-01 09:44:58

Because then

``````size() == end() - begin()   // For iterators for whom subtraction is valid
``````

and you won't have to do awkward things like

``````// Never mind that this is INVALID for input iterators...
bool empty() { return begin() == end() + 1; }
``````

and you won't accidentally write erroneous code like

``````bool empty() { return begin() == end() - 1; }    // a typo from the first version
// of this post
// (see, it really is confusing)

bool empty() { return end() - begin() == -1; }   // Signed/unsigned mismatch
// Plus the fact that subtracting is also invalid for many iterators
``````

Also: What would `find()` return if `end()` pointed to a valid element?
Do you really want another member called `invalid()` which returns an invalid iterator?!
Two iterators is already painful enough...

Oh, and see this related post.

## Also:

If the `end` was before the last element, how would you `insert()` at the true end?!

#### @underscore_d 2018-09-19 19:51:37

This is a highly underrated answer. The examples are concise and straight to the point, and the "Also"s were not said by anyone else and are the kind of things that seem very obvious in retrospect but hit me like revelations.

#### @user541686 2018-09-19 19:52:29

@underscore_d: Thank you!! :)

#### @underscore_d 2018-09-19 19:55:04

btw, in case I seem like a hypocrite for not upvoting, that's because I already did way back in July 2016!

#### @user541686 2018-09-19 20:15:04

@underscore_d: hahaha I didn't even notice, but thanks! :)

#### @Alok Save 2012-04-01 09:42:30

Why does the Standard define `end()` as one past the end, instead of at the actual end?

Because:

1. It avoids special handling for empty ranges. For empty ranges, `begin()` is equal to `end()` &
2. It makes the end criterion simple for loops that iterate over the elements: The loops simply continue as long as `end()` is not reached.

#### @Ken Bloom 2012-04-01 11:37:51

The iterator idiom of half-closed ranges `[begin(), end())` is originally based on pointer arithmetic for plain arrays. In that mode of operation, you would have functions that were passed an array and a size.

``````void func(int* array, size_t size)
``````

Converting to half-closed ranges `[begin, end)` is very simple when you have that information:

``````int* begin;
int* end = array + size;

for (int* it = begin; it < end; ++it) { ... }
``````

To work with fully-closed ranges, it's harder:

``````int* begin;
int* end = array + size - 1;

for (int* it = begin; it <= end; ++it) { ... }
``````

Since pointers to arrays are iterators in C++ (and the syntax was designed to allow this), it's much easier to call `std::find(array, array + size, some_value)` than it is to call `std::find(array, array + size - 1, some_value)`.

Plus, if you work with half-closed ranges, you can use the `!=` operator to check for the end condition, becuase (if your operators are defined correctly) `<` implies `!=`.

``````for (int* it = begin; it != end; ++ it) { ... }
``````

However there's no easy way to do this with fully-closed ranges. You're stuck with `<=`.

The only kind of iterator that supports `<` and `>` operations in C++ are random-access iterators. If you had to write a `<=` operator for every iterator class in C++, you'd have to make all of your iterators fully comparable, and you'd fewer choices for creating less capable iterators (such as the bidirectional iterators on `std::list`, or the input iterators that operate on `iostreams`) if C++ used fully-closed ranges.