By Readonly


2008-09-16 01:43:39 8 Comments

Is there any performance difference between tuples and lists when it comes to instantiation and retrieval of elements?


@Raymond Hettinger 2014-03-03 06:30:39

Summary

Tuples tend to perform better than lists in almost every category:

1) Tuples can be constant folded.

2) Tuples can be reused instead of copied.

3) Tuples are compact and don't over-allocate.

4) Tuples directly reference their elements.

Tuples can be constant folded

Tuples of constants can be precomputed by Python's peephole optimizer or AST optimizer. Lists, on the other hand, get built up from scratch:

    >>> from dis import dis

    >>> dis(compile("(10, 'abc')", '', 'eval'))
      1           0 LOAD_CONST               2 ((10, 'abc'))
                  3 RETURN_VALUE   

    >>> dis(compile("[10, 'abc']", '', 'eval'))
      1           0 LOAD_CONST               0 (10)
                  3 LOAD_CONST               1 ('abc')
                  6 BUILD_LIST               2
                  9 RETURN_VALUE 

Tuples do not need to be copied

Calling tuple(some_tuple) returns the same tuple immediately. Since tuples are immutable, they do not have to be copied:

>>> a = (10, 20, 30)
>>> b = tuple(a)
>>> a is b
True

In contrast, list(some_list) requires all the data to be copied to a new list:

>>> a = [10, 20, 30]
>>> b = list(a)
>>> a is b
False

Tuples do not over-allocate

Since a tuple's size is fixed, it can be stored more compactly than a list, which must over-allocate to keep append() operations efficient.

This gives tuples a nice space advantage:

>>> import sys
>>> sys.getsizeof(tuple(iter(range(10))))
128
>>> sys.getsizeof(list(iter(range(10))))
200

Here is the comment from Objects/listobject.c that explains what lists are doing:

/* This over-allocates proportional to the list size, making room
 * for additional growth.  The over-allocation is mild, but is
 * enough to give linear-time amortized behavior over a long
 * sequence of appends() in the presence of a poorly-performing
 * system realloc().
 * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
 * Note: new_allocated won't overflow because the largest possible value
 *       is PY_SSIZE_T_MAX * (9 / 8) + 6 which always fits in a size_t.
 */
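The growth pattern described in that C comment can be observed from Python. Exact byte counts vary by CPython version and platform, so treat this as a sketch rather than a specification:

```python
import sys

# Append items one at a time and record the list's reported size.
sizes = []
lst = []
for i in range(20):
    lst.append(i)
    sizes.append(sys.getsizeof(lst))

# The size stays flat between reallocations, then jumps -- the
# over-allocation described in the C comment above.
print(sizes)

# A tuple of the same items carries no spare capacity.
print(sys.getsizeof(tuple(lst)), sys.getsizeof(lst))
```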

Tuples refer directly to their elements

References to objects are incorporated directly in a tuple object. In contrast, lists have an extra layer of indirection to an external array of pointers.

This gives tuples a small speed advantage for indexed lookups and unpacking:

$ python3.6 -m timeit -s 'a = (10, 20, 30)' 'a[1]'
10000000 loops, best of 3: 0.0304 usec per loop
$ python3.6 -m timeit -s 'a = [10, 20, 30]' 'a[1]'
10000000 loops, best of 3: 0.0309 usec per loop

$ python3.6 -m timeit -s 'a = (10, 20, 30)' 'x, y, z = a'
10000000 loops, best of 3: 0.0249 usec per loop
$ python3.6 -m timeit -s 'a = [10, 20, 30]' 'x, y, z = a'
10000000 loops, best of 3: 0.0251 usec per loop

Here is how the tuple (10, 20) is stored:

    typedef struct {
        Py_ssize_t ob_refcnt;
        struct _typeobject *ob_type;
        Py_ssize_t ob_size;
        PyObject *ob_item[2];     /* store a pointer to 10 and a pointer to 20 */
    } PyTupleObject;

Here is how the list [10, 20] is stored:

    PyObject arr[2];              /* store a pointer to 10 and a pointer to 20 */

    typedef struct {
        Py_ssize_t ob_refcnt;
        struct _typeobject *ob_type;
        Py_ssize_t ob_size;
        PyObject **ob_item = arr; /* store a pointer to the two-pointer array */
        Py_ssize_t allocated;
    } PyListObject;

Note that the tuple object incorporates the two data pointers directly while the list object has an additional layer of indirection to an external array holding the two data pointers.
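The extra fields in the list struct show up even in the empty-object sizes. These are CPython implementation details, so the exact numbers differ across versions:

```python
import sys

# A list object carries two extra machine words (the ob_item pointer
# and the allocated count), so even an empty list is bigger than an
# empty tuple.
print(sys.getsizeof(()))   # tuple: just the variable-size object header
print(sys.getsizeof([]))   # list: header plus ob_item and allocated
```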

@osman 2015-09-26 07:16:09

Finally, someone shows the C structs!

@DRz 2016-08-05 14:56:22

Internally, tuples are stored a little more efficiently than lists, and also tuples can be accessed slightly faster. How could you explain the results from dF.'s answer then?

@horta 2016-12-21 18:35:44

When working with ~50k lists of ~100 element lists, moving this structure to tuples decreased lookup times by multiple orders of magnitude for multiple lookups. I believe this is due to the tuple's greater cache locality once it is in use, thanks to the removal of the second layer of indirection you demonstrate.

@Divakar 2018-11-20 01:57:54

The main reason tuples are efficient to read is that they are immutable.

Why are immutable objects easy to read?

The reason is that tuples can be kept in the memory cache, unlike lists. The program must always re-read a list from its memory location, because a list is mutable and can change at any time.

@Ralph Corderoy 2008-09-16 11:14:08

You should also consider the array module in the standard library if all the items in your list or tuple are of the same C type. It will take less memory and can be faster.
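A minimal sketch of what the array module buys you (the byte counts printed are illustrative, not exact):

```python
import sys
from array import array

nums = list(range(1000))
packed = array('q', nums)   # signed 64-bit ints stored inline

# The list holds 1000 pointers *plus* 1000 separate int objects;
# the array holds 1000 packed machine values and nothing else.
list_total = sys.getsizeof(nums) + sum(sys.getsizeof(n) for n in nums)
print(sys.getsizeof(packed), 'vs', list_total)

# Indexing works the same way, but each read converts the packed
# value back into a Python int object.
print(packed[10])
```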

@Brian 2008-10-27 16:06:18

It'll take less memory, but access time will probably be a bit slower, rather than faster. Accessing an element requires the packed value to be boxed into a real Python integer object, which slows the process down.

@Dev Aggarwal 2018-08-25 08:06:33

Here is another little benchmark, just for the sake of it..

In [11]: %timeit list(range(100))
749 ns ± 2.41 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [12]: %timeit tuple(range(100))
781 ns ± 3.34 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [1]: %timeit list(range(1_000))
13.5 µs ± 466 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [2]: %timeit tuple(range(1_000))
12.4 µs ± 182 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [7]: %timeit list(range(10_000))
182 µs ± 810 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [8]: %timeit tuple(range(10_000))
188 µs ± 2.38 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [3]: %timeit list(range(1_00_000))
2.76 ms ± 30.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [4]: %timeit tuple(range(1_00_000))
2.74 ms ± 31.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [10]: %timeit list(range(10_00_000))
28.1 ms ± 266 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [9]: %timeit tuple(range(10_00_000))
28.5 ms ± 447 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Let's average these out:

In [3]: l = np.array([749 * 10 ** -9, 13.5 * 10 ** -6, 182 * 10 ** -6, 2.76 * 10 ** -3, 28.1 * 10 ** -3])

In [2]: t = np.array([781 * 10 ** -9, 12.4 * 10 ** -6, 188 * 10 ** -6, 2.74 * 10 ** -3, 28.5 * 10 ** -3])

In [11]: np.average(l)
Out[11]: 0.0062112498000000006

In [12]: np.average(t)
Out[12]: 0.0062882362

In [17]: np.average(t) / np.average(l)  * 100
Out[17]: 101.23946713590554

You can call it almost inconclusive.

But sure, tuples took 101.239% of the time, i.e. 1.239% extra time to do the job compared to lists.

@Mark Harrison 2008-09-16 02:13:29

The dis module disassembles the byte code for a function and is useful to see the difference between tuples and lists.

In this case, you can see that accessing an element generates identical code, but that constructing a tuple is much faster than constructing a list.

>>> def a():
...     x=[1,2,3,4,5]
...     y=x[2]
...
>>> def b():
...     x=(1,2,3,4,5)
...     y=x[2]
...
>>> import dis
>>> dis.dis(a)
  2           0 LOAD_CONST               1 (1)
              3 LOAD_CONST               2 (2)
              6 LOAD_CONST               3 (3)
              9 LOAD_CONST               4 (4)
             12 LOAD_CONST               5 (5)
             15 BUILD_LIST               5
             18 STORE_FAST               0 (x)

  3          21 LOAD_FAST                0 (x)
             24 LOAD_CONST               2 (2)
             27 BINARY_SUBSCR
             28 STORE_FAST               1 (y)
             31 LOAD_CONST               0 (None)
             34 RETURN_VALUE
>>> dis.dis(b)
  2           0 LOAD_CONST               6 ((1, 2, 3, 4, 5))
              3 STORE_FAST               0 (x)

  3           6 LOAD_FAST                0 (x)
              9 LOAD_CONST               2 (2)
             12 BINARY_SUBSCR
             13 STORE_FAST               1 (y)
             16 LOAD_CONST               0 (None)
             19 RETURN_VALUE

@mzz 2010-01-31 15:21:35

Err, the fact that the same bytecode is generated absolutely does not mean the same operations happen at the C (and therefore CPU) level. Try creating a class ListLike with a __getitem__ that does something horribly slow, then disassemble x = ListLike((1, 2, 3, 4, 5)); y = x[2]. The bytecode will be more like the tuple example above than the list example, but do you really believe that means performance will be similar?
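This thought experiment is easy to make concrete; ListLike and its busywork are invented purely for illustration:

```python
import dis

class ListLike:
    """A sequence whose __getitem__ does something horribly slow."""
    def __init__(self, items):
        self._items = items

    def __getitem__(self, index):
        sum(range(100_000))          # artificial busywork on every access
        return self._items[index]

def f():
    x = ListLike((1, 2, 3, 4, 5))
    y = x[2]
    return y

dis.dis(f)    # the subscript opcode looks just like a tuple lookup...
print(f())    # ...but each access pays for the busywork above
```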

@Mark Harrison 2010-02-02 10:58:51

It seems you're saying that some types are more efficient than others. That makes sense, but the overhead of list and tuple generation seems to be orthogonal to the data type involved, with the caveat that they are lists and tuples of the same data type.

@martineau 2012-06-12 20:20:40

The number of bytecodes, like the number of lines of code, bears little relationship to execution speed (and therefore to efficiency and performance).

@poolie 2013-05-14 22:32:11

Although the suggestion you can conclude anything from counting ops is misguided, this does show the key difference: constant tuples are stored as such in the bytecode and just referenced when used, whereas lists need to be built at runtime.

@Tom 2015-08-13 14:40:49

This answer shows us that Python acknowledges tuple constants. That's good to know! But what happens when trying to build a tuple or a list from variable values?

@dF. 2008-09-16 01:57:10

In general, you might expect tuples to be slightly faster. However, you should definitely test your specific case (if the difference might impact the performance of your program -- remember that "premature optimization is the root of all evil").

Python makes this very easy: timeit is your friend.

$ python -m timeit "x=(1,2,3,4,5,6,7,8)"
10000000 loops, best of 3: 0.0388 usec per loop

$ python -m timeit "x=[1,2,3,4,5,6,7,8]"
1000000 loops, best of 3: 0.363 usec per loop

and...

$ python -m timeit -s "x=(1,2,3,4,5,6,7,8)" "y=x[3]"
10000000 loops, best of 3: 0.0938 usec per loop

$ python -m timeit -s "x=[1,2,3,4,5,6,7,8]" "y=x[3]"
10000000 loops, best of 3: 0.0649 usec per loop

So in this case, instantiation is almost an order of magnitude faster for the tuple, but item access is actually somewhat faster for the list! So if you're creating a few tuples and accessing them many many times, it may actually be faster to use lists instead.

Of course, if you want to change an item, the list will definitely be faster, since with a tuple you'd need to create an entirely new one to change a single item (tuples being immutable).
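That last point can be sketched directly:

```python
# Changing one item of a list is an in-place O(1) store.
lst = [10, 20, 30]
lst[1] = 99
print(lst)    # [10, 99, 30]

# A tuple cannot be changed, so "replacing" an item means
# building an entirely new tuple, e.g. from slices.
tup = (10, 20, 30)
tup = tup[:1] + (99,) + tup[2:]
print(tup)    # (10, 99, 30)
```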

@Matt Joiner 2011-02-20 15:17:54

What version of Python were your tests run with?

@dF. 2011-02-21 15:21:07

@Matt Not sure, but just re-ran them under 2.6.6 (on mac) and get basically the same results.

@Hamish Grubijan 2012-11-15 01:10:02

There is another interesting test - python -m timeit "x=tuple(xrange(999999))" vs python -m timeit "x=list(xrange(999999))". As one might expect, it takes a bit longer to materialize a tuple than a list.

@gerrit 2013-02-01 15:01:55

@HamishGrubijan Why might one expect that?

@Hamish Grubijan 2013-02-01 18:22:47

@gerrit, a tuple is of fixed size while a list is not. A tuple takes less space in memory than a list stackoverflow.com/questions/449560/…. Therefore, when constructing a tuple from a generator of an unknown size, there must be an extra effort of releasing unused buffer space, whereas a list would not need to do so. That is what I guessed.
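One way to probe this guess with the timeit module rather than the command line (results are machine- and version-dependent, so no conclusion is asserted here):

```python
import timeit

# Generator expressions report no length, so tuple() must grow a
# buffer and trim it at the end; list() just over-allocates as usual.
t = timeit.timeit('tuple(x for x in range(10_000))', number=200)
l = timeit.timeit('list(x for x in range(10_000))', number=200)
print(f'tuple: {t:.4f}s  list: {l:.4f}s')
```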

@Jamie Pate 2013-02-05 18:39:45

on the other hand, the same timeit for a very short list (xrange(10)) is slightly faster with tuples, (of course, these are measured in the usec range, so unless you are doing a lot of them, it probably won't matter)

@shx2 2013-11-01 13:19:31

@HamishGrubijan perhaps xrange is not a good example, because the size is known (try len(xrange(999999))). However, I see your point with a general generator.

@ToolmakerSteve 2013-12-15 19:57:39

Seems bizarre that tuple access is slower than list access. However, trying that in Python 2.7 on my Windows 7 PC, the difference is only 10%, so unimportant.

@Raymond Hettinger 2014-05-03 22:54:27

FWIW, list access is faster than tuple access in Python 2, but only because there is a special case for lists in BINARY_SUBSCR in Python/ceval.c. In Python 3, that optimization is gone, and tuple access becomes slightly faster than list access.

@yoopoo 2014-07-07 08:38:40

@dF. Why is the first list timing (0.363 usec) 5 times slower than the second (0.0649 usec)? I cannot reproduce that.

@leewz 2016-04-15 04:23:41

The first test is probably wrong. You are assigning a tuple of constants, which is a constant, so the compiler creates the tuple as a code constant instead of generating code to create it.

@leewz 2016-04-15 04:26:13

@yoopoo, the first test creates a list a million times, but the second creates a list once and accesses it a million times. The -s "SETUP_CODE" is run before the actual timed code.

@tzot 2008-09-16 10:16:52

Tuples, being immutable, are more memory efficient; lists, for efficiency, over-allocate memory to allow appends without constant reallocs. So, if you want to iterate through a constant sequence of values in your code (e.g. for direction in 'up', 'right', 'down', 'left':), tuples are preferred, since such tuples are pre-calculated at compile time.

Access speeds should be the same (both are stored as contiguous arrays in memory).

But alist.append(item) is much preferred to atuple += (item,) when you deal with mutable data. Remember, tuples are intended to be treated as records without field names.
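A small illustration of the record idea, using the standard library's namedtuple for the case where the fields deserve names:

```python
from collections import namedtuple

# A plain tuple as an anonymous record...
point = (3, 4)
x, y = point

# ...and a namedtuple when the fields deserve names.
Point = namedtuple('Point', ['x', 'y'])
p = Point(3, 4)
print(p.x, p.y)      # 3 4
print(p == point)    # True -- still a tuple underneath
```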

@balki 2012-07-21 16:06:19

what is compile time in python?

@tzot 2012-07-23 07:57:33

@balki: the time when python source is compiled to bytecode (which bytecode might be saved as a .py[co] file).

@Grijesh Chauhan 2014-09-09 07:38:15

A citation would be great if possible.

@ctcherry 2008-09-16 01:45:39

Tuples should be slightly more efficient, and because of that faster, than lists, because they are immutable.

@Blair Conrad 2008-09-16 01:47:26

Why do you say that immutability, in and of itself, increases efficiency? Especially efficiency of instantiation and retrieval?

@ctcherry 2008-09-16 06:47:06

It seems Mark's reply above mine has covered the disassembled instructions of what happens inside Python. You can see that instantiation takes fewer instructions; however, in this case, retrieval is apparently identical.
