By Amelio Vazquez-Reina

2013-07-31 02:16:42 8 Comments

Say I have a dataframe

import pandas as pd
import numpy as np
foo = pd.DataFrame(np.random.random((10,5)))

and I create another dataframe from a subset of my data:

bar = foo.iloc[3:5,1:4]

does bar hold a copy of those elements from foo? Is there any way to create a view of that data instead? If so, what would happen if I try to modify data in this view? Does Pandas provide any sort of copy-on-write mechanism?


@davidshinn 2013-07-31 04:12:35

Your answer lies in the pandas docs: returning-a-view-versus-a-copy.

Whenever an array of labels or a boolean vector are involved in the indexing operation, the result will be a copy. With single label / scalar indexing and slicing, e.g. df.ix[3:6] or df.ix[:, 'A'], a view will be returned.

In your example, bar is a view of slices of foo. If you wanted a copy, you could have used the copy method. Modifying bar also modifies foo. pandas does not appear to have a copy-on-write mechanism.

See my code example below to illustrate:

In [1]: import pandas as pd
   ...: import numpy as np
   ...: foo = pd.DataFrame(np.random.random((10,5)))

In [2]: pd.__version__
Out[2]: ''

In [3]: np.__version__
Out[3]: '1.7.1'

In [4]: # DataFrame has copy method
   ...: foo_copy = foo.copy()

In [5]: bar = foo.iloc[3:5,1:4]

In [6]: bar == foo.iloc[3:5,1:4] == foo_copy.iloc[3:5,1:4]
      1     2     3
3  True  True  True
4  True  True  True

In [7]: # Changing the view
   ...: bar.ix[3,1] = 5

In [8]: # View and DataFrame still equal
   ...: bar == foo.iloc[3:5,1:4]
      1     2     3
3  True  True  True
4  True  True  True

In [9]: # It is now different from a copy of original
   ...: bar == foo_copy.iloc[3:5,1:4]
       1     2     3
3  False  True  True
4   True  True  True

@Lisa 2017-07-11 23:26:37

so when I do bar.loc[:, ['a', 'b']] it returns a copy, but when I do bar.loc[:, 'a'] it returns a view?

@davidshinn 2017-07-11 23:55:29

The bar.loc[:, 'a'] acts like a slice, which returns a view, vs bar.loc[:, ['a', 'b']], which uses list indexing which returns a copy. Note that bar.loc[:, ['a']] would also return a copy.

@Lisa 2017-07-12 00:27:11

how about bar['a']? is it a view or a copy?

@Pietro Marchesi 2017-10-24 09:04:45

@davidshinn Is the highlighted quote still in the docs you linked? I can't find it!

@davidshinn 2017-10-30 21:17:59

It has been revised since the original response (the quote is in version 0.13):…

@johnDanger 2020-02-28 23:56:28

The top link no longer directs to a page with this information.

Related Questions

Sponsored Content

23 Answered Questions

[SOLVED] Adding new column to existing DataFrame in Python pandas

15 Answered Questions

[SOLVED] Delete column from pandas DataFrame

21 Answered Questions

[SOLVED] How to iterate over rows in a DataFrame in Pandas?

17 Answered Questions

[SOLVED] Selecting multiple columns in a pandas dataframe

26 Answered Questions

[SOLVED] Renaming columns in pandas

19 Answered Questions

[SOLVED] Get list from pandas DataFrame column headers

16 Answered Questions

[SOLVED] "Large data" work flows using pandas

18 Answered Questions

[SOLVED] How to clone or copy a list?

10 Answered Questions

[SOLVED] How to select rows from a DataFrame based on column values?

Sponsored Content