By Owen


2014-05-02 19:07:56 8 Comments

I feel like there is a better way than this:

import pandas as pd
df = pd.DataFrame(
    [['A', 'X', 3], ['A', 'X', 5], ['A', 'Y', 7], ['A', 'Y', 1],
     ['B', 'X', 3], ['B', 'X', 1], ['B', 'X', 3], ['B', 'Y', 1],
     ['C', 'X', 7], ['C', 'Y', 4], ['C', 'Y', 1], ['C', 'Y', 6]],
    columns=['c1', 'c2', 'v1'])
def callback(x):
    # Number the rows of each group from 1 up to the group size.
    x['seq'] = range(1, x.shape[0] + 1)
    return x
df = df.groupby(['c1', 'c2']).apply(callback)
print(df)

To achieve this:

   c1 c2  v1  seq
0   A  X   3    1
1   A  X   5    2
2   A  Y   7    1
3   A  Y   1    2
4   B  X   3    1
5   B  X   1    2
6   B  X   3    3
7   B  Y   1    1
8   C  X   7    1
9   C  Y   4    1
10  C  Y   1    2
11  C  Y   6    3

Is there a way to do it that avoids the callback?

2 comments

@Md Johirul Islam 2018-05-03 19:11:13

The full working code, using cumcount():

import pandas as pd
df = pd.DataFrame(
    [['A', 'X', 3], ['A', 'X', 5], ['A', 'Y', 7], ['A', 'Y', 1],
     ['B', 'X', 3], ['B', 'X', 1], ['B', 'X', 3], ['B', 'Y', 1],
     ['C', 'X', 7], ['C', 'Y', 4], ['C', 'Y', 1], ['C', 'Y', 6]],
    columns=['c1', 'c2', 'v1'])

df['seq'] = df.groupby(['c1', 'c2']).cumcount() + 1
print(df)

Output

   c1 c2  v1  seq
0   A  X   3    1
1   A  X   5    2
2   A  Y   7    1
3   A  Y   1    2
4   B  X   3    1
5   B  X   1    2
6   B  X   3    3
7   B  Y   1    1
8   C  X   7    1
9   C  Y   4    1
10  C  Y   1    2
11  C  Y   6    3

@Jeff 2014-05-02 19:11:17

Use cumcount(); see the pandas documentation for GroupBy.cumcount.

In [4]: df.groupby(['c1', 'c2']).cumcount()
Out[4]: 
0     0
1     1
2     0
3     1
4     0
5     1
6     2
7     0
8     0
9     0
10    1
11    2
dtype: int64

If you want the numbering to start at 1, add 1 to the result:

In [5]: df.groupby(['c1', 'c2']).cumcount()+1
Out[5]: 
0     1
1     2
2     1
3     2
4     1
5     2
6     3
7     1
8     1
9     1
10    2
11    3
dtype: int64
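
As a quick sanity check, here is a minimal sketch that wraps the cumcount() approach in a small helper and applies it to the question's data. The name add_seq and its parameters are hypothetical, chosen only for illustration; they are not part of pandas. It also shows the ascending=False option of cumcount(), which counts down to 0 within each group instead of up from 0.

```python
import pandas as pd


def add_seq(frame, keys, name="seq"):
    """Return a copy of frame with a 1-based per-group row counter.

    add_seq is a hypothetical helper for illustration, not a pandas API.
    """
    out = frame.copy()
    # cumcount() numbers rows 0..n-1 within each group, in existing row order.
    out[name] = out.groupby(keys).cumcount() + 1
    return out


df = pd.DataFrame(
    [['A', 'X', 3], ['A', 'X', 5], ['A', 'Y', 7], ['A', 'Y', 1],
     ['B', 'X', 3], ['B', 'X', 1], ['B', 'X', 3], ['B', 'Y', 1],
     ['C', 'X', 7], ['C', 'Y', 4], ['C', 'Y', 1], ['C', 'Y', 6]],
    columns=['c1', 'c2', 'v1'])

# 1-based sequence per (c1, c2) group; df itself is left unchanged.
result = add_seq(df, ['c1', 'c2'])

# ascending=False reverses the direction: n-1 down to 0 within each group.
countdown = df.groupby(['c1', 'c2']).cumcount(ascending=False)

print(result)
print(countdown)
```

Because the helper works on a copy, the original frame keeps its three columns, which can be convenient when the counter is only needed for an intermediate step.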
