By Visualisation App


2019-08-13 20:00:58 8 Comments

How can I repeat the values in my dataframe n times, adding a new column whose value changes with each repetition?

I managed to repeat the values n times, but I couldn't figure out how to add the new column. Here's my initial dataframe of randomly generated temperatures:

df1 = 
    temp
0   30
1   40
2   50
3   60

And I could replicate it n times using the following code:

df2 = pd.DataFrame(np.repeat(df1.values, 2, axis=0))
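One thing worth checking here: np.repeat with axis=0 duplicates each row in place, while the expected output in this question repeats the whole block of rows. A minimal sketch of the difference, assuming df1 holds the four temperatures shown above (the names repeated and tiled are only illustrative):

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"temp": [30, 40, 50, 60]})

# np.repeat duplicates each row in place: 30, 30, 40, 40, ...
repeated = pd.DataFrame(np.repeat(df1.values, 2, axis=0), columns=df1.columns)

# np.tile repeats the whole block: 30, 40, 50, 60, 30, 40, 50, 60
tiled = pd.DataFrame(np.tile(df1.values, (2, 1)), columns=df1.columns)

print(repeated["temp"].tolist())
print(tiled["temp"].tolist())
```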

Now I want the new dataframe to have a column called city, with each repetition taking a different value from the following list:

cities = ['Bangalore', 'Hyderabad']  # number of cities is the same as n

Expected output:
df2 = 
    temp city
0   30   Bangalore
1   40   Bangalore
2   50   Bangalore
3   60   Bangalore
4   30   Hyderabad
5   40   Hyderabad
6   50   Hyderabad
7   60   Hyderabad

How can I get this?

3 comments

@Pierre V. 2019-08-13 20:20:54

Using numpy.tile and numpy.repeat:

import pandas as pd
import numpy as np

temps = [30, 40, 50, 60]
cities = ['Bangalore', 'Hyderabad']

temp = np.tile(temps, len(cities))
city = np.repeat(cities, len(temps))
df = pd.DataFrame({"temp": temp, "city": city})

Output:

   temp       city
0    30  Bangalore
1    40  Bangalore
2    50  Bangalore
3    60  Bangalore
4    30  Hyderabad
5    40  Hyderabad
6    50  Hyderabad
7    60  Hyderabad

@Visualisation App 2019-08-13 20:22:33

This is neat, but it requires my temp values to be in a list.

@political scientist 2019-08-13 20:24:53

I like your numpy approach, that's nice!
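On the point about needing a list: the same tile/repeat idea seems to work directly on an existing column, since np.tile accepts an array. A small sketch, assuming the dataframe is the df1 from the question:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"temp": [30, 40, 50, 60]})
cities = ["Bangalore", "Hyderabad"]

df2 = pd.DataFrame({
    # repeat the whole temp column once per city
    "temp": np.tile(df1["temp"].to_numpy(), len(cities)),
    # repeat each city name once per original row
    "city": np.repeat(cities, len(df1)),
})
print(df2)
```

This only carries over the temp column; any other columns in df1 would need the same np.tile treatment.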

@political scientist 2019-08-13 20:15:35

Using pandas.MultiIndex.from_product:

pd.MultiIndex.from_product([df1['temp'], cities], names=['temp', 'city']) \
    .to_frame(index=False) \
    .sort_values('city')

Output:

   temp       city
0    30  Bangalore
2    40  Bangalore
4    50  Bangalore
6    60  Bangalore
1    30  Hyderabad
3    40  Hyderabad
5    50  Hyderabad
7    60  Hyderabad
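A possible variation, not from the answer itself: if cities is passed first to from_product, the rows already come out grouped by city, so the sort can be dropped and the index runs 0 to 7. A sketch assuming the df1 from the question:

```python
import pandas as pd

df1 = pd.DataFrame({"temp": [30, 40, 50, 60]})
cities = ["Bangalore", "Hyderabad"]

df2 = (
    # the first level varies slowest, so each city gets a full block of temps
    pd.MultiIndex.from_product([cities, df1["temp"]], names=["city", "temp"])
    .to_frame(index=False)[["temp", "city"]]  # restore the temp, city column order
)
print(df2)
```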

@Erfan 2019-08-13 20:10:47

Using DataFrame.assign & pd.concat:

We loop over each city in your cities list and assign it as a new column on a copy of df1. Then we use concat to combine the separate dataframes into one final dataframe; ignore_index=True renumbers the rows from 0.

final = pd.concat([df1.assign(city=c) for c in cities], ignore_index=True)

Output:

   temp       city
0    30  Bangalore
1    40  Bangalore
2    50  Bangalore
3    60  Bangalore
4    30  Hyderabad
5    40  Hyderabad
6    50  Hyderabad
7    60  Hyderabad
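Because assign copies the whole frame, this approach should carry along any extra columns as well. A rough sketch wrapped in a helper; the function name repeat_with_labels and the humidity column are hypothetical, added only for illustration:

```python
import pandas as pd

def repeat_with_labels(df, column, labels):
    """Stack one copy of df per label, tagging each copy in a new column."""
    # assign returns a new frame, so the input df is left unchanged
    return pd.concat(
        [df.assign(**{column: label}) for label in labels],
        ignore_index=True,
    )

# humidity values are made up for the example
df1 = pd.DataFrame({"temp": [30, 40, 50, 60], "humidity": [70, 65, 60, 55]})
cities = ["Bangalore", "Hyderabad"]

final = repeat_with_labels(df1, "city", cities)
print(final)
```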
