Create duplicate row in Pandas dataframe with a one to many mapping

Time:11-24

Suppose I have the following dataframe:

import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": ["Q1", "Q2"], "C": [5, 6]})
print(df)

A   B   C
1   Q1  5
2   Q2  6

I want to expand the dataframe by replacing each value in the 'B' column with several values, one per row: Q1 should become Jan, Feb and Mar, and Q2 should become Apr, May and Jun. The other columns should be repeated for each new row, so the result looks like this:

A   B   C
1   Jan 5
1   Feb 5
1   Mar 5
2   Apr 6
2   May 6
2   Jun 6

Is it possible to do this in pandas?

CodePudding user response:

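You can map each value in 'B' to its list of replacements with Series.map, then turn every list element into its own row with DataFrame.explode: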

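import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": ["Q1", "Q2"], "C": [5, 6]})

# one-to-many mapping: each quarter expands to its three months
sub = {'Q1': ['Jan', 'Feb', 'Mar'], 'Q2': ['Apr', 'May', 'Jun']}

out = (df.assign(B=df['B'].map(sub))  # replace each quarter with its list of months
         .explode('B')                # one row per list element, other columns repeated
         .reset_index(drop=True)      # optional: renumber the rows from 0
)
print(out)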

Output:

   A    B  C
0  1  Jan  5
1  1  Feb  5
2  1  Mar  5
3  2  Apr  6
4  2  May  6
5  2  Jun  6
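
One thing to watch for with this approach: a value in 'B' that has no entry in the dict is mapped to NaN, and explode keeps it as a single NaN row. The sketch below is not part of the original answer. It adds a hypothetical "Q3" row to show that behaviour, plus one possible fallback that wraps unknown labels in a one-element list so they pass through unchanged. The names plain and kept are only illustrative.

```python
import pandas as pd

# Same frame as above plus a hypothetical "Q3" row with no entry in the mapping.
df = pd.DataFrame({"A": [1, 2, 3], "B": ["Q1", "Q2", "Q3"], "C": [5, 6, 7]})
sub = {"Q1": ["Jan", "Feb", "Mar"], "Q2": ["Apr", "May", "Jun"]}

# Plain dict mapping: the unmapped "Q3" becomes NaN and survives explode as one row.
plain = df.assign(B=df["B"].map(sub)).explode("B").reset_index(drop=True)

# Fallback: wrap unknown labels in a one-element list so they are kept as-is.
kept = (
    df.assign(B=df["B"].map(lambda q: sub.get(q, [q])))
      .explode("B")
      .reset_index(drop=True)
)

print(plain)
print(kept)
```

Whether the fallback is the right choice depends on the data. If an unmapped label indicates a mistake, it may be better to leave the NaN in place and check for it afterwards.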
