Home > Software design >  How can I map multiple row entries in df1 (name, category, amount) to one row entry per name under t
How can I map multiple row entries in df1 (name, category, amount) to one row entry per name under t

Time:02-01

A first dataframe has a column containing categories which are the same as the headers of the second. There are multiple row entries for one name in df1. df2 will have 1 row entry per name. df1 has 1 row entry per category per name. All rows for one name occur in sequence in df1. As shown:

enter image description here

The headers of df2 are as follows:

enter image description here

And the desired output is below:

enter image description here

How can I map data from the df1 to df2?

More specifically, how can I map multiple rows from df1 to 1 row and the respective columns of df2 in a more efficient way than looping twice to check for each category under each name?

Any help is appreciated, Have a great day

Code:

import pandas as pds

df1 = pds.DataFrame({'Client': ['Rick', 'Rick', 'John'], 'Category': ['Service1', 'Service2', 'Service1'], 'Amount': [250, 6, 79]})


df2 = pds.DataFrame(columns = ['Client', 'Due_Date', 'Service1', 'Service2'])

output = pds.DataFrame({'Client': ['Rick', 'John'], 'Due_Date': [None,None] , 'Service1': [250, 79], 'Service2': [6, 0]})

CodePudding user response:

This is an alternative approach using .pivot() and .assign()

df1_pivot = (df1.pivot(index='Client', columns='Category', values='Amount')
                .reset_index().assign(Due_Date=None))
df_out = df2.assign(**df1_pivot)
print(df_out)

  Client Due_Date  Service1  Service2
0   John     None      79.0       NaN
1   Rick     None     250.0       6.0

CodePudding user response:

You're looking for pandas.DataFrame.pivot :

out = (df1.pivot(index="Client", columns="Category")
           .reset_index()
           .set_axis(["Client", "Service1", "Service2"], axis=1)
           .assign(Due_Date= None)
      )

NB : I suggest you to use import pandas as pd as per the import convention .

Output : ​

print(out)

  Client  Service1  Service2 Due_Date
0   John      79.0       NaN     None
1   Rick     250.0       6.0     None
  • Related