Transpose or consolidate Dataframe

Time:10-14

I have a tricky situation. I tried pivot and a few other methods but gave up. Any help is appreciated.

For each medicine column, I would like to find the row where the value is 1 and put that row's Date in its place, ending up with one row per patient. After this mapping the 'Date' field is no longer needed, so I am OK with deleting it.

My sample dataset:

import pandas as pd

# Note: 'NaN' here is a plain string, not a real missing value
df1 = pd.DataFrame({'Patient': ['John','John','John','Smith','Smith','Smith'],
                   'Date': [20200101, 20200102, 20200105, 20220101, 20220102, 20220105],
                   'Ibrufen': ['NaN','NaN',1,'NaN','NaN',1],
                   'Tylenol': [1, 'NaN','NaN',1, 'NaN','NaN'],
                   })

My desired output:

df2 = pd.DataFrame({'Patient': ['John','Smith'],
                   'Ibrufen': ['20200105','20220105'],
                   'Tylenol': ['20200101','20220101'],
                   'Steroid': ['20200102','20220102'],
                   })

CodePudding user response:

A possible solution: first create an auxiliary column that holds, for each row, the name of the corresponding medicine (rows with no 1 in either column are labeled 'Steroid'), then pivot on that column:

df1['aux'] = df1.apply(lambda x:
                       'Ibrufen' if (x['Ibrufen'] == 1) else
                       'Tylenol' if (x['Tylenol'] == 1) else
                       'Steroid', axis=1)

(df1.pivot(index='Patient', columns='aux', values='Date')
 .reset_index()
 .rename_axis(None, axis=1))

Output:

  Patient   Ibrufen   Steroid   Tylenol
0    John  20200105  20200102  20200101
1   Smith  20220105  20220102  20220101
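
If there are more medicine columns than two, the chained conditionals above get long. As a rough sketch only (not the approach from the answer), the auxiliary column can also be built with eq and idxmax. The names med_cols and fallback are made up for this illustration, and treating every unflagged row as 'Steroid' is an assumption carried over from the desired output:

```python
import pandas as pd

df1 = pd.DataFrame({
    'Patient': ['John', 'John', 'John', 'Smith', 'Smith', 'Smith'],
    'Date': [20200101, 20200102, 20200105, 20220101, 20220102, 20220105],
    'Ibrufen': ['NaN', 'NaN', 1, 'NaN', 'NaN', 1],
    'Tylenol': [1, 'NaN', 'NaN', 1, 'NaN', 'NaN'],
})

med_cols = ['Ibrufen', 'Tylenol']  # hypothetical name: columns holding the 1 flags
fallback = 'Steroid'               # assumption: label for rows with no flag set

# True where a cell equals 1; the 'NaN' strings compare as False
flags = df1[med_cols].eq(1)

# idxmax picks the first True column in each row; rows with no True
# at all are replaced by the fallback label
aux = flags.idxmax(axis=1).where(flags.any(axis=1), fallback)

df2 = (df1.assign(aux=aux)
          .pivot(index='Patient', columns='aux', values='Date')
          .reset_index()
          .rename_axis(None, axis=1))
```

On the sample data this should give the same table as the output above. Note that pivot raises a ValueError if a patient has the same medicine label on more than one row, so real data with repeated entries would need deduplicating or aggregating first.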