Home > Software design >  How to Transpose a table with different columns using python
How to Transpose a table with different columns using python

Time:04-07

Here is the initial table as shown below.

Cust ID Jan Transaction Fee Jan Transaction Fee Jan Product Fee Jan Product Fee Feb Transaction Fee Feb Transaction Fee Feb Product Fee Feb Product Fee
HKD USD HKD USD HKD USD HKD USD
100103 100 20 21 24 215 55 253 25
100104 200 30 31 34 315 65 353 35

I would like to convert the table from above to the below expected result.

Cust ID Period Type FX Price
100103 202201 Transaction Fee HKD 100
100103 202201 Transaction Fee USD 20
100103 202201 Product Fee HKD 21
100103 202201 Product Fee USD 24
100103 202202 Transaction Fee HKD 215
100103 202202 Transaction Fee USD 55
100103 202202 Product Fee HKD 253
100103 202202 Product Fee USD 25
100104 202201 Transaction Fee HKD 200
100104 202201 Transaction Fee USD 30
100104 202201 Product Fee HKD 31
100104 202201 Product Fee USD 34
100104 202202 Transaction Fee HKD 315
100104 202202 Transaction Fee USD 65
100104 202202 Product Fee HKD 353
100104 202202 Product Fee USD 35

My coding on import data is below

import pandas as pd 


test=pd.DataFrame({'Cust ID':['','100103','100104'],'Jan Transaction Fee':['HKD',100,200],'Jan Transaction Fee.1':['USD',20,30],\
                  'Jan Product Fee':['HKD',21,31],'Jan Product Fee.1':['USD',24,34],
                  'Feb Transaction Fee':['HKD',215,315],'Feb Transaction Fee.1':['USD',55,65],
                  'Feb Product Fee':['HKD',253,353],'Feb Product Fee.1':['USD',25,35]})

test

is there a way to do the expected result with using Python?

CodePudding user response:

This is a complex reshape.

NB. I ignored the '.1', removed using test.columns = test.columns.map(lambda s: s.strip('.1')).

df = (test
 .T.set_index(0, append=True).T
 .set_index([('Cust ID', '')])
 .stack()
 .rename_axis(index=['Cust ID', 'FX'], columns='Type')
 .stack()
 .reset_index(name='Price')
)

df[['Period', 'Type']] = df['Type'].str.split(n=1, expand=True)
df['Period'] = pd.to_datetime('2022 ' df['Period']).dt.strftime('%Y%m')

output:

  Cust ID   FX             Type Price  Period
0  100103  HKD  Transaction Fee   100  202201
1  100103  USD      Product Fee    25  202202
2  100103  USD  Transaction Fee    55  202202
3  100103  USD      Product Fee    24  202201
4  100103  USD  Transaction Fee    20  202201
5  100104  HKD  Transaction Fee   200  202201
6  100104  USD      Product Fee    35  202202
7  100104  USD  Transaction Fee    65  202202
8  100104  USD      Product Fee    34  202201
9  100104  USD  Transaction Fee    30  202201
  • Related