I have a table below.
Updated Questions (adding one more column called Cust ID
Cust ID | Jan Transaction Fee | Jan Transaction Fee | Jan Product Fee | Jan Product Fee | Feb Transaction Fee | Feb Transaction Fee | Feb Product Fee | Feb Product Fee |
---|---|---|---|---|---|---|---|---|
HKD | USD | HKD | USD | HKD | USD | HKD | USD | |
100103 | 100 | 20 | 21 | 24 | 215 | 55 | 253 | 25 |
100104 | 200 | 30 | 31 | 34 | 315 | 65 | 353 | 35 |
I would like to convert the table from above to the below expected result.
Cust ID | Period | Type | FX | Price |
---|---|---|---|---|
100103 | 202201 | Transaction Fee | HKD | 100 |
100103 | 202201 | Transaction Fee | USD | 20 |
100103 | 202201 | Product Fee | HKD | 21 |
100103 | 202201 | Product Fee | USD | 24 |
100103 | 202202 | Transaction Fee | HKD | 215 |
100103 | 202202 | Transaction Fee | USD | 55 |
100103 | 202202 | Product Fee | HKD | 253 |
100103 | 202202 | Product Fee | USD | 25 |
100104 | 202201 | Transaction Fee | HKD | 200 |
100104 | 202201 | Transaction Fee | USD | 30 |
100104 | 202201 | Product Fee | HKD | 31 |
100104 | 202201 | Product Fee | USD | 34 |
100104 | 202202 | Transaction Fee | HKD | 315 |
100104 | 202202 | Transaction Fee | USD | 65 |
100104 | 202202 | Product Fee | HKD | 353 |
100104 | 202202 | Product Fee | USD | 35 |
In my thoughts, i would like to concate Transaction fee and currency together, then split the words later, but i have already come across an issue in doing the appending columns. My coding is below
import pandas as pd
test=pd.DataFrame({'Cust ID':['','','100103','100104'],'0':['Jan Transaction Fee','HKD',100,200],'1':['Jan Transaction Fee','USD',20,30],\
'2':['Jan Product Fee','HKD',21,31],'3':['Jan Product Fee','USD',24,34],
'4':['Feb Transaction Fee','HKD',215,315],'5':['Feb Transaction Fee','USD',55,65],
'6':['Feb Product Fee','HKD',253,353],'7':['Feb Product Fee','USD',25,35]})
test
df_1=test[['0','1']].iloc[:2]
df_1['test']=df_1['0'].str.cat(sep=' ')
df_1['test_1']=df_1['1'].str.cat(sep=' ')
df_1['test']=pd.concat([df_1['test_1']])
df_1
The final result is as follows
My test column remove the Jan Transaction Fee HKD, and only Jan Transaction Fee USD is remained under test column. Does anyone guide me how to append the values under test_1 column to test column so that there will be four rows tgt? Or is there any way to do my expected result in much easier way (by using any transpose technique).
I am just a beginner in Python. Hope someone can help. Thanks
CodePudding user response:
IIUC, you could set labels, transpose
, and split
the Type column into Type and period (with reworking the date):
df = test.set_axis(['Type', 'FX', 'Price']).T
# or renaming by numerical index
# df = test.rename({0: 'Type', 1: 'FW', 2: 'Price'}).T
df[['Period', 'Type']] = df['Type'].str.split(n=1, expand=True)
df['Period'] = pd.to_datetime('2022 ' df['Period']).dt.strftime('%Y%m')
output:
Type FW Price Period
0 Transaction Fee HKD 100 202201
1 Transaction Fee USD 20 202201
2 Product Fee HKD 21 202201
3 Product Fee USD 24 202201
4 Transaction Fee HKD 215 202202
5 Transaction Fee USD 55 202202
6 Product Fee HKD 253 202202
7 Product Fee USD 25 202202
Period as YYYYMMDD integer with DD as MonthEnd
df['Period'] = (pd.to_datetime('2022 ' df['Period'])
.add(pd.offsets.MonthEnd())
.dt.strftime('%Y%m%d')
.astype(int)
)