I have a Pandas dataframe that looks something like this:
timestamp | Place | Data A | Data B | Data C |
---|---|---|---|---|
16508 | France | 0.03 | 0.06 | 0.15 |
16510 | England | 0.05 | 0.07 | 0.11 |
16515 | England | 0.04 | 0.03 | 0.87 |
What I would like to do is the following:
- Add a new column for every different value in the column "Place".
- In this new column, add the division between Data A and Data B in percentage (Data A / Data B * 100).
The expected output would be:
timestamp | Place | Data A | Data B | Data C | To France | To England |
---|---|---|---|---|---|---|
16508 | France | 0.03 | 0.06 | 0.15 | 50 | 0 |
16510 | England | 0.05 | 0.07 | 0.11 | 0 | 71.42 |
16515 | England | 0.04 | 0.03 | 0.87 | 0 | 133.33 |
I tried the following:
for column in data['Place'].unique():
column_name = f'To {Place}'
data[column_name] = data[data['Place'] == column]['Data A'].div(['Data B'])*100
data[column_name].fillna(method='ffill', inplace=True)
data[column_name].fillna(value=0, inplace=True)
But it's not working. I get a "'list' object has no attribute 'div'" error. I have tried other different things but they are not working either.
Could somebody give me a hand with this?
Thanks in advance.
CodePudding user response:
IIUC, you can try with pivot
:
df["Ratio"] = df["Data A"].div(df["Data B"])
output = df.drop("Ratio", axis=1).join(df.pivot(None, "Place", "Ratio").mul(100).fillna(0).add_prefix("To "))
>>> output
Place Data A Data B Data C To England To France
0 France 0.03 0.06 0.15 0.000000 50.0
1 England 0.05 0.07 0.11 71.428571 0.0
2 England 0.04 0.03 0.87 133.333333 0.0
CodePudding user response:
I'd do it like this:
df_ratio = ((df['Data A'].div(df['Data B'])*100).to_frame()
.assign(col='To ' df['Place'])
.set_index('col', append=True)[0]
.unstack(fill_value=0))
pd.concat([df, df_ratio], axis=1))
Output:
timestamp Place Data A Data B Data C To England To France
0 16508 France 0.03 0.06 0.15 0.000000 50.0
1 16510 England 0.05 0.07 0.11 71.428571 0.0
2 16515 England 0.04 0.03 0.87 133.333333 0.0