I have a pandas dataframe like so:
id cost val1 val1_avg val1_tot val2 val2_avg val2_tot val3 val3_avg val3_tot
1 5 29 24 45 15 18 78 25 32 56
Now I want to melt this dataframe on id and cost, such that I also separate value, avg value & total value for val1, val2, & val3. So finally I want to get below dataframe:
id cost variable value value_avg value_tot
1 5 val1 29 24 45
1 5 val3 15 18 78
1 5 val3 25 32 56
How can I achieve this?
CodePudding user response:
You can try pandas.wide_to_long
df.columns = [col.split('_')[1] '_' col.split('_')[0] if '_' in col else col for col in df.columns]
out = pd.wide_to_long(df, ["val", "avg_val", "tot_val"], i=["id", "cost"], j='variable').reset_index()
out['variable'] = 'val' out['variable'].astype(str)
out.columns = [col.split('_')[1] '_' col.split('_')[0] if '_' in col else col for col in out.columns]
print(out)
id cost variable val val_avg val_tot
0 1 5 val1 29 24 45
1 1 5 val2 15 18 78
2 1 5 val3 25 32 56
CodePudding user response:
Here is an approach:
from itertools import zip_longest
(df
.set_index(['id', 'cost'])
.pipe(lambda d: d.set_axis(pd.MultiIndex
.from_arrays(zip_longest(*d.columns.str.split('(?=_)'),
fillvalue='')),
axis=1
))
.rename_axis(columns=('variable', None))
.stack(0)
.add_prefix('value')
.reset_index()
)
Alternative with melt
pivot
:
(df
.melt(id_vars=['id', 'cost'])
.assign(col=lambda d: 'value' d['variable'].str.split('(?=_)').str[1].fillna(''),
variable=lambda d: d['variable'].str.split('_').str[0],
)
.pivot(['id', 'cost', 'variable'], 'col', 'value')
.reset_index()
)
output:
id cost variable value value_avg value_tot
0 1 5 val1 29 24 45
1 1 5 val2 15 18 78
2 1 5 val3 25 32 56