I have a DataFrame in pivoted (wide) format, but I want to reshape it into a long format.
import pandas as pd
d = { 'year': [2019,2020,2021,2022], 'cat1': ['a','a','b','b'], 'cat2': ['c1','c2','c3','c4'],'value': [1,2,300,400]}
df = pd.DataFrame(data=d)
df.pivot(index=['year','cat1'], columns='cat2', values='value').reset_index()
Output:
   year cat1   c1   c2     c3     c4
0  2019    a  1.0  NaN    NaN    NaN
1  2020    a  NaN  2.0    NaN    NaN
2  2021    b  NaN  NaN  300.0    NaN
3  2022    b  NaN  NaN    NaN  400.0
Required output:
   year cat1 cat2  value
0  2019    a   c1    1.0
1  2020    a   c2    2.0
2  2021    b   c3  300.0
3  2022    b   c4  400.0
CodePudding user response:
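Use DataFrame.stack() on the pivoted frame (this assumes the result of the pivot call in the question has been assigned back to df):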
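# dropna() is a no-op where stack() already drops NaN and removes the empty cells where it does not
df = df.set_index(['year', 'cat1']).stack().dropna().reset_index()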
df.columns = ['year', 'cat1', 'cat2', 'value']
df
   year cat1 cat2  value
0  2019    a   c1    1.0
1  2020    a   c2    2.0
2  2021    b   c3  300.0
3  2022    b   c4  400.0
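An alternative, offered only as a sketch: DataFrame.melt() can do the same wide-to-long reshape and lets you name the new columns directly instead of reassigning df.columns afterwards. The names wide and long_df are illustrative and not from the question. It assumes that every NaN in the pivoted frame is an empty cell created by the pivot rather than a genuinely missing value, so dropping them is safe.

```python
import pandas as pd

# Rebuild the pivoted (wide) frame from the question's data.
d = {'year': [2019, 2020, 2021, 2022],
     'cat1': ['a', 'a', 'b', 'b'],
     'cat2': ['c1', 'c2', 'c3', 'c4'],
     'value': [1, 2, 300, 400]}
wide = (pd.DataFrame(data=d)
          .pivot(index=['year', 'cat1'], columns='cat2', values='value')
          .reset_index())

# melt() turns the c1..c4 columns back into (cat2, value) pairs.
# The pivot left NaN in every cell with no data, so drop those rows,
# then restore the original row order and a clean 0..n-1 index.
long_df = (wide.melt(id_vars=['year', 'cat1'], var_name='cat2', value_name='value')
               .dropna(subset=['value'])
               .sort_values(['year', 'cat1'])
               .reset_index(drop=True))

print(long_df)
```

The value column comes back as float because the pivot introduced NaN. If you need integers again, long_df['value'].astype(int) should work once the NaN rows are gone.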