I have a nested hearders in a dataframe after pivoting it and I'd like to make it flat, with repeated item on each rows (like in a excel) May you help me ?
import pandas as pd
data = {
"year": [2022, 2022 , 2021 , 2021 , 2020 ],
"client": ["A", "B", "B", "C", "C"],
"product" : [ "P1", "P2" , "P1", "P2", "P1"],
"sales" : [ 10,20, 20, 22, 25]
}
df = pd.DataFrame(data)
df2 = df.pivot_table ( index = ["year","client"] , columns = ["product"] , values = ["sales"] , aggfunc ="sum")
df2
sales
product P1 P2
year client
2020 C 25.0 NaN
2021 B 20.0 NaN
C NaN 22.0
2022 A 10.0 NaN
B NaN 20.0
where I'd like to get a flat header and reapeated rows :
year client P1 P2
2020 C 25.0 NaN
2021 B 20.0 NaN
2021 C NaN 22.0
2022 A 10.0 NaN
2022 B NaN 20.0
many thanks
Nico
CodePudding user response:
You need to use droplevel(0)
which will remove the top level (Sales). Setting columns.name
to None
will remove the Products
. So, after df2
is created, add these lines...
>> df2.columns = df2.columns.droplevel(0)
>> df2.columns.name = None
>> df2 =df2.reset_index()
>> df2
year client P1 P2
0 2020 C 25.0 NaN
1 2021 B 20.0 NaN
2 2021 C NaN 22.0
3 2022 A 10.0 NaN
4 2022 B NaN 20.0
CodePudding user response:
You can try
df = (df2.droplevel(0, axis=1)
.rename_axis(None, axis=1)
.reset_index())
print(df)
year client P1 P2
0 2020 C 25.0 NaN
1 2021 B 20.0 NaN
2 2021 C NaN 22.0
3 2022 A 10.0 NaN
4 2022 B NaN 20.0