Having trouble merging two dataframes, one of which was converted to a pivot table. Once merging the column headers are not preserved and collapse in to (x,,) style format, see below:
df1
data = {'x': ['a1', 'a1', 'a1', 'b1', 'b1', 'b2'],
'color': ['blue', 'red','blue','red','blue','blue',],
'type': ['car', 'truck', 'truck', 'car', 'car', 'truck'],
'values' : [1,2,3,4,5,6]}
df = pd.DataFrame(data=data)
x y z values
0 a1 blue car 1
1 a1 red truck 2
2 a1 blue truck 3
3 b1 red car 4
4 b1 blue car 5
5 b2 blue truck 6
Pivot df1 so all a1 and b1 are in one row
table = pd.pivot_table(df, index=['x'], values=['values'], columns=['color','type'], fill_value=0).reset_index()
x values
color blue red
type car truck car truck
0 a1 1 3 0 2
1 b1 5 6 4 0
df2
data2 = {'x': ['a1', 'b1'],
'price': [1300, 2500]}
df2 = pd.DataFrame(data=data2)
combine df1 and df2 on the 'x' index
combined = table.merge(df2, on='x')
combined.head()
x (x, , ) (values, blue, car) (values, blue, truck) (values, red, car) (values, red, truck) price
0 a1 a1 1 3 0 2 1300
1 b1 b1 5 6 4 0 2500
The previous column headers collapsed in to parentheses and comma format, like the index from each df wasn't carried over properly. I want it to just add the price column and preserve the previous multi-tiered header of df1
CodePudding user response:
You have to change the index of df2
before merge
to get same levels of index:
df2.columns = pd.MultiIndex.from_product([df2.columns, [''], ['']])
combined = table.merge(df2.sort_index(axis=1), on='x')
print(combined)
# Output
x values price
color blue red
type car truck car truck
0 a1 1 3 0 2 1300
1 b1 5 0 4 0 2500