Home > Net >  pandas merge two df and preserve multiple header columns
pandas merge two df and preserve multiple header columns

Time:07-04

Having trouble merging two dataframes, one of which was converted to a pivot table. Once merging the column headers are not preserved and collapse in to (x,,) style format, see below:

df1

data = {'x': ['a1', 'a1', 'a1', 'b1', 'b1', 'b2'], 
        'color': ['blue', 'red','blue','red','blue','blue',],
        'type': ['car', 'truck', 'truck', 'car', 'car', 'truck'],
        'values' : [1,2,3,4,5,6]}
df = pd.DataFrame(data=data)

    x   y       z      values
0   a1  blue    car     1
1   a1  red     truck   2
2   a1  blue    truck   3
3   b1  red     car     4
4   b1  blue    car     5
5   b2  blue    truck   6

Pivot df1 so all a1 and b1 are in one row

table = pd.pivot_table(df, index=['x'], values=['values'], columns=['color','type'], fill_value=0).reset_index()

      x     values
color       blue    red
type        car     truck   car     truck
0     a1    1       3       0       2
1     b1    5       6       4       0

df2

data2 = {'x': ['a1', 'b1'], 
        'price': [1300, 2500]}
df2 = pd.DataFrame(data=data2)

combine df1 and df2 on the 'x' index

combined = table.merge(df2, on='x')
combined.head()

    x       (x, , ) (values, blue, car)  (values, blue, truck)  (values, red, car)  (values, red, truck)    price
0   a1      a1      1                    3                      0                   2                       1300
1   b1      b1      5                    6                      4                   0                       2500

The previous column headers collapsed in to parentheses and comma format, like the index from each df wasn't carried over properly. I want it to just add the price column and preserve the previous multi-tiered header of df1

CodePudding user response:

You have to change the index of df2 before merge to get same levels of index:

df2.columns = pd.MultiIndex.from_product([df2.columns, [''], ['']])
combined = table.merge(df2.sort_index(axis=1), on='x')
print(combined)

# Output
        x values                 price
color       blue       red            
type         car truck car truck      
0      a1      1     3   0     2  1300
1      b1      5     0   4     0  2500
  • Related