new = pd.DataFrame({'table': \['a','b', 'c', 'd'\], 'desc': \['','','',''\], 'total':\[22,22,22,22\]})
old = pd.DataFrame({'table': \['a','b', 'e'\], 'desc': \['foo','foo','foo'\], 'total':\[11,11,11\]})
all = pd.merge(new, old, how='outer', on=\['table', 'total'\])
Output:
table desc_x total desc_y
0 a 22 NaN
1 b 22 NaN
2 c 22 NaN
3 d 22 NaN
4 a NaN 11 foo
Desired Output:
table desc total
0 a foo 22
1 b foo 22
2 c 22
3 d 22
4 a foo 11
I attempted to outer join, but it removed the descriptions of a and b.`
CodePudding user response:
- it makes no sense given what you are trying to achieve is to outer join on table and total. Changed to outer join on table
- table can then be modified to use preference that is implied in your desired output and cleanup columns
new = pd.DataFrame({'table': ['a','b', 'c', 'd'], 'desc': ['','','',''], 'total':[22,22,22,22]})
old = pd.DataFrame({'table': ['a','b', 'e'], 'desc': ['foo','foo','foo'], 'total':[11,11,11]})
all = pd.merge(new, old, how='outer', on=['table'])
# select prefered columns
all["desc"] = all["desc_x"].replace('', np.nan).fillna(all["desc_y"]).fillna("")
all["total"] = all["total_x"].fillna(all["total_y"])
# clean up columns
all = all.drop(columns=[c for c in all.columns if c[-2:] in ["_x", "_y"]])
all
table | desc | total | |
---|---|---|---|
0 | a | foo | 22 |
1 | b | foo | 22 |
2 | c | 22 | |
3 | d | 22 | |
4 | e | foo | 11 |