I have two data frames with many different column names but a few common ones. One frame has rows that have to be "expanded" with rows from the other data frame:
Example:
df = pd.DataFrame({'option':['A', 'A', 'B', 'B', 'fill_A', 'fill_B', ], 'items':['11111', '22222', '33333', '11111', '', '', ], 'other_colA':['','', '','', '','' ]})
look_up_df = pd.DataFrame({'option':['A','A','A','B', 'B','B'], 'items':['11111', '22222', '33333', '44444', '55555', '66666'], 'other_colB':['','', '','', '','' ]})
df
Rows "fill_A" and "fill_B" in df
have to be replace with rows from look_up_df
like so:
How do this expansion while leaving out of the rest of columns ?
CodePudding user response:
You can use boolean indexing, a merge
and concat
:
m = df['option'].str.startswith('fill_')
pd.concat([df[~m],
df.loc[m, 'option'].str.replace('fill_', '').to_frame()
.merge(look_up_df, on='option')
])
Output:
option items other_colA other_colB
0 A 11111 NaN
1 A 22222 NaN
2 B 33333 NaN
3 B 11111 NaN
0 A 11111 NaN
1 A 22222 NaN
2 A 33333 NaN
3 B 44444 NaN
4 B 55555 NaN
5 B 66666 NaN