I have a dataframe that looks like below
merge_id | identifier | Location | Value |
---|---|---|---|
1 | A1 | DEL | 50 |
1 | B2 | HYD | 60 |
2 | C1 | BEN | 80 |
2 | D2 | HYD | 10 |
I want the output dataframe to look like below
merge_id | identifier | Location | Value | m_identifier | m_Location | m_Value |
---|---|---|---|---|---|---|
1 | A1 | DEL | 50 | B2 | HYD | 60 |
2 | C1 | BEN | 80 | D2 | HYD | 10 |
Please can you suggest how I can do that
CodePudding user response:
This looks like a pivot
with a few tweaks:
df2 = (df.assign(c=df.groupby('merge_id').cumcount())
.pivot(index='merge_id', columns='c')
.sort_index(level=1, sort_remaining=False, axis=1)
)
df2.columns = df2.columns.map(lambda x: f'{"m_" if x[1] else ""}{x[0]}')
print(df2.reset_index())
output:
merge_id identifier Location Value m_identifier m_Location m_Value
0 1 A1 DEL 50 B2 HYD 60
1 2 C1 BEN 80 D2 HYD 10
CodePudding user response:
here is one way about it
df2=df.merge(df.mask(df['identifier'].str.endswith('1')),
on='merge_id',
how='left',
suffixes=(None,'_m'))
df2=df2.mask(df2['identifier'].eq(df2['identifier_m']))
df2.dropna()
merge_id identifier Location Value identifier_m Location_m Value_m
0 1.0 A1 DEL 50.0 B2 HYD 60.0
2 2.0 C1 BEN 80.0 D2 HYD 10.0