I have a dataframe that looks like
Name | Class | id | C-id | Main ID |
---|---|---|---|---|
ABC | XC | 1 | 1 | M1 |
ZTC | CVF | 2 | 1 | M1 |
ABK1 | XCD | 1 | 2 | M1 |
ZTH1 | CVFD | 2 | 2 | M1 |
I want the output to look like that
Name | Class | Name_m | Class_m | C-id | Main ID |
---|---|---|---|---|---|
ABC | XC | ZTC | CVF | 1 | M1 |
ABK1 | XCD | ZTH1 | CVFD | 2 | M1 |
Can someone please suggest how I can accomplish this with the least hardcoding possible.
One idea I have is to separate the records with id=1 and id =2 and merge again
CodePudding user response:
This will get you what you want, but I guess if you explain more about your logic and how your data looks then it will be easier to understand how to generalise it.
(df
.groupby('id')
.get_group(1)
.reset_index(drop=True)
.merge(
df
.groupby('id')
.get_group(2)[['Name', 'Class']]
.reset_index(drop=True)
.add_suffix('_m'),
left_index=True, right_index=True)
.drop(columns=['id'])
)
Or in smaller steps:
a = df.groupby('id').get_group(1).reset_index(drop=True)
b = df.groupby('id').get_group(2)[['Name', 'Class']].reset_index(drop=True).add_suffix('_m')
a.merge(b, left_index=True, right_index=True).drop(columns=['id'])
CodePudding user response:
I am sure that @Josh Friedlander can update his solution to produce the exact columns you need. In the mean time, this is another approach:
dfm = df[df['id'] == 1].merge(df[df['id'] == 2], on="C-id", how='inner')
dfm = dfm.rename(columns={'Name_x': 'Name', 'Class_x': 'Class', 'Name_y': 'Name_m', 'Class_y': 'Class_m', 'Main ID_y': 'Main ID'})
dfm.drop(['id_x', 'id_y', 'Main ID_x'], inplace=True, axis=1)
dfm
Yields:
Name Class C-id Name_m Class_m Main ID
0 ABC XC 1 ZTC CVF M1
1 ABK1 XCD 2 ZTH1 CVFD M1