This is an subset of huge dataset, I want that all the numbers in ID-2 columns linked directly or indirectly to 1 in ID-1 to be grouped next to 1, same with 2 and so on.
ID1 ID2
0 1 2
1 1 4
2 2 6
3 2 5
4 3 7
to:
ID1 ID2
0 1 2
1 4
2 6
3 5
4 3 7
CodePudding user response:
You can use boolean indexing:
# map ID2 values to ID1 or keep ID1
s = df['ID1'].map(df.set_index('ID2')['ID1']).fillna(df['ID1'])
# mask duplicated values
df.loc[s.duplicated(), 'ID1'] = ''
output:
ID1 ID2
0 1 2
1 4
2 6
3 5
4 3 7
If you prefer a new column:
s = df['ID1'].map(df.set_index('ID2')['ID1']).fillna(df['ID1'])
df['new'] = df['ID1'].mask(s.duplicated(), '')
output:
ID1 ID2 new
0 1 2 1
1 1 4
2 2 6
3 2 5
4 3 7 3