Create new key based on relationship between two columns

Time:09-25

I'm trying to assign a shared key to all rows that are related through two columns, ID1 and ID2, and store it as a GroupID.

The logic will be:

  1. Find all values of ID2 linked to a given ID1
  2. Find all values of ID1 linked to the ID2 values found in (1)
  3. Repeat until no new relationships are found
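
The steps above amount to finding connected components in a graph whose edges are the (ID1, ID2) rows. Here is a minimal sketch in plain Python, assuming the data is available as a list of pairs; the function name `group_ids` and the `("ID1", value)` tagging are illustrative choices, not part of the original question. The sample rows mirror the example data shown in the answer.

```python
from collections import defaultdict, deque

def group_ids(pairs):
    """Assign a group number to every ID1 by repeatedly following ID1 <-> ID2 links."""
    # Tag each value with its column so an ID1 and an ID2 that share a raw value stay distinct.
    neighbours = defaultdict(set)
    for id1, id2 in pairs:
        neighbours[("ID1", id1)].add(("ID2", id2))
        neighbours[("ID2", id2)].add(("ID1", id1))

    group_of = {}
    next_group = 0
    for id1, _ in pairs:
        start = ("ID1", id1)
        if start in group_of:
            continue  # already reached from an earlier ID1
        # Breadth-first search: steps 1 and 2, repeated until nothing new turns up (step 3).
        group_of[start] = next_group
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for other in neighbours[node]:
                if other not in group_of:
                    group_of[other] = next_group
                    queue.append(other)
        next_group += 1

    # Report only the ID1 side, keyed by the raw value.
    return {value: g for (col, value), g in group_of.items() if col == "ID1"}

pairs = [("A", 1), ("A", 2), ("B", 1), ("B", 3), ("C", 4), ("C", 5), ("D", 2)]
groups = group_ids(pairs)
print(groups)
```

With these rows, A, B and D end up in group 0 (D joins through the shared ID2 value 2) and C gets group 1.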


CodePudding user response:

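Let us try with networkx: treat each row as an edge between ID1 and ID2, then use the connected components of that graph as the groups.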

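import networkx as nx

# Each (ID1, ID2) row becomes an edge, so linked IDs fall into the same component
G = nx.from_pandas_edgelist(df, 'ID1', 'ID2')
# Map every node to the index of the connected component it belongs to
group_of = {node: i for i, comp in enumerate(nx.connected_components(G)) for node in comp}
df['new'] = df['ID1'].map(group_of)
df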
Out[302]: 
  ID1  ID2  new
0   A    1    0
1   A    2    0
2   B    1    0
3   B    3    0
4   C    4    1
5   C    5    1
6   D    2    0
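
One caveat with this approach: if ID1 and ID2 can contain the same raw values (for example, both are integers), the graph would merge them into a single node. The sketch below is one way to guard against that by prefixing each column before building the graph. The `src`/`dst` column names and the `ID1_`/`ID2_` prefixes are assumptions made for illustration, and the sample frame is rebuilt from the output above.

```python
import networkx as nx
import pandas as pd

# Sample rows copied from the output above.
df = pd.DataFrame({
    "ID1": ["A", "A", "B", "B", "C", "C", "D"],
    "ID2": [1, 2, 1, 3, 4, 5, 2],
})

# Prefix each column so an ID1 and an ID2 with the same raw value become different nodes.
edges = pd.DataFrame({
    "src": "ID1_" + df["ID1"].astype(str),
    "dst": "ID2_" + df["ID2"].astype(str),
})
G = nx.from_pandas_edgelist(edges, "src", "dst")

# Number the connected components, then look up each row's group through its ID1 node.
group_of = {node: i for i, comp in enumerate(nx.connected_components(G)) for node in comp}
df["GroupID"] = edges["src"].map(group_of)
print(df)
```

On this data the GroupID column should match the `new` column above, since no value appears in both ID1 and ID2.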