I have this dataframe:
col1 col2 col3
0 a b 1
1 b c 2
2 c d 3
3 d a 4
4 k g 5
5 w x 6
6 x z 7
7 z w 8
8 r w 9
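I want to keep only the rows that form a "cycle" pattern, i.e. rows where following col1 -> col2 eventually leads back to the starting value (for example a -> b -> c -> d -> a).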
Expected output:
col1 col2 col3
0 a b 1
1 b c 2
2 c d 3
3 d a 4
5 w x 6
6 x z 7
7 z w 8
Is what I'm asking possible?
CodePudding user response:
This looks like a graph problem, which you can solve using networkx:
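import networkx as nx

# build a directed graph with one edge per row (col1 -> col2)
G = nx.from_pandas_edgelist(df, source='col1', target='col2',
                            create_using=nx.DiGraph)

# collect every node that belongs to at least one simple cycle
nodes = {n for cycle in nx.simple_cycles(G) for n in cycle}
# {'a', 'b', 'c', 'd', 'w', 'x', 'z'}

# keep the rows whose source and target are both cycle nodes
out = df.loc[df['col1'].isin(nodes) & df['col2'].isin(nodes)]
# or
# out = df[df[['col1', 'col2']].isin(nodes).all(axis=1)]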
print(out)
output:
col1 col2 col3
0 a b 1
1 b c 2
2 c d 3
3 d a 4
5 w x 6
6 x z 7
7 z w 8
[Figure: graph of the output]
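One caveat with filtering on nodes: a row is kept whenever both of its endpoints lie on some cycle, even if that particular edge is not part of any cycle (for instance, an edge linking two separate cycles). A stricter variant could filter on edges instead. The following is only a minimal sketch under a few assumptions: the helper name `cycle_edges` is made up for illustration, row 6 is taken as x -> z as in the printed output above, and the last row (d -> w) is a hypothetical extra row added purely to show the difference.

```python
import networkx as nx
import pandas as pd

# Sample data; row 6 is x -> z as in the printed output above.
# The last row (d -> w, 10) is hypothetical: it links two cycles
# without being on a cycle itself.
df = pd.DataFrame({
    "col1": list("abcdkwxzrd"),
    "col2": list("bcdagxzwww"),
    "col3": range(1, 11),
})

G = nx.from_pandas_edgelist(df, source="col1", target="col2",
                            create_using=nx.DiGraph)


def cycle_edges(graph):
    """Return the set of (source, target) edges lying on at least one simple cycle."""
    edges = set()
    for cycle in nx.simple_cycles(graph):
        # pair each node with its successor, wrapping around to close the cycle
        edges.update(zip(cycle, cycle[1:] + cycle[:1]))
    return edges


# Node-based filter (as in the answer): keeps the hypothetical d -> w row
nodes = {n for cycle in nx.simple_cycles(G) for n in cycle}
node_out = df.loc[df["col1"].isin(nodes) & df["col2"].isin(nodes)]
print(node_out.index.tolist())  # [0, 1, 2, 3, 5, 6, 7, 9]

# Edge-based filter: keeps only rows whose own edge is on a cycle
on_cycle = cycle_edges(G)
mask = [edge in on_cycle for edge in zip(df["col1"], df["col2"])]
out = df.loc[mask]
print(out.index.tolist())  # [0, 1, 2, 3, 5, 6, 7]
```

On the original data (without the hypothetical last row) both filters return the same rows, so the node-based version is enough there. Note that enumerating simple cycles can get expensive on large, densely connected graphs.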