How can I get branch of a networkx graph from pandas dataframe in Python in the form of a new pandas

Time:10-17

I have a pandas dataframe df which looks as follows:

    From    To
0   Node1   Node2
1   Node1   Node3
2   Node2   Node4
3   Node2   Node5
4   Node3   Node6
5   Node3   Node7
6   Node4   Node8
7   Node5   Node9
8   Node6   Node10
9   Node7   Node11

df.to_dict() is:

{'From': {0: 'Node1',
  1: 'Node1',
  2: 'Node2',
  3: 'Node2',
  4: 'Node3',
  5: 'Node3',
  6: 'Node4',
  7: 'Node5',
  8: 'Node6',
  9: 'Node7'},
 'To': {0: 'Node2',
  1: 'Node3',
  2: 'Node4',
  3: 'Node5',
  4: 'Node6',
  5: 'Node7',
  6: 'Node8',
  7: 'Node9',
  8: 'Node10',
  9: 'Node11'}}

I have plotted this pandas dataframe as a network graph using the networkx package. (The image of the plot is not included here.)

I want to get the unique scenarios/branches from this network graph in the form of a new pandas dataframe as follows:

        A       B       C       D
0   Node1   Node2   Node4   Node8
1   Node1   Node2   Node5   Node9
2   Node1   Node3   Node6   Node10
3   Node1   Node3   Node7   Node11

How is it possible to get this?

Answer:

You can traverse the graph depth-first (DFS), keep track of the current path, save the path whenever a leaf is reached, and then convert the saved paths to a DataFrame, as below:

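import pandas as pd

df = pd.DataFrame({
    'From': ['Node1', 'Node1', 'Node2', 'Node2', 'Node3', 'Node3', 'Node4', 'Node5', 'Node6', 'Node7'],
    'To':   ['Node2', 'Node3', 'Node4', 'Node5', 'Node6', 'Node7', 'Node8', 'Node9', 'Node10', 'Node11']
})

def svPath(path, node, df, lst_vst, fnl_result):
    # Follow every edge that starts at `node`, extending the current path.
    for val in df.values:
        if val[0] == node:
            path.append(val[1])
            svPath(path, val[1], df, lst_vst, fnl_result)

    # Save the path only if its last node is not yet part of a saved path.
    # In a tree like this one, that is true only for leaf nodes.
    if path[-1] not in lst_vst:
        fnl_result.append(list(path))
    lst_vst.update(path)
    path.pop()  # backtrack before returning to the caller

fnl_result = []   # one list of nodes per branch
lst_vst = set()   # nodes that already belong to a saved branch
svPath(['Node1'], 'Node1', df, lst_vst, fnl_result)
# The column names assume the longest branch has exactly four nodes.
dfOut = pd.DataFrame(fnl_result, columns=['A', 'B', 'C', 'D'])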

Output:

>>> dfOut
        A       B       C       D
0   Node1   Node2   Node4   Node8
1   Node1   Node2   Node5   Node9
2   Node1   Node3   Node6   Node10
3   Node1   Node3   Node7   Node11
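
Since the question already uses networkx, the same table could probably also be built with the library's own path search instead of a hand-written DFS. The following is only a sketch under some assumptions: the edges form a directed tree (or at least a DAG) pointing from From to To, the roots are the nodes with no incoming edge, and the leaves are the nodes with no outgoing edge. The helper name branches_from_edges is made up for this example, and the lettered column names only work for branches of up to 26 nodes.

```python
import string

import networkx as nx
import pandas as pd

df = pd.DataFrame({
    "From": ["Node1", "Node1", "Node2", "Node2", "Node3",
             "Node3", "Node4", "Node5", "Node6", "Node7"],
    "To": ["Node2", "Node3", "Node4", "Node5", "Node6",
           "Node7", "Node8", "Node9", "Node10", "Node11"],
})


def branches_from_edges(edges, source="From", target="To"):
    """Return one row per root-to-leaf path (hypothetical helper)."""
    # Build a directed graph so that "root" and "leaf" are well defined.
    graph = nx.from_pandas_edgelist(
        edges, source=source, target=target, create_using=nx.DiGraph
    )
    # Assumption: roots have no incoming edges, leaves have no outgoing edges.
    roots = [node for node, degree in graph.in_degree() if degree == 0]
    leaves = [node for node, degree in graph.out_degree() if degree == 0]

    paths = []
    for root in roots:
        for leaf in leaves:
            # all_simple_paths yields each path as a list of nodes.
            paths.extend(nx.all_simple_paths(graph, root, leaf))

    result = pd.DataFrame(paths)
    # Name the columns A, B, C, ... according to the longest branch found.
    result.columns = list(string.ascii_uppercase[: result.shape[1]])
    return result


branches = branches_from_edges(df)
print(branches)
```

With this approach the number of columns follows the longest branch rather than being hard-coded, and shorter branches would be padded with missing values.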