From Pandas adjacency matrix to NetworkX affiliation network

Time:05-11

I want to create a NetworkX graph from a Pandas adjacency matrix.

Usually this works with nx.from_pandas_adjacency(df). But this time I have an affiliation network.

Here is an example:

import pandas as pd
import numpy as np
import networkx as nx

#create dummy df
rng = np.random.RandomState(seed=5)
ints = rng.randint(1, 11, size=(4, 2))
df = pd.DataFrame(ints, columns=["Book1","Book2"])
df["Tag"]=['music','city','transport','traveling']
df.set_index('Tag', inplace=True)

#show dummy df
print(df)

This gives the dummy df:

           Book1  Book2
Tag                    
music          4      7
city           7      1
transport     10      9
traveling      5      8

If I now call nx.from_pandas_adjacency(df), I get the following error:

networkx.exception.NetworkXError: ('Columns must match Indices.', 
"['music', 'city', 'transport', 'traveling'] not in columns")

I could loop over the df, collect the entries in an edge list, and then pass that to NetworkX, like this:

edgelist = [
    ('Book1', 'music', 4),
    ('Book2', 'music', 7),
    ('Book1', 'city', 7),
    ('Book2', 'city', 1),
    # ... and so on
]

But I am sure there is a much better and more efficient way to do this. The real df has 1000 Books and 90% NaN values (no edge).

Answer:

You can melt the dataframe and then use nx.from_pandas_edgelist:

G = nx.from_pandas_edgelist(
    df.reset_index().melt(id_vars='Tag'),
    source='Tag',
    target='variable',
    edge_attr='value'
)
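
Since the real df is mostly NaN, it may be worth dropping the empty cells before building the graph; otherwise every NaN cell would likely end up as an edge with a NaN attribute. Below is a minimal sketch of that idea, not a definitive implementation. The small frame with missing values, the column names "Book" and "weight", and the "bipartite" node attribute are assumptions made for illustration and are not part of the question.

```python
import numpy as np
import pandas as pd
import networkx as nx

# Hypothetical small frame; NaN means "no edge" between a tag and a book.
df = pd.DataFrame(
    {"Book1": [4, 7, np.nan, 5], "Book2": [np.nan, 1, 9, np.nan]},
    index=pd.Index(["music", "city", "transport", "traveling"], name="Tag"),
)

# Long format: one row per (Tag, Book) pair, then drop pairs without a value.
edges = (
    df.reset_index()
      .melt(id_vars="Tag", var_name="Book", value_name="weight")
      .dropna(subset=["weight"])
)

G = nx.from_pandas_edgelist(
    edges, source="Tag", target="Book", edge_attr="weight"
)

# Optional: label the two node sets so the graph can be handled as bipartite.
# Nodes that ended up with no edges are not in G and are silently skipped.
nx.set_node_attributes(G, {tag: 0 for tag in df.index}, "bipartite")
nx.set_node_attributes(G, {book: 1 for book in df.columns}, "bipartite")

print(G.edges(data=True))
```

Note that a tag or book whose row or column is entirely NaN will not appear in the graph at all with this approach; if isolated nodes matter, they could be added afterwards with G.add_nodes_from.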