I would like to visualize a network with networkx and matplotlib from data stored in a CSV file. The file has 15 columns and about 150 rows, and its data look like the toy example below. The index column consists of a list of names. Each name should be a single node. Each column name stands for an organisation, and an "x" means that the person is a member of that organisation. The goal is to show every direct relation between two persons as an edge, and to draw a node bigger the more relations it has in the visualization. Thanks for your help!
as text:
Name       Org A  Org B  Org C  Org D  Org F  ...
Person 1   x      x      x      x
Person 2                        x      x
Person 3                 x
Person 4                        x
Person 5          x                    x
Person 6
Person 7          x      x      x
Person 8          x                    x
Person 9   x
Person 10                x      x      x
...
CodePudding user response:
IIUC, you want something like this:
import networkx as nx
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Toy data: one row per person, one column per organisation, 'x' marks membership
df = pd.DataFrame({'Name':[f'person {f}' for f in range(1,11)],
                   'Org A':['x'] + [np.nan]*7 + ['x'] + [np.nan],
                   'Org B':['x'] + [np.nan]*3 + ['x'] + [np.nan] + ['x']*2 + [np.nan]*2,
                   'Org C':['x',np.nan,'x'] + [np.nan]*3 + ['x'] + [np.nan]*2 + ['x'],
                   'Org D':['x', 'x', np.nan, 'x'] + [np.nan]*2 + ['x'] + [np.nan]*2 + ['x'],
                   'Org F':[np.nan] + ['x'] + [np.nan]*2 + ['x'] + [np.nan]*2 + ['x'] + [np.nan, 'x']})

# Long format: one row per (person, organisation) membership
dfm = df.melt('Name').dropna()

# Self-merge on the organisation to pair up members, then drop self-pairs
df_net = dfm.merge(dfm, on='variable').query('Name_x != Name_y')

# Undirected graph; repeated pairs collapse into a single edge
G = nx.from_pandas_edgelist(df_net, 'Name_x', 'Name_y')

fig, ax = plt.subplots(figsize=(10,8))
nx.draw_networkx(G, ax=ax)
plt.show()
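Output: a network plot with one labelled node for each person who shares an organisation with someone else, and an edge between every such pair.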
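The question also asks for bigger nodes the more relations a person has, which the plot above does not do. Below is a minimal sketch of one way to do it. It assumes that "relations" means the number of shared memberships, so each edge gets a `weight` equal to the number of organisations two persons have in common, and a node is sized by the sum of its edge weights. The helper names `build_membership_graph` and `node_sizes`, and the `base` and `scale` values, are made up for illustration.

```python
import matplotlib.pyplot as plt
import networkx as nx
import numpy as np
import pandas as pd

# Same toy memberships as above, written as organisation -> person numbers
members = {
    "Org A": [1, 9],
    "Org B": [1, 5, 7, 8],
    "Org C": [1, 3, 7, 10],
    "Org D": [1, 2, 4, 7, 10],
    "Org F": [2, 5, 8, 10],
}
df = pd.DataFrame({"Name": [f"person {i}" for i in range(1, 11)]})
for org, ids in members.items():
    df[org] = ["x" if i in ids else np.nan for i in range(1, 11)]


def build_membership_graph(df, name_col="Name"):
    """Person-to-person graph; edge weight = number of shared organisations."""
    long = df.melt(id_vars=name_col).dropna()
    pairs = long.merge(long, on="variable")
    left, right = f"{name_col}_x", f"{name_col}_y"
    # Keep each unordered pair once (this also drops self-pairs)
    pairs = pairs[pairs[left] < pairs[right]]
    weights = pairs.groupby([left, right]).size().reset_index(name="weight")
    G = nx.from_pandas_edgelist(weights, left, right, edge_attr="weight")
    # Assumption: people without any shared organisation should still be drawn
    G.add_nodes_from(df[name_col])
    return G


def node_sizes(G, base=300, scale=150):
    """Node sizes growing with the weighted degree (sum of edge weights)."""
    strength = dict(G.degree(weight="weight"))
    return [base + scale * strength[n] for n in G.nodes]


G = build_membership_graph(df)

if __name__ == "__main__":
    fig, ax = plt.subplots(figsize=(10, 8))
    nx.draw_networkx(G, ax=ax, node_size=node_sizes(G))
    plt.show()
```

If "relations" should instead mean the number of distinct persons someone is connected to, replacing `G.degree(weight="weight")` with `G.degree()` sizes the nodes by plain degree. For the real data, the toy `df` would be replaced by the result of `pd.read_csv` on the CSV file.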