I have this DataFrame with source, target and weight columns:
  source target  weight
0      A      B       3
1      A      C       2
2      B      C       0
3      C      D       1
4      D      A       1
5      D      B       1
...
How can I get a JSON file that looks like this:
{
"nodes": [
{"id": "A"},
{"id": "B"},
{"id": "C"},
{"id": "D"}
],
"links": [
{"source": "A", "target": "B", "weight": 3},
{"source": "A", "target": "C", "weight": 2},
{"source": "B", "target": "C", "weight": 0},
{"source": "C", "target": "D", "weight": 1},
{"source": "D", "target": "A", "weight": 1},
{"source": "D", "target": "B", "weight": 1}
]
}
I could reconstruct it with loops and lists, but is there an easier way?
CodePudding user response:
nodes can be built from the unique values in source and target (via np.unique), then the links can be built from DataFrame.to_dict:
import numpy as np
import pandas as pd
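df = pd.DataFrame({
    'source': ['A', 'A', 'B', 'C', 'D', 'D'],
    'target': ['B', 'C', 'C', 'D', 'A', 'B'],
    'weight': [3, 2, 0, 1, 1, 1]
})
data = {
    # Every distinct label appearing in either column becomes a node
    'nodes': [{'id': v} for v in np.unique(df[['source', 'target']])],
    # One dict per row: {'source': ..., 'target': ..., 'weight': ...}
    'links': df.to_dict(orient='records')
}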
data:
{
'nodes': [{'id': 'A'}, {'id': 'B'}, {'id': 'C'}, {'id': 'D'}],
'links': [{'source': 'A', 'target': 'B', 'weight': 3},
{'source': 'A', 'target': 'C', 'weight': 2},
{'source': 'B', 'target': 'C', 'weight': 0},
{'source': 'C', 'target': 'D', 'weight': 1},
{'source': 'D', 'target': 'A', 'weight': 1},
{'source': 'D', 'target': 'B', 'weight': 1}]
}
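Since the question asks for a JSON file, the dictionary can then be written out with the standard json module. This is a minimal sketch: the data literal below simply repeats the structure shown above so the snippet runs on its own, and the file name graph.json is a placeholder.

```python
import json
from pathlib import Path

# Same structure as the `data` dictionary above, written out literally
# so that this snippet is self-contained.
data = {
    "nodes": [{"id": "A"}, {"id": "B"}, {"id": "C"}, {"id": "D"}],
    "links": [
        {"source": "A", "target": "B", "weight": 3},
        {"source": "A", "target": "C", "weight": 2},
        {"source": "B", "target": "C", "weight": 0},
        {"source": "C", "target": "D", "weight": 1},
        {"source": "D", "target": "A", "weight": 1},
        {"source": "D", "target": "B", "weight": 1},
    ],
}

# "graph.json" is a placeholder file name.
out_path = Path("graph.json")

# Write the dictionary as indented JSON.
with out_path.open("w", encoding="utf-8") as f:
    json.dump(data, f, indent=2)

# Read the file back to confirm that it round-trips.
with out_path.open(encoding="utf-8") as f:
    restored = json.load(f)
```

If any of the values turn out to be NumPy scalars rather than plain Python numbers, json.dump raises a TypeError, so they would need converting first (for example with int()).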
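Depending on requirements, networkx also supports this with json_graph.node_link_data, although this is certainly overkill unless additional graph operations are needed. Note that from_pandas_edgelist builds an undirected graph by default, so edge direction is not preserved in the output (D -> A is reported as A, D); pass create_using=nx.DiGraph to keep it: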
import networkx as nx
import pandas as pd
from networkx.readwrite import json_graph
df = pd.DataFrame({
    'source': ['A', 'A', 'B', 'C', 'D', 'D'],
    'target': ['B', 'C', 'C', 'D', 'A', 'B'],
    'weight': [3, 2, 0, 1, 1, 1]
})
G = nx.from_pandas_edgelist(df, source='source',
                            target='target',
                            edge_attr='weight')
data = json_graph.node_link_data(G)
data:
{'directed': False,
'graph': {},
'links': [{'source': 'A', 'target': 'B', 'weight': 3},
{'source': 'A', 'target': 'C', 'weight': 2},
{'source': 'A', 'target': 'D', 'weight': 1},
{'source': 'B', 'target': 'C', 'weight': 0},
{'source': 'B', 'target': 'D', 'weight': 1},
{'source': 'C', 'target': 'D', 'weight': 1}],
'multigraph': False,
'nodes': [{'id': 'A'}, {'id': 'B'}, {'id': 'C'}, {'id': 'D'}]}
CodePudding user response:
You can use df.to_json(), but you might have to work a little to get it into the desired form.
Example:
# Collect every label from both columns, so nodes that only appear as a target are included
nodes = pd.concat([df.source, df.target]).drop_duplicates().to_frame('id')
'{"nodes": ' + nodes.to_json(orient="records") + ', "links": ' + df.to_json(orient="records") + '}'
Output:
'{"nodes": [{"id":"A"},{"id":"B"},{"id":"C"},{"id":"D"}], "links": [{"source":"A","target":"B","weight":3},{"source":"A","target":"C","weight":2},{"source":"B","target":"C","weight":0},{"source":"C","target":"D","weight":1},{"source":"D","target":"A","weight":1},{"source":"D","target":"B","weight":1}]}'
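When JSON is assembled by string concatenation like this, it is easy to drop a comma or a bracket, so it may be worth parsing the result before saving it. A small sketch using the standard json module; the helper name is_valid_json and the shortened sample strings are made up for illustration.

```python
import json

def is_valid_json(text):
    """Return True if `text` parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True

# Shortened, hypothetical sample strings.
good = '{"nodes": [{"id":"A"},{"id":"B"}], "links": [{"source":"A","target":"B","weight":3}]}'
# Same shape, but with the comma between "nodes" and "links" missing.
bad = '{"nodes": [{"id":"A"},{"id":"B"}] "links": []}'

print(is_valid_json(good))
print(is_valid_json(bad))
```

Keep in mind that json.loads accepts repeated keys inside one object and keeps only the last value, so a successful parse does not by itself show that every entry survived.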