From source-target-weight dataframe to JSON file

I have this DataFrame with source, target and weight columns:

    source            target  weight
0     A                  B       3
1     A                  C       2
2     B                  C       0
3     C                  D       1
4     D                  A       1
5     D                  B       1
...

How can I get a JSON file that looks like this:

{
  "nodes": [
    {"id": "A"},
    {"id": "B"},
    {"id": "C"},
    {"id": "D"}
  ],
  "links": [
    {"source": "A", "target": "B", "weight": 3},
    {"source": "A", "target": "C", "weight": 2},
    {"source": "B", "target": "C", "weight": 0},
    {"source": "C", "target": "D", "weight": 1},
    {"source": "D", "target": "A", "weight": 1},
    {"source": "D", "target": "B", "weight": 1}
  ]
}

I could reconstruct it with loops and lists, but is there an easier way?

CodePudding user response：

The nodes can be built from the unique values in source and target (via np.unique, which also returns them sorted), and the links can be built with DataFrame.to_dict:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'source': ['A', 'A', 'B', 'C', 'D', 'D'],
    'target': ['B', 'C', 'C', 'D', 'A', 'B'],
    'weight': [3, 2, 0, 1, 1, 1]
})

data = {
    'nodes': [{'id': v} for v in np.unique(df[['source', 'target']])],
    'links': df.to_dict(orient='records')
}

data:

{
    'nodes': [{'id': 'A'}, {'id': 'B'}, {'id': 'C'}, {'id': 'D'}],
    'links': [{'source': 'A', 'target': 'B', 'weight': 3},
              {'source': 'A', 'target': 'C', 'weight': 2},
              {'source': 'B', 'target': 'C', 'weight': 0},
              {'source': 'C', 'target': 'D', 'weight': 1},
              {'source': 'D', 'target': 'A', 'weight': 1},
              {'source': 'D', 'target': 'B', 'weight': 1}]
}
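The question asks for a JSON file, and the dict above still has to be written out. A minimal sketch using the standard json module follows; the file name graph.json and the to_builtin helper are assumptions for illustration, and the helper is only a fallback in case NumPy scalars end up in the dict:

```python
import json

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'source': ['A', 'A', 'B', 'C', 'D', 'D'],
    'target': ['B', 'C', 'C', 'D', 'A', 'B'],
    'weight': [3, 2, 0, 1, 1, 1]
})

data = {
    # np.unique flattens both columns and returns the sorted unique labels
    'nodes': [{'id': v} for v in np.unique(df[['source', 'target']].to_numpy())],
    'links': df.to_dict(orient='records')
}


def to_builtin(obj):
    """Fallback for json.dump: convert NumPy scalars to plain Python values."""
    if isinstance(obj, np.generic):
        return obj.item()
    raise TypeError(f'Object of type {type(obj).__name__} is not JSON serializable')


# 'graph.json' is a hypothetical output path
with open('graph.json', 'w') as f:
    json.dump(data, f, indent=2, default=to_builtin)
```

json.dump writes double-quoted keys and strings, so the file matches the layout in the question even though the Python repr of data uses single quotes.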

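Depending on requirements, networkx also supports this with json_graph.node_link_data, although it is overkill unless additional Graph operations are needed. Note that from_pandas_edgelist builds an undirected graph by default, so edge direction is not preserved in the output below (D to A is listed as A to D):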

import networkx as nx
import pandas as pd
from networkx.readwrite import json_graph

df = pd.DataFrame({
    'source': ['A', 'A', 'B', 'C', 'D', 'D'],
    'target': ['B', 'C', 'C', 'D', 'A', 'B'],
    'weight': [3, 2, 0, 1, 1, 1]
})

G = nx.from_pandas_edgelist(df, source='source',
                            target='target',
                            edge_attr='weight')
data = json_graph.node_link_data(G)

data:

{'directed': False,
 'graph': {},
 'links': [{'source': 'A', 'target': 'B', 'weight': 3},
           {'source': 'A', 'target': 'C', 'weight': 2},
           {'source': 'A', 'target': 'D', 'weight': 1},
           {'source': 'B', 'target': 'C', 'weight': 0},
           {'source': 'B', 'target': 'D', 'weight': 1},
           {'source': 'C', 'target': 'D', 'weight': 1}],
 'multigraph': False,
 'nodes': [{'id': 'A'}, {'id': 'B'}, {'id': 'C'}, {'id': 'D'}]}
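If the direction of each link matters, as it does in the JSON from the question, one option is to pass create_using=nx.DiGraph to from_pandas_edgelist. The sketch below also builds the dict by hand from G.nodes and G.edges instead of calling node_link_data, which is an assumption made here to drop the extra 'directed', 'graph' and 'multigraph' keys:

```python
import networkx as nx
import pandas as pd

df = pd.DataFrame({
    'source': ['A', 'A', 'B', 'C', 'D', 'D'],
    'target': ['B', 'C', 'C', 'D', 'A', 'B'],
    'weight': [3, 2, 0, 1, 1, 1]
})

# A DiGraph keeps D -> A distinct from A -> D
G = nx.from_pandas_edgelist(df, source='source',
                            target='target',
                            edge_attr='weight',
                            create_using=nx.DiGraph)

data = {
    'nodes': [{'id': n} for n in G.nodes],
    # G.edges(data=True) yields (source, target, attribute dict) triples
    'links': [{'source': u, 'target': v, **attrs}
              for u, v, attrs in G.edges(data=True)]
}
```

Nodes and edges come back in insertion order here rather than sorted, so sort them explicitly if the order in the file matters.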

CodePudding user response：

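You can use df.to_json(), but you will have to do some extra work to get it into the desired form. The example below is only a starting point: the nodes come out as one object with repeated "id" keys instead of a list of objects, values that appear only in target are not included, and a comma is missing before "links".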

Example:

import re

'{' + '"nodes": ' + re.sub(r"\d", "id", df.source.drop_duplicates().to_json(orient="index")) + ' "links": ' + df.to_json(orient="records") + '}'

Output:

'{"nodes": {"id":"A","id":"B","id":"C","id":"D"} "links": [{"source":"A","target":"B","weight":3},{"source":"A","target":"C","weight":2},{"source":"B","target":"C","weight":0},{"source":"C","target":"D","weight":1},{"source":"D","target":"A","weight":1},{"source":"D","target":"B","weight":1}]}'
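The string above is not yet in the target shape. A possible way to finish the to_json() approach, sketched here under the assumption that the node ids should cover both columns, is to put the unique labels into a one-column frame named id and serialize it with orient="records" as well:

```python
import json

import pandas as pd

df = pd.DataFrame({
    'source': ['A', 'A', 'B', 'C', 'D', 'D'],
    'target': ['B', 'C', 'C', 'D', 'A', 'B'],
    'weight': [3, 2, 0, 1, 1, 1]
})

# Unique labels from both columns, in order of first appearance
ids = pd.concat([df['source'], df['target']]).drop_duplicates()

# orient="records" yields a JSON list of {"id": ...} objects
nodes_json = ids.to_frame('id').to_json(orient='records')
links_json = df.to_json(orient='records')

json_text = '{"nodes": ' + nodes_json + ', "links": ' + links_json + '}'

# Parsing the result is a cheap check that the string is valid JSON
parsed = json.loads(json_text)
```

Assembling JSON by string concatenation is fragile, so building a dict and calling json.dumps is usually the safer choice; this version only shows that to_json() can get there.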