Home > Software engineering >  Converting Graph weights to Matrix in python
Converting Graph weights to Matrix in python

Time:05-15

I have data in the form of:

A=B=11
A=C=6
A=D=5
B=C=19
B=D=17
C=D=6

But I'd like to convert this into this format:

graph= [[ 0, 10, 15, 20 ],
            [ 10, 0, 35, 25 ],
            [ 15, 35, 0, 30 ],
            [ 20, 25, 30, 0 ]]

How is this achieved? I'm aware of how multi-dimensional arrays work however constructing this matrix using python is quite confusing

CodePudding user response:

You can use a dictionary to make an adjacency list. Then enumerate the keys of that dictionary to define an index for each key. Then finally copy the weights into the final matrix structure:

nodes = {}
for line in open("input.txt").read().splitlines():
    a, b, weight = line.split("=")
    nodes.setdefault(a, []).append((b, int(weight)))
    nodes.setdefault(b, []).append((a, int(weight)))

n = len(nodes)
key2index = { key: i for i, key in enumerate(nodes.keys()) }
graph = [[0] * n for _ in range(n)]
for a, edges in nodes.items():
    row = graph[key2index[a]]
    for b, weight in edges:
        row[key2index[b]] = weight

print(graph)

A zero in the matrix denotes there is no edge, like is the case in your example on the main diagonal of the matrix (i.e. your example graph has no "loops").

Comment

As you asked in a deleted comment to skip the first line of the file, here is the code adapted to do just that:

nodes = {}
lines = open("input.txt").read().splitlines()
for line in lines[1:]:
    a, b, weight = line.split("=")
    nodes.setdefault(a, []).append((b, int(weight)))
    nodes.setdefault(b, []).append((a, int(weight)))

n = len(nodes)
key2index = { key: i for i, key in enumerate(nodes.keys()) }
graph = [[0] * n for _ in range(n)]
for a, edges in nodes.items():
    row = graph[key2index[a]]
    for b, weight in edges:
        row[key2index[b]] = weight

print(graph)

CodePudding user response:

You could do it like this:

import pandas as pd
import numpy as np

# Change to your filename.
filename = 'graph.csv'

# Lines to skip.
skip_rows = 1

# Whether the graph is undirected. 
undirected = True

# Read file and convert vertex names to integer indices. Names are sorted. 
# I.e., A=0, B=1, etc. (like in your example).
df = pd.read_csv(filename, sep='=', header=None, names=['n1', 'n2', 'weight'], skiprows=skip_rows)
cat_type = pd.CategoricalDtype(categories=sorted(np.unique(np.concatenate((df['n1'].unique(), df['n2'].unique())))), ordered=True)
df['n1'] = df['n1'].astype(cat_type).cat.codes
df['n2'] = df['n2'].astype(cat_type).cat.codes

n_nodes = len(cat_type.categories)
graph = np.full((n_nodes, n_nodes), np.nan)

for n1, n2, w in zip(df['n1'], df['n2'], df['weight']):
    graph[n1, n2] = w
    if undirected:
        graph[n2, n1] = w

np.fill_diagonal(graph, 0)

print(graph)
[[ 0. 11.  6.  5.]
 [11.  0. 19. 17.]
 [ 6. 19.  0.  6.]
 [ 5. 17.  6.  0.]]

If graph[i, j] == NaN, it means there is no edge (path of length 1) from node i to node j as per your file.

  • Related