Convert interaction list between proteins into a matrix using numpy

To calculate the graph Laplacian L, I need the adjacency matrix. I have the protein list (nodes):

['A', 'B', 'C', 'D', 'E', 'F']

And the list of interactions between these proteins (edges):

[('B', 'A'), ('D', 'A'), ('F', 'D'), ('A', 'D'), ('A', 'B'), ('E', 'B'), ('C', 'D'), ('E', 'C'), ('D', 'B'), ('C', 'E'), ('A', 'C'), ('C', 'B'), ('B', 'D'), ('D', 'F'), ('B', 'E'), ('C', 'A'), ('D', 'C'), ('B', 'C')]

How can I convert the list into an adjacency matrix, with a value of 1 where an interaction occurs and 0 where it does not, using only numpy?

The expected output is:

    A   B   C   D   E   F
A   0   1   1   1   0   0
B   1   0   1   1   1   0
C   1   1   0   1   1   0
D   1   1   1   0   0   1
E   0   1   1   0   0   0
F   0   0   0   1   0   0

CodePudding user response：

I'd build an array of zeros, then set the entry for each edge to 1:

import numpy as np

prot = ['A', 'B', 'C', 'D', 'E', 'F']
edges = [('B', 'A'), ('D', 'A'), ('F', 'D'), ('A', 'D'), ('A', 'B'), ('E', 'B'),
         ('C', 'D'), ('E', 'C'), ('D', 'B'), ('C', 'E'), ('A', 'C'), ('C', 'B'),
         ('B', 'D'), ('D', 'F'), ('B', 'E'), ('C', 'A'), ('D', 'C'), ('B', 'C')]

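# Start from an all-zero square matrix, one row and one column per protein
result = np.zeros((len(prot), len(prot)))

# Mark each (start, end) interaction with a 1
for start, end in edges:
    result[prot.index(start), prot.index(end)] = 1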
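
Calling `prot.index` inside the loop rescans the list for every edge, which can get slow for large networks. As a rough sketch of the same idea with a dictionary lookup instead (the names `index_of` and `adjacency` are my own, not from the answer above, and the integer dtype is an assumption chosen to match the expected output):

```python
import numpy as np

nodes = ['A', 'B', 'C', 'D', 'E', 'F']
edges = [('B', 'A'), ('D', 'A'), ('F', 'D'), ('A', 'D'), ('A', 'B'), ('E', 'B'),
         ('C', 'D'), ('E', 'C'), ('D', 'B'), ('C', 'E'), ('A', 'C'), ('C', 'B'),
         ('B', 'D'), ('D', 'F'), ('B', 'E'), ('C', 'A'), ('D', 'C'), ('B', 'C')]

# Hypothetical helper: map each protein label to its row/column position once
index_of = {name: i for i, name in enumerate(nodes)}

# Integer zeros so the matrix holds 0/1 rather than 0.0/1.0
adjacency = np.zeros((len(nodes), len(nodes)), dtype=int)

for start, end in edges:
    adjacency[index_of[start], index_of[end]] = 1

print(adjacency)
```

Because the edge list contains both directions of every interaction, the resulting matrix is symmetric.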

CodePudding user response：

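Another option is to map the edges to their corresponding indices and assign all the ones at once. Note that np.searchsorted requires the node list to be sorted: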

import numpy as np

nodes = ['A', 'B', 'C', 'D', 'E', 'F']  # must be sorted for np.searchsorted
edges = np.array([('B', 'A'), ('D', 'A'), ('F', 'D'), ('A', 'D'), ('A', 'B'), ('E', 'B'), ('C', 'D'), ('E', 'C'), ('D', 'B'), ('C', 'E'), ('A', 'C'), ('C', 'B'), ('B', 'D'), ('D', 'F'), ('B', 'E'), ('C', 'A'), ('D', 'C'), ('B', 'C')])
# Look up the index of every label, then regroup into (row, column) pairs
edges_idx = np.searchsorted(nodes, edges.ravel()).reshape(-1, 2)  # [[1, 0], [3, 0], [5, 3], ..., [3, 2], [1, 2]]
result = np.zeros((len(nodes), len(nodes)), dtype=np.uint8)
x, y = np.transpose(edges_idx)
result[x, y] = 1  # set every (row, column) pair in one indexing step
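
Since the stated goal is the graph Laplacian, here is a minimal sketch of that last step. It assumes the unnormalized (combinatorial) definition L = D - A, where D is the diagonal degree matrix; the question does not say which variant is wanted. The names `adjacency`, `degree` and `laplacian` are illustrative, and the adjacency matrix is typed in directly from the expected output so the snippet runs on its own:

```python
import numpy as np

# Adjacency matrix from the expected output in the question (rows/columns A..F)
adjacency = np.array([
    [0, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 1, 0],
    [1, 1, 0, 1, 1, 0],
    [1, 1, 1, 0, 0, 1],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
])

# Degree matrix: number of interactions per protein on the diagonal
degree = np.diag(adjacency.sum(axis=1))

# Assumed definition: unnormalized Laplacian L = D - A
laplacian = degree - adjacency

print(laplacian)
```

A signed integer dtype matters here: subtracting from an unsigned matrix such as `np.uint8` would wrap the off-diagonal -1 entries around to 255, so cast with `astype(int)` first if the adjacency matrix was built that way.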