How to make a p * q matrix from a numpy.ndarray (where p*q = n)?

Time:08-18

I have these two arrays:

data1 = [["ab","bc","ca"], ["bc","cd","da"], ["be","cd","db"]]
topics1 = [["ab","db"],["be","cd"]]

I have to find the size of the intersection of each topic with each document. Here is my attempt:

mat11 = []
for i in range(len(data1)):
  for j in range(len(topics1)):
    mat1 = len(list(set(data1[i]) & set(topics1[j])))
    mat11.append(mat1)

mat11 is a list of len(data1) * len(topics1) elements.

mat11

[image: printed value of mat11]

I want it as a matrix of shape (len(data1), len(topics1)), so I did the following:

import numpy as np
img_mat = np.array( mat11 )
shape = ( len(data1), len(topics1) )
img_mat.reshape( shape )

which gives me this output:

[image: output after the reshape call]

But that is not the shape I wanted:

[image: the desired shape]

How can I make this a 3*2 matrix? Moreover, my main aim is to get a dataframe which looks like this:

[image: the desired dataframe]

CodePudding user response:

import numpy as np
img_mat = np.array( mat11 )
shape = ( len(data1), len(topics1) )
l = np.matrix(img_mat.reshape(shape))

import pandas as pd
l_df = pd.DataFrame(l)
l_df = l_df.rename_axis('Docs').reset_index()
l_df.Docs = pd.Series(["D" + str(ind) for ind in l_df.Docs])
prefix = 'Topic'
l_df = l_df.add_prefix(prefix)
# add_prefix also turns 'Docs' into 'TopicDocs', so rename that column back
l_df.rename(columns={'TopicDocs':'Docs'}, inplace=True)
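
For comparison, here is a minimal sketch of the same result without np.matrix. Note that reshape returns a new array rather than changing img_mat in place, so the result has to be assigned; that may be why the shape looked unchanged in the question, though that is a guess. The "D0" and "Topic0" style labels are assumed from the code above, and the name counts is made up for this example.

```python
import numpy as np
import pandas as pd

data1 = [["ab", "bc", "ca"], ["bc", "cd", "da"], ["be", "cd", "db"]]
topics1 = [["ab", "db"], ["be", "cd"]]

# One row per document, one column per topic: size of the set intersection.
counts = [[len(set(doc) & set(topic)) for topic in topics1] for doc in data1]

# Starting from a flat array instead: reshape returns a NEW array,
# so the result must be assigned to a name.
flat = np.array([c for row in counts for c in row])
img_mat = flat.reshape(len(data1), len(topics1))

# Column and row labels are assumptions based on the answer's naming.
l_df = pd.DataFrame(
    img_mat,
    columns=[f"Topic{j}" for j in range(len(topics1))],
)
l_df.insert(0, "Docs", [f"D{i}" for i in range(len(data1))])
print(l_df)
```

Building the nested list directly makes the reshape step unnecessary; it is kept here only to show the assignment.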