pytorch tensor from pandas columns of vectors

Time:03-31

I want to convert a pandas column to a PyTorch tensor. Each cell of the column holds a 300-dimensional NumPy vector (an embedding).

I have tried this:

torch.from_numpy(g_list[1]['sentence_vector'].to_numpy())

but it throws this error:

TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.

CodePudding user response:

Suppose you have this DataFrame, in which each cell holds a vector of 2 numbers:

import torch
import pandas as pd

df = pd.DataFrame({'a': [[3, 29], [3, 29]],
                   'b': [[94, 170], [3, 29]],
                   'c': [[31, 115], [3, 29]]})


To convert this DataFrame to a PyTorch tensor, convert its values to a list and pass that list to torch.Tensor:

t = torch.Tensor(list(df.values))

#output

tensor([[[  3.,  29.],
         [ 94., 170.],
         [ 31., 115.]],

        [[  3.,  29.],
         [  3.,  29.],
         [  3.,  29.]]])

The shape of t is [2, 3, 2]: 2 rows, 3 columns, and 2 elements inside each list.
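For the case in the question, where a single column holds one NumPy vector per cell, a minimal sketch follows. It assumes every vector in the column has the same length; the DataFrame built here is a hypothetical stand-in for g_list[1], filled with random data. The idea is to stack the per-row vectors into one 2-D numeric array with np.stack before handing it to torch.from_numpy, since the object-dtype array returned by to_numpy() is what triggers the TypeError.

```python
import numpy as np
import pandas as pd
import torch

# Hypothetical stand-in for the question's data:
# 4 rows, each cell holding a 300-dim float vector.
rng = np.random.default_rng(0)
df = pd.DataFrame({'sentence_vector': [rng.random(300) for _ in range(4)]})

# The column converts to a 1-D array of dtype object (one ndarray per cell),
# which torch.from_numpy cannot handle.
print(df['sentence_vector'].to_numpy().dtype)  # object

# Stack the per-row vectors into a single 2-D float array first.
stacked = np.stack(df['sentence_vector'].tolist())
t = torch.from_numpy(stacked)

print(t.shape)  # torch.Size([4, 300])
print(t.dtype)  # torch.float64
```

The resulting tensor keeps NumPy's float64 dtype; calling t.float() afterwards gives float32 if a model expects it.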
