I want to convert a pandas column to a PyTorch tensor. Each cell of the column holds a 300-dimensional NumPy vector (an embedding).
I have tried this:
torch.from_numpy(g_list[1]['sentence_vector'].to_numpy())
but it throws this error:
TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.
CodePudding user response:
Suppose you have this DataFrame, in which each cell holds a vector of 2 numbers:
import torch
import pandas as pd
df = pd.DataFrame({'a': [[3, 29], [3, 29]],
                   'b': [[94, 170], [3, 29]],
                   'c': [[31, 115], [3, 29]]})
To convert this DataFrame to a PyTorch tensor, convert its values to a list and then build a tensor from that list:
t = torch.Tensor(list(df.values))
# output
tensor([[[  3.,  29.],
         [ 94., 170.],
         [ 31., 115.]],

        [[  3.,  29.],
         [  3.,  29.],
         [  3.,  29.]]])
The shape of t is [2, 3, 2]: 2 rows, 3 columns, and 2 elements inside each list.
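For the case in the question, a single column whose cells are NumPy vectors, the error most likely comes from the column being stored with dtype object, which torch.from_numpy does not accept. A minimal sketch of one way around that, assuming every vector in the column has the same length, is to stack the vectors into one numeric array first. The DataFrame below is made-up stand-in data, not the asker's g_list:

```python
import numpy as np
import pandas as pd
import torch

# Hypothetical stand-in data: 4 rows, each cell a 300-dim float32 vector.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {"sentence_vector": [rng.random(300, dtype=np.float32) for _ in range(4)]}
)

# The column is a 1-D array of Python objects, hence the TypeError.
print(df["sentence_vector"].to_numpy().dtype)

# Stack the per-row vectors into a single (n_rows, 300) numeric array.
# This assumes all vectors share the same length.
stacked = np.stack(df["sentence_vector"].tolist())

# from_numpy now works because the array has a numeric dtype.
t = torch.from_numpy(stacked)
print(t.shape, t.dtype)
```

Note that torch.from_numpy shares memory with the stacked array and keeps its dtype, whereas torch.Tensor(...) in the example above always produces float32.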