I want to apply tf.data transformations to a pandas DataFrame. According to the TensorFlow docs, I can apply tf.data to a DataFrame directly, but the dtype of the DataFrame should be uniform.
When I apply tf.data to my DataFrame like this:
tf.data.Dataset.from_tensor_slices(df['reports'])
it raises this error:
ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type float).
When I print df['reports'].dtype, it is dtype('O'), which does not seem to be uniform. If that is the case, how can I convert this column to a uniform dtype?
CodePudding user response:
You can try forcing df["reports"] to a specific type. Assuming you want to convert this column to numbers, you can do it like this:
df['reports'] = pd.to_numeric(df['reports'])
In any case, I suggest investigating why the column has the non-uniform dtype('O'). There may be a mistake in your data.
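One way to investigate, as a rough sketch: count the Python types actually stored in the column, then coerce with errors="coerce" so that values that cannot be parsed become NaN instead of raising. The sample data below is made up for illustration and is not the asker's data.

```python
import pandas as pd

# Hypothetical sample data: a float, a numeric string, a bad string, and None.
df = pd.DataFrame({"reports": [1.0, "2.5", "n/a", None]})

# Count the Python types stored in the object column.
type_counts = df["reports"].map(lambda v: type(v).__name__).value_counts().to_dict()
print(type_counts)

# Coerce to numbers; values that cannot be parsed become NaN instead of raising.
numeric = pd.to_numeric(df["reports"], errors="coerce")
print(numeric.dtype)
print(int(numeric.isna().sum()))
```

If the type counts show list or ndarray entries rather than strings, pd.to_numeric will not help, and the column probably holds one sequence per row.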
CodePudding user response:
Try using a ragged structure:
import tensorflow as tf
import pandas as pd
df = pd.DataFrame(data={'reports': [[2.0, 3.0, 4.0], [2.0, 3.0], [2.0]]})
dataset = tf.data.Dataset.from_tensor_slices(tf.ragged.constant(df['reports']))
for x in dataset:
    print(x)
tf.Tensor([2. 3. 4.], shape=(3,), dtype=float32)
tf.Tensor([2. 3.], shape=(2,), dtype=float32)
tf.Tensor([2.], shape=(1,), dtype=float32)
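A ragged structure is only needed when the rows differ in length. As a minimal sketch, assuming each row of the column is a list of numbers, the check below stacks the rows into one uniform float32 array when they all have the same length and returns None otherwise. The helper name to_dense_or_none and the sample frames are hypothetical.

```python
import numpy as np
import pandas as pd

def to_dense_or_none(column):
    """Hypothetical helper: 2-D float32 array if all rows have equal length, else None."""
    lengths = {len(row) for row in column}
    if len(lengths) != 1:
        return None  # rows differ in length (or column is empty): use a ragged structure
    return np.stack([np.asarray(row, dtype=np.float32) for row in column])

uniform = pd.DataFrame({"reports": [[2.0, 3.0], [4.0, 5.0]]})
ragged = pd.DataFrame({"reports": [[2.0, 3.0, 4.0], [2.0, 3.0], [2.0]]})

dense = to_dense_or_none(uniform["reports"])
print(dense.shape, dense.dtype)
print(to_dense_or_none(ragged["reports"]))
```

When the helper returns an array, that array has a single numeric dtype and should be usable with tf.data.Dataset.from_tensor_slices directly. When it returns None, the ragged approach above applies.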