How to remove a single feature from a TensorFlow dataset, and how to use apply on a single feature?

Time:02-18

I created a dataset from a CSV file with the tf.data.experimental.make_csv_dataset() function. My dataset has both categorical and numeric features.

dataset=
color  price weight
red    120    1.2
blue    80     2.0
green   90     3

Question 1: How can I modify only a single feature, for example weight + 2, to get:

dataset=
color  price weight
red    120    3.2
blue    80     4.0
green   90     5

I tried something like:

dataset = dataset.apply(lambda x: x['weight'] + 2)

but the error is: "TypeError: 'FilterDataset' object is not subscriptable"

The example in the documentation at https://www.tensorflow.org/api_docs/python/tf/data/Dataset#apply doesn't show how to do this.

Question 2: How can I remove a single feature? Is there an equivalent to dropping a column in pandas?

CodePudding user response:

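You can remove a feature by mapping each element to only the features you want to keep. To change a single feature, use map rather than apply: apply passes the whole dataset object to your function, while map passes each element, which is the dictionary of features you can index by name. This is how you can modify only one feature: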

import tensorflow as tf
import pandas as pd

# Write a small example CSV to the current working directory.
df = pd.DataFrame(data={'color': ['red', 'blue','green'], 'price': [120, 80, 90], 'weight': [3.2, 4.0, 5]})
df.to_csv('data.csv', index=False)

# Read the same file back; each element is a dict of feature name -> batched tensor.
dataset = tf.data.experimental.make_csv_dataset('data.csv', batch_size=1, num_epochs=1, shuffle=False)
# Keep color and price unchanged and add 2 to weight only.
dataset = dataset.map(lambda x: (x['color'], x['price'], x['weight'] + 2))

for x in dataset:
  print(x[0], x[1], x[2])
tf.Tensor([b'red'], shape=(1,), dtype=string) tf.Tensor([120], shape=(1,), dtype=int32) tf.Tensor([5.2], shape=(1,), dtype=float32)
tf.Tensor([b'blue'], shape=(1,), dtype=string) tf.Tensor([80], shape=(1,), dtype=int32) tf.Tensor([6.], shape=(1,), dtype=float32)
tf.Tensor([b'green'], shape=(1,), dtype=string) tf.Tensor([90], shape=(1,), dtype=int32) tf.Tensor([7.], shape=(1,), dtype=float32)
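
For Question 2, and for keeping the feature names instead of converting each element to a tuple, a minimal sketch is shown here. It is an illustration, not code from the answer above: it assumes the elements are dictionaries keyed by feature name (which is what make_csv_dataset yields), it builds the data in memory with tf.data.Dataset.from_tensor_slices so no CSV file is needed, and the helper names add_to_weight and drop_price are hypothetical. The last step shows one way apply can be used, since apply expects a function that takes and returns a whole dataset.

```python
import tensorflow as tf

# Hypothetical in-memory stand-in for the CSV data from the question.
features = {
    'color': ['red', 'blue', 'green'],
    'price': [120, 80, 90],
    'weight': [1.2, 2.0, 3.0],
}
dataset = tf.data.Dataset.from_tensor_slices(features).batch(1)

def add_to_weight(x):
    # Copy the feature dict and replace only 'weight'; other features pass through.
    x = dict(x)
    x['weight'] = x['weight'] + 2
    return x

def drop_price(x):
    # Keep every feature except 'price', similar in spirit to dropping a pandas column.
    return {name: value for name, value in x.items() if name != 'price'}

# map calls the function once per element (here, once per batch of size 1).
modified = dataset.map(add_to_weight)

# apply calls the function once, with the whole dataset as its argument.
reduced = modified.apply(lambda ds: ds.map(drop_price))

for x in reduced:
    print(x)
```

Each printed element should be a dictionary containing only color and weight. Returning a dictionary from map keeps the features addressable by name in later steps, which the tuple form does not.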