How to remove a single feature from a TensorFlow dataset, and how to apply a function to a single feature?

I created a dataset from a CSV file with the tf.data.experimental.make_csv_dataset() function. My dataset has both categorical and numeric features.

dataset=
color  price weight
red    120    1.2
blue    80     2.0
green   90     3

Question 1: How can I modify only a single feature, for example weight + 2, to get:

dataset=
color  price weight
red    120    3.2
blue    80     4.0
green   90     5

I tried something like:

dataset = dataset.apply(lambda x: x['weight'] + 2)

but the error is: "TypeError: 'FilterDataset' object is not subscriptable"

The example in the documentation at https://www.tensorflow.org/api_docs/python/tf/data/Dataset#apply doesn't cover this case.

Question 2: How can I remove a single feature? Is there an equivalent of dropping a column in pandas?

Answer:

You can remove a feature by mapping each element to only the features you want to keep. This is how you can modify only one feature:

import tensorflow as tf
import pandas as pd

# Write the sample data to a CSV file in the working directory.
df = pd.DataFrame(data={'color': ['red', 'blue', 'green'], 'price': [120, 80, 90], 'weight': [3.2, 4.0, 5]})
df.to_csv('data.csv', index=False)

# Each element is a dict that maps a column name to a batched tensor.
dataset = tf.data.experimental.make_csv_dataset('data.csv', batch_size=1, num_epochs=1, shuffle=False)
# Add 2 to 'weight' only; 'color' and 'price' pass through unchanged.
dataset = dataset.map(lambda x: (x['color'], x['price'], x['weight'] + 2))

for x in dataset:
  print(x[0], x[1], x[2])

tf.Tensor([b'red'], shape=(1,), dtype=string) tf.Tensor([120], shape=(1,), dtype=int32) tf.Tensor([5.2], shape=(1,), dtype=float32)
tf.Tensor([b'blue'], shape=(1,), dtype=string) tf.Tensor([80], shape=(1,), dtype=int32) tf.Tensor([6.], shape=(1,), dtype=float32)
tf.Tensor([b'green'], shape=(1,), dtype=string) tf.Tensor([90], shape=(1,), dtype=int32) tf.Tensor([7.], shape=(1,), dtype=float32)
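
For Question 2, and to keep the feature names instead of returning a tuple, the same map approach can return a dict. Dataset.apply hands the whole dataset object to the function, which is why indexing it by column name fails; Dataset.map is called once per element. The following is a minimal sketch, assuming the data from the question; the helper names add_to_weight and drop_price and the temporary file location are made up for illustration:

```python
import os
import tempfile

import pandas as pd
import tensorflow as tf

# Recreate the question's example data in a temporary CSV file (assumed location).
df = pd.DataFrame({'color': ['red', 'blue', 'green'],
                   'price': [120, 80, 90],
                   'weight': [1.2, 2.0, 3.0]})
csv_path = os.path.join(tempfile.mkdtemp(), 'data.csv')
df.to_csv(csv_path, index=False)

dataset = tf.data.experimental.make_csv_dataset(
    csv_path, batch_size=1, num_epochs=1, shuffle=False)


def add_to_weight(features):
    # Hypothetical helper: copy the feature dict, then overwrite a single key.
    features = dict(features)
    features['weight'] = features['weight'] + 2
    return features


def drop_price(features):
    # Hypothetical helper: keep every feature except 'price',
    # roughly what dropping a column does in pandas.
    return {name: value for name, value in features.items() if name != 'price'}


modified = dataset.map(add_to_weight)
dropped = modified.map(drop_price)

for row in dropped:
    print({name: value.numpy() for name, value in row.items()})
```

If a column is never needed at all, the select_columns argument of make_csv_dataset can skip it while reading, so no later map call is required.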