Here is the batched dataset I created earlier to fit the model:
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    train_path,
    label_mode='categorical',  # one-hot encoded labels for multiclass classification
    validation_split=0.2,      # fraction of the dataset held out for validation
    subset="training",         # this subset is used for training
    seed=1337,                 # fixed seed so the split is reproducible
    image_size=img_size,       # shape the input images are resized to
    batch_size=batch_size,     # should match the model's batch size
)
valid_ds = tf.keras.preprocessing.image_dataset_from_directory(
    train_path,
    label_mode='categorical',
    validation_split=0.2,
    subset="validation",       # this subset is used for validation
    seed=1337,
    image_size=img_size,
    batch_size=batch_size,
)
If I run a for loop, I can access the image arrays and labels:
for images, labels in train_ds:
    print(labels)
But if I try to access them like this:
ATTEMPT 1:
images, labels = train_ds
I get the following error: ValueError: too many values to unpack (expected 2)
ATTEMPT 2:
If I try to unpack it like this:
images = train_ds[:,0]  # get the 0th column of all rows
labels = train_ds[:,1]  # get the 1st column of all rows
I get the following error: TypeError: 'BatchDataset' object is not subscriptable
Is there a way for me to extract the labels and images without going through a for loop?
CodePudding user response:
For your specific case, train_ds is a tf.data.Dataset (a BatchDataset), each element of which is a tuple (images, labels). Datasets are not subscriptable, which is why your slicing attempt raises a TypeError, but you can pull a single batch without writing a full loop:
images, labels = next(iter(train_ds))  # first batch only
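To make that concrete, here is a minimal runnable sketch with a synthetic dataset standing in for train_ds (the image size of 32x32, 8 samples, and 3 classes are made-up values for illustration):

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in for train_ds: 8 RGB images with one-hot labels for 3 classes
images_np = np.random.rand(8, 32, 32, 3).astype("float32")
labels_np = tf.keras.utils.to_categorical(np.random.randint(0, 3, size=8), 3)

# Batch it the same way image_dataset_from_directory would
ds = tf.data.Dataset.from_tensor_slices((images_np, labels_np)).batch(4)

# Datasets are not subscriptable, but iterating yields (images, labels) tuples
images, labels = next(iter(ds))  # first batch only
print(images.shape)  # (4, 32, 32, 3)
print(labels.shape)  # (4, 3)
```

Note that this only gives you one batch; to gather every sample you still have to consume the whole dataset one way or another.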
CodePudding user response:
Just unbatch your dataset and convert the data to lists:
import tensorflow as tf
import pathlib

dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
data_dir = tf.keras.utils.get_file('flower_photos', origin=dataset_url, untar=True)
data_dir = pathlib.Path(data_dir)

batch_size = 32
train_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir, validation_split=0.2, subset="training",
    seed=123, batch_size=batch_size)

train_ds = train_ds.unbatch()
images = list(train_ds.map(lambda x, y: x))  # list of individual image tensors
labels = list(train_ds.map(lambda x, y: y))  # list of individual label tensors
print(len(labels))
print(len(images))
Found 3670 files belonging to 5 classes.
Using 2936 files for training.
2936
2936
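If you ultimately want plain NumPy arrays rather than Python lists of tensors, an alternative is to stack the batches with np.concatenate. A sketch under the same idea, using a small synthetic dataset in place of the flower_photos download (the sample count and image size here are made up):

```python
import numpy as np
import tensorflow as tf

# Small synthetic dataset in place of the flower_photos download
x = np.random.rand(10, 8, 8, 3).astype("float32")
y = np.arange(10)
ds = tf.data.Dataset.from_tensor_slices((x, y)).batch(4)

# Concatenate every batch into single arrays (loads the whole dataset into memory)
images = np.concatenate([imgs.numpy() for imgs, _ in ds], axis=0)
labels = np.concatenate([lbls.numpy() for _, lbls in ds], axis=0)
print(images.shape)  # (10, 8, 8, 3)
print(labels.shape)  # (10,)
```

This still iterates internally (via the comprehension), but it leaves you with two arrays you can index and slice the way you originally tried.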