I'm using the Keras ImageDataGenerator for data augmentation, specifically its flow_from_dataframe method. Documentation here: https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator#flow_from_dataframe
import pandas as pd

# Create new dataframes for train and test
df_train = pd.DataFrame()
df_train['image'], df_train['labels'] = X_train, y_train
df_test = pd.DataFrame()
df_test['image'], df_test['labels'] = X_test, y_test
This is what one dataframe looks like:
                                          image   labels
4227  /Users/m/Documents/Machine Learning Pr...  [73, 0]
4676  /Users/m/Documents/Machine Learning Pr...  [36, 0]
800   /Users/m/Documents/Machine Learning Pr...  [26, 0]
3671  /Users/m/Documents/Machine Learning Pr...  [42, 0]
This is how I've imported the data generator:
from keras.preprocessing.image import ImageDataGenerator

# Augmenting generator for the training data
datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

# Validation data is only rescaled, not augmented
test_datagen = ImageDataGenerator(rescale=1./255.)

train_generator = datagen.flow_from_dataframe(
    dataframe=df_train,
    x_col="image",
    y_col="labels",
    batch_size=32,
    seed=42,
    shuffle=True,
    class_mode='multi_output',
    target_size=(128, 128))

valid_generator = test_datagen.flow_from_dataframe(
    dataframe=df_test,
    x_col="image",
    y_col="labels",
    batch_size=32,
    seed=42,
    shuffle=True,
    class_mode='multi_output',
    target_size=(128, 128))
The function reads in a dataframe, but in the documentation it says the y_col specified must be a list:
y_col string or list, column/s in dataframe that has the target data.
Before I created the dataframe the column was a list, but now that it's a column in pandas it's no longer classed as a 'list', right? So why do I get this error message:
TypeError: If class_mode="multi_output", y_col must be a list. Received str.
I want to use the class mode multi_output as above. The error says y_col must be a list but that it received a string. Why does it say it's a string? Is there any way to change the 'type' of a column within a dataframe, or am I misunderstanding?
CodePudding user response:
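'List' here means a list of column names, not a column whose values are lists. You passed the single string "labels" as y_col, which is why the error says it received str.
As Zelemist has said, change your dataframe so that it has two label columns rather than the single one you have now.
Then pass a list of those column names to y_col, such as: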
y_col = ['col1', 'col2']
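One way to get from the single 'labels' column to two separate columns might look like the sketch below. It uses a small made-up dataframe in place of df_train, and the column names label_a and label_b are placeholders chosen for illustration, not names from the question. It also assumes every entry in 'labels' is a list of exactly two values.

```python
import pandas as pd

# Toy stand-in for df_train; the file names are invented for this example.
df_train = pd.DataFrame({
    "image": ["img_0.jpg", "img_1.jpg", "img_2.jpg"],
    "labels": [[73, 0], [36, 0], [26, 0]],
})

# Expand each two-element list into its own pair of columns.
# "label_a" and "label_b" are hypothetical names.
split = pd.DataFrame(
    df_train["labels"].tolist(),
    columns=["label_a", "label_b"],
    index=df_train.index,  # keep the original row index aligned
)
df_train = pd.concat([df_train.drop(columns="labels"), split], axis=1)

# This list of column names is what y_col expects with class_mode='multi_output'.
y_cols = ["label_a", "label_b"]
print(df_train)
```

With the dataframe in this shape, the generator call from the question should accept y_col=y_cols in place of y_col="labels".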
Hope it makes sense now.