Error while training Model with categorical and numerical dataset: Failed to convert a NumPy array to a Tensor


Currently I'm working on my final degree project, for which I have to train a neural network that predicts the class of an individual. The dataset is about accidents in Barcelona, so it has both categorical and numerical features. To train the neural network I have built a model that contains an embedding layer for every categorical column. However, when I try to fit my model, the following error appears:

      1 m.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
----> 2 m.fit(dd_normalized, dummy_y)

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type float).

I have researched the error, but nothing I found solves my issue. I'm a rookie with neural networks, so please have some patience. My code is the following:

import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras import layers
from tensorflow.keras.models import Model
from keras.utils import np_utils  # older standalone Keras; newer versions use tensorflow.keras.utils.to_categorical

dd = pd.read_csv("C:/Users/Hussnain Shafqat/Desktop/Uni/Q8/TFG/Bases de dades/Modified/2021_Accidents_Final.csv")
dd_features = dd.copy()

Y = dd_features.pop('TipoAcc') #my target variable

# Normalization of Numerical variable
dd_normalized = dd_features.copy()
normalize_var_names = ["Long", "Lat", "NLesLeves", "NLesGraves", "NVictimas", "NVehiculos", "ACarne"] 
for name, column in dd_features.items():
    if name in normalize_var_names:
        print(f"Normalizando {name}")
        dd_normalized[name] = (dd_features[name] - dd_features[name].min()) / (dd_features[name].max() - dd_features[name].min())

dd_normalized = dd_normalized.replace({'VictMortales': {'Si': 1, 'No': 0}})  

#Neural network model creation
def get_model(df):
    names = df.columns
    inputs = []
    outputs = []
    for col in names:
        if col in normalize_var_names:
            # numerical features are already normalized: pass them straight through
            inp = layers.Input(shape=(1,), name=col)
            inputs.append(inp)
            outputs.append(inp)
        else:
            # categorical features: one embedding per column
            num_unique_vals = int(df[col].nunique())
            embedding_size = int(min(np.ceil(num_unique_vals / 2), 600))
            inp = layers.Input(shape=(1,), name=col)
            out = layers.Embedding(num_unique_vals + 1, embedding_size, name=col + "_emb")(inp)
            out = layers.Reshape(target_shape=(embedding_size,))(out)
            inputs.append(inp)
            outputs.append(out)
    x = layers.Concatenate()(outputs)
    x = layers.Flatten()(x)
    x = layers.Dense(64, activation='relu')(x)
    y = layers.Dense(15, activation='softmax')(x)
    model = Model(inputs=inputs, outputs=y)
    return model

m = get_model(dd_normalized)

#I convert the target variable using one hot encoding
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
dummy_y = np_utils.to_categorical(encoded_Y)
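
# Sanity check I added while debugging: dummy_y should have shape
# (n_samples, n_classes), and n_classes should match the 15 units
# of the softmax output layer.
print(dummy_y.shape)
print(dummy_y[0])  # one-hot row, e.g. [0. 0. 1. 0. ...]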

#Model training
m.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
m.fit(dd_normalized, dummy_y)

I have tried to convert my dataset into a tensor using tf.convert_to_tensor, but the same error appears. After some research, I found out that the same error appears whenever I try to convert a mix of categorical and numerical columns to a tensor; if I apply the function to just the categorical columns or just the numerical ones, it works fine. I know that I can't feed categorical data directly to a neural network, but I thought the embedding layers would be enough to solve that.
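
Concretely, here is a minimal sketch of what I mean, with toy data instead of my real dataset:

import numpy as np
import tensorflow as tf

# Each dtype on its own converts fine:
tf.convert_to_tensor(np.array([[0.5], [0.3]], dtype="float32"))  # OK, float tensor
tf.convert_to_tensor(np.array([["Si"], ["No"]]))                 # OK, string tensor

# But a mixed object-dtype array, which is what a DataFrame with both
# categorical and numerical columns produces via .to_numpy(), reproduces
# my exact error:
mixed = np.array([[0.5, "Si"], [0.3, "No"]], dtype=object)
tf.convert_to_tensor(mixed)  # ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type float).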

Finally, I want to say that I have also tried this Sample.
