I am trying to replicate a neural network project. It used dummy encoding to deal with the 3 categorical (nominal) features its data has. It illustrated the expected result for 1 of the 3 features, business type with 4 levels. The result showed a 3-dimensional feature vector.
I assumed I would have to do the same for the other 2 features (one with 90 levels, the other with 53), but this line confused me:
"In our case, dummy coding provides for the three unordered categorical feature components a 3 + 52 + 89 = 144-dimensional feature vector."
I am trying to understand how to turn categorical features into appropriate neural network inputs. The idea of a single 144-dimensional feature vector confuses me, because I pictured each feature having its own separate dummy code and its own input neuron in the model.
I might be missing something or just misunderstanding the process. I'd appreciate any clarification! I am working with RStudio and would also appreciate any ideas on how to implement dummy encoding for this type of task.
CodePudding user response:
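You have 3 categorical variables, and each gets its own separate dummy encoding (a one-hot encoding with one reference level dropped, so a feature with k levels becomes k - 1 columns). Afterwards you concatenate the three encodings into a single vector of length 3 + 52 + 89 = 144, and each of those 144 components is one input to the network. Note that this vector has at most 3 ones (one per feature) and at least 141 zeroes; a feature sitting at its reference level contributes only zeroes.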
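To make the concatenation concrete, here is a minimal sketch in plain Python. The level names (`b0`, `r0`, `s0`, ...) and the feature names `feature_53` and `feature_90` are made up for illustration; only the level counts (4, 53 and 90) come from the question. It also assumes the first level of each feature is the reference level, which may not match the project you are replicating.

```python
def dummy_encode(value, levels):
    """Dummy-code one categorical value: k levels -> k - 1 indicators.

    The first level is treated as the reference and maps to all zeros.
    """
    vec = [0] * (len(levels) - 1)
    idx = levels.index(value)
    if idx > 0:
        vec[idx - 1] = 1
    return vec


def encode_row(row, levels_by_feature):
    """Concatenate the dummy codes of every feature into one input vector."""
    out = []
    for name, levels in levels_by_feature.items():
        out.extend(dummy_encode(row[name], levels))
    return out


# Hypothetical level names; only the counts 4, 53 and 90 are from the question.
levels_by_feature = {
    "business_type": [f"b{i}" for i in range(4)],
    "feature_53": [f"r{i}" for i in range(53)],
    "feature_90": [f"s{i}" for i in range(90)],
}

row = {"business_type": "b2", "feature_53": "r0", "feature_90": "s89"}
x = encode_row(row, levels_by_feature)

# 3 + 52 + 89 = 144 inputs; "r0" is a reference level, so only 2 ones here.
print(len(x), sum(x))  # 144 2
```

In R, `model.matrix(~ ., data = df)` on a data frame of factors should give the same layout under the default treatment contrasts, plus an intercept column that you would normally drop before feeding the matrix to the network.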