I am building a BiLSTM with Keras for the first time and I am having difficulties. For context, here are the steps I have taken:
- I created an embedding matrix with GloVe for my x:
import numpy as np

def create_embeddings(fichier, dictionnaire, dictionnaire_tokens):
    # the file is opened here, but the vectors are actually taken from dictionnaire
    with open(fichier) as file:
        line = file.readline()
    max_words = max(dictionnaire_tokens.values()) + 1  # 1032
    max_size_dimensions = 300
    emb_matrix = np.zeros((max_words, max_size_dimensions))
    for item, count in dictionnaire_tokens.items():
        # .get returns None for unknown words, so their rows stay at zero
        vecteur = dictionnaire.get(item)
        if vecteur is not None:
            emb_matrix[count] = vecteur
    return emb_matrix
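A minimal sketch of the same lookup on toy data, to make the resulting matrix easier to picture. The names `build_embedding_matrix`, `toy_vectors` and `toy_index`, and the 4-dimensional vectors, are made up for illustration and are not taken from the real GloVe file. Words without a pretrained vector keep an all-zero row.

```python
import numpy as np

def build_embedding_matrix(vectors, token_index, dim):
    """Return a (max_index + 1, dim) matrix; row i holds the vector of the token with index i."""
    matrix = np.zeros((max(token_index.values()) + 1, dim))
    for token, idx in token_index.items():
        vec = vectors.get(token)  # None when the token has no pretrained vector
        if vec is not None:
            matrix[idx] = vec
    return matrix

# Hypothetical toy data: index 0 is left unused (for example, for padding).
toy_vectors = {"cat": [0.1, 0.2, 0.3, 0.4], "dog": [0.5, 0.6, 0.7, 0.8]}
toy_index = {"cat": 1, "dog": 2, "unicorn": 3}  # "unicorn" has no vector

toy_matrix = build_embedding_matrix(toy_vectors, toy_index, dim=4)
print(toy_matrix.shape)
```

The number of rows is the vocabulary size and the number of columns is the vector dimension, so neither depends on the number of labels.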
- I one-hot encoded my y labels:
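from sklearn.preprocessing import MultiLabelBinarizer

def one_hot_encoding(path):
    liste = []
    with open(path) as file:
        line = file.readline()
        while line:
            # the tag is the second space-separated field of each line
            tag = line.split(" ")[1]
            liste.append([tag])
            line = file.readline()
    one_hot = MultiLabelBinarizer()
    # one row per line of the file, one column per distinct tag
    array = one_hot.fit_transform(liste)
    return array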
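To illustrate the shape this encoding produces, here is a small NumPy-only sketch. It swaps `MultiLabelBinarizer` for a plain index lookup, and the function name `one_hot_tags` and the tag names are hypothetical examples rather than the real tag set.

```python
import numpy as np

def one_hot_tags(tags):
    """Return (one-hot array, sorted class list) for a flat list of tag strings."""
    classes = sorted(set(tags))
    index = {tag: i for i, tag in enumerate(classes)}
    # Pick one row of the identity matrix per tag.
    encoded = np.eye(len(classes))[[index[tag] for tag in tags]]
    return encoded, classes

# Hypothetical tags, one per token.
toy_tags = ["O", "B-PER", "I-PER", "O", "B-LOC"]
y, classes = one_hot_tags(toy_tags)
print(classes)
print(y.shape)
```

With 7 distinct tags the array would have 7 columns and one row per line of the input file, which is the `[?,7]` shape that appears for the labels in the error message.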
- I built and compiled my model with Keras:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Bidirectional, Dense, Embedding, LSTM

model = Sequential()
embedding_layer = Embedding(input_dim=1031 + 1,
                            output_dim=300,
                            weights=[embedding_matrix],
                            trainable=False)
model.add(embedding_layer)
bilstm_layer = Bidirectional(LSTM(units=300, return_sequences=True))
model.add(bilstm_layer)
model.add(Dense(300, activation="relu"))
#crf_layer = CRF(units=len(self.tags), sparse_target=True)
#model.add(crf_layer)
model.compile(optimizer="adam", loss='binary_crossentropy', metrics=['acc'])
model.summary()
The embedding matrix passed to my embedding layer:
[[ 0. 0. 0. ... 0. 0. 0. ]
[ 0. 0. 0. ... 0. 0. 0. ]
[ 0. 0. 0. ... 0. 0. 0. ]
...
[-0.068577 -0.71314 0.3898 ... -0.077923 -1.0469 0.56874 ]
[ 0.32461 0.50463 0.72544 ... 0.17634 -0.28961 0.29007 ]
[-0.33771 -0.24912 -0.032685 ... -0.033254 -0.45513 -0.13319 ]]
- I train my model. However, when I try to train it, I get the following error:
ValueError: Dimensions must be equal, but are 7 and 300 for '{{node binary_crossentropy/mul}} = Mul[T=DT_FLOAT](binary_crossentropy/Cast, binary_crossentropy/Log)' with input shapes: [?,7], [?,300,300].
My embedding matrix was built with GloVe 300d, so it has 300 dimensions, whereas I have only 7 labels. So I need to make my x and y have matching dimensions, but how? Thank you!
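The two shapes in the error can be traced layer by layer. The sketch below uses small hypothetical helper functions (they are not Keras APIs) that only mimic how each layer changes the shape. The input shape `(None, 300)` is an assumption inferred from the `[?,300,300]` in the error message.

```python
def embedding_shape(input_shape, output_dim):
    # (batch, seq_len) -> (batch, seq_len, output_dim)
    return input_shape + (output_dim,)

def bilstm_shape(input_shape, units, return_sequences):
    # Bidirectional concatenates both directions by default: 2 * units features.
    batch, seq_len, _ = input_shape
    if return_sequences:
        return (batch, seq_len, 2 * units)
    return (batch, 2 * units)

def dense_shape(input_shape, units):
    # Dense only changes the last axis.
    return input_shape[:-1] + (units,)

x_shape = (None, 300)  # assumed: 300 values per sample

# Model from the question: return_sequences=True, Dense(300)
pred = dense_shape(bilstm_shape(embedding_shape(x_shape, 300), 300, True), 300)
print(pred)

# Same stack with return_sequences=False and Dense(7)
fixed = dense_shape(bilstm_shape(embedding_shape(x_shape, 300), 300, False), 7)
print(fixed)
```

The mismatch therefore comes from the last layers rather than from the embedding matrix: the model emits 300 values for each of 300 positions, while the labels hold 7 values per sample. The last `Dense` layer has to end in 7 units, and the sequence axis has to either be dropped or be present in the labels as well.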
Answer:
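from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Bidirectional, Dense, LSTM

keras.backend.clear_session()
model = Sequential()
# each sample has 300 time steps with 1 feature; the batch dimension stays None
_input = keras.layers.Input(shape=(300, 1))
model.add(_input)
# return_sequences=False keeps only the final state: one vector per sample
bilstm_layer = Bidirectional(LSTM(units=300, return_sequences=False))
model.add(bilstm_layer)
# 7 is the number of classes you have; sigmoid keeps the outputs in [0, 1],
# which is what binary_crossentropy expects
model.add(Dense(7, activation="sigmoid"))
#crf_layer = CRF(units=len(self.tags), sparse_target=True)
#model.add(crf_layer)
model.compile(optimizer="adam", loss='binary_crossentropy', metrics=['acc'])
model.summary()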
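If the frozen GloVe embeddings should stay in the model, one possible variant is sketched below. It is not the method of the answer above, and it rests on several assumptions: x holds integer word indices padded to a fixed length, there is one 7-class one-hot label per sample, and the sizes are toy values with a random matrix standing in for GloVe. It also swaps in `softmax` with `categorical_crossentropy`, since each one-hot row has exactly one active class.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Toy sizes so the sketch runs quickly (the question uses 1032 words and 300 dimensions).
vocab_size, emb_dim, seq_len, n_classes = 50, 8, 12, 7

# Random stand-in for the GloVe matrix, purely illustrative.
embedding_matrix = np.random.rand(vocab_size, emb_dim)

model = keras.Sequential([
    keras.Input(shape=(seq_len,)),  # integer word indices, not vectors
    layers.Embedding(
        input_dim=vocab_size,
        output_dim=emb_dim,
        embeddings_initializer=keras.initializers.Constant(embedding_matrix),
        trainable=False,  # keep the pretrained vectors frozen
    ),
    # return_sequences=False yields one vector per sample
    layers.Bidirectional(layers.LSTM(16, return_sequences=False)),
    layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["acc"])

# Random stand-in data with the shapes the model expects.
x = np.random.randint(0, vocab_size, size=(32, seq_len))
y = keras.utils.to_categorical(np.random.randint(0, n_classes, size=32), n_classes)
model.fit(x, y, epochs=1, verbose=0)
print(model.output_shape)
```

Here the model output and the labels both have 7 values per sample, so the loss can be computed. For per-token tagging, the labels would instead need a sequence axis and the LSTM would keep `return_sequences=True`.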