The structure is similar to the one in the image. The loss function is computed using only the output of the second group of LSTMs.
The difficulty is this: if I run the first group of BiLSTMs first, save the final result, and then feed it into the second group of BiLSTMs, it seems that only the second group is trained. The parameters of the first group of BiLSTMs do not change.
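
One possible cause is that saving the first group's result turns it into plain data, which cuts the computation graph between the two groups. The post does not name a framework, so the following is only a minimal sketch in PyTorch (an assumption), with hypothetical layer sizes, a hypothetical linear `head`, and an MSE loss picked purely for illustration. It compares the gradient reaching the first group when the output is passed on directly with the gradient when the output is detached first.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
batch, seq_len, in_dim, hidden = 4, 6, 8, 16

# First and second BiLSTM groups; a bidirectional LSTM outputs 2 * hidden features.
bilstm1 = nn.LSTM(in_dim, hidden, batch_first=True, bidirectional=True)
bilstm2 = nn.LSTM(2 * hidden, hidden, batch_first=True, bidirectional=True)
head = nn.Linear(2 * hidden, 1)  # hypothetical output layer

x = torch.randn(batch, seq_len, in_dim)
target = torch.randn(batch, 1)


def first_group_grad_norm(detach_between: bool) -> float:
    """Backpropagate a loss on the second group's output only and
    return the summed gradient norm of the first group's parameters."""
    for module in (bilstm1, bilstm2, head):
        module.zero_grad()
    out1, _ = bilstm1(x)
    if detach_between:
        # Mimics "saving the result" as plain data: the graph is cut here.
        out1 = out1.detach()
    out2, _ = bilstm2(out1)
    loss = nn.functional.mse_loss(head(out2[:, -1, :]), target)
    loss.backward()
    return float(sum(p.grad.norm().item()
                     for p in bilstm1.parameters() if p.grad is not None))


connected = first_group_grad_norm(detach_between=False)
cut = first_group_grad_norm(detach_between=True)
print(f"first-group gradient norm, connected graph: {connected:.6f}")
print(f"first-group gradient norm, detached output: {cut:.6f}")
```

In this sketch the first group receives a nonzero gradient only when its output tensor is fed straight into the second group. If the graph is already connected and the first group still does not update, another thing worth checking is whether the parameters of both groups were passed to the optimizer.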