I have an LSTM model trained on text content, and I want to use it to generate sentences. Instead of always picking the most likely next word, I want it to choose from, for example, the top 3 candidates, so that it can produce different sentences from the same input. Right now I get the same answer for almost every input. How do I modify this code to make that possible? I know I need to remove the np.argmax call, but I don't know how to return the indices of the top 3 highest values.
Current code:
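# tokenizer, model, max_seq_length and pad_sequences are defined elsewhere
def prediction(seed_text, next_words):
    for _ in range(next_words):
        # Encode the current text and left-pad it to the model's input length
        token_list = tokenizer.texts_to_sequences([seed_text])[0]
        token_list = pad_sequences([token_list], maxlen=max_seq_length-1, padding='pre')
        # Always picks the single most likely next word
        predicted = np.argmax(model.predict(token_list, verbose=0), axis=-1)
        output_word = ""
        # Look up the word that belongs to the predicted index
        for word, index in tokenizer.word_index.items():
            if index == predicted:
                output_word = word
                break
        seed_text += ' ' + output_word
    return seed_text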
Answer:
np.argsort will give you the indices that would sort an array from smallest to largest: https://numpy.org/doc/stable/reference/generated/numpy.argsort.html
Here's an example using argsort. Note that the word with the lowest prediction (index 2, "c", with a predicted value of 0.05) is left out of what is printed. The words are printed in word_index order, not in order of probability.
import numpy as np

word_index = {'a': 0, 'b': 1, 'c': 2, 'd': 3}
predictions = np.array([0.1, 0.7, 0.05, 0.15])

# negate to sort large to small; slice to keep only the first 3 indices
top_3 = np.argsort(-predictions)[:3]

for word, index in word_index.items():
    if index in top_3:
        print(word)
#> a
#> b
#> d
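
To get different sentences from the same input, the top 3 indices still need to be narrowed down to one at random. The following is a minimal sketch of one way to do that, not the only one: `sample_top_k` is a hypothetical helper name, and weighting the draw by the renormalized probabilities is an assumption (a uniform choice among the top 3 would also work). It reuses the toy data from the example above rather than real model output.

```python
import numpy as np

def sample_top_k(probs, k=3, rng=None):
    """Return one index drawn at random from the k highest-probability entries."""
    rng = np.random.default_rng() if rng is None else rng
    # Flatten so a (1, vocab_size) prediction array also works
    probs = np.asarray(probs, dtype=float).ravel()
    # argsort is ascending, so the last k entries are the k largest
    top_k = np.argsort(probs)[-k:]
    # Renormalize so the k candidate probabilities sum to 1
    weights = probs[top_k] / probs[top_k].sum()
    return int(rng.choice(top_k, p=weights))

# Toy data, not real model output
word_index = {'a': 0, 'b': 1, 'c': 2, 'd': 3}
# Reverse mapping avoids looping over word_index for every lookup
index_word = {index: word for word, index in word_index.items()}
predictions = np.array([0.1, 0.7, 0.05, 0.15])

rng = np.random.default_rng(0)  # seeded only to make the demo repeatable
samples = [index_word[sample_top_k(predictions, k=3, rng=rng)] for _ in range(10)]
print(samples)  # a mix of 'a', 'b' and 'd'; never 'c'
```

In the question's function, the np.argmax line could then become something like predicted = sample_top_k(model.predict(token_list, verbose=0), k=3), assuming the model outputs one probability per vocabulary index. The Keras tokenizer also exposes an index_word dictionary, which could replace the inner lookup loop.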