I am training an agent with reinforcement learning. Everything is going well, but the task is to find the number of the game in which the agent first won five games in a row.
The algorithm is a loop that plays 10,000 games. In each game the agent walks across a frozen lake, with at most 100 steps per game. If the agent crosses the lake successfully, the game counts as a win. Over the 10,000 games (iterations) I got 7914 wins, which is the correct answer.
The next question is: complete the following code so that, after training the model, you can find out the number of wins and the number of the game in which the agent won its fifth game in a row for the first time.
Here is my code:
for game in tqdm(range(total_games)):
    # BODY OF LOOP
    success += reward  # COUNTS WINNING GAMES
I need a simple algorithm that detects the first run of five wins in a row and stores the number of that game in a variable. Something like this, but it is of course wrong:
    if success == 5:
        game5Success = game
CodePudding user response:
Since I don't have more information, I created my own small example. You should get the idea.
Create a separate counter that also adds 1 when the game is won. If the game isn't won, the counter resets to 0.
# [random.choice([1,0]) for i in range(25)]
# resulted in this:
lst = [0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1]
counter = 0  # current run of consecutive wins
success = 0  # total number of wins
only_first = True
for idx, num in enumerate(lst):
    if num == 1:  # equivalent to winning this game
        success += 1  # or += reward, but I guess it is just plus 1
        counter += 1
    else:
        counter = 0  # reset the counter if the game isn't a win
    if counter == 3 and only_first:  # check for the number of consecutive wins you want and whether it is the first time
        print(f"First 3 successively won games at game No: {idx}")
        only_first = False
Output:
First 3 successively won games at game No: 6
CodePudding user response:
I assume reward is 0 if the game was lost and 1 if it was won. If so, then you'll need to keep two separate counters: the total number of wins and the current run of consecutive wins. Something like:
wins_in_a_row = 0  # initialize before the loop
for game in tqdm(range(total_games)):
    # BODY OF LOOP
    success += reward  # COUNTS WINNING GAMES
    wins_in_a_row = wins_in_a_row + reward if reward else 0  # COUNTS WINS IN A ROW
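And then you can get your fifth game in much the same way, inside the loop. Initialize game5Success = None before the loop and check it, so that a later streak doesn't overwrite the first one:
    if wins_in_a_row == 5 and game5Success is None:
        game5Success = game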
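Putting both answers together, here is a minimal self-contained sketch of the idea. It assumes each game yields a reward of 1 for a win and 0 for a loss; the function name `track_wins` and the parameter `streak_len` are made up for illustration and are not part of the original code. The rewards list is the sample data from the first answer, standing in for the real training loop.

```python
def track_wins(rewards, streak_len=5):
    """Return (total_wins, first_game).

    first_game is the 0-based index of the game that completed the first
    run of `streak_len` consecutive wins, or None if no such run occurred.
    """
    total_wins = 0
    wins_in_a_row = 0
    first_game = None
    for game, reward in enumerate(rewards):
        if reward:  # a win (reward 1 or 1.0)
            total_wins += 1
            wins_in_a_row += 1
        else:  # a loss breaks the streak
            wins_in_a_row = 0
        # Record only the first time the streak reaches the target length.
        if wins_in_a_row == streak_len and first_game is None:
            first_game = game
    return total_wins, first_game


# Sample data from the first answer (hypothetical stand-in for real games).
rewards = [0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0,
           1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1]

print(track_wins(rewards, streak_len=3))  # (13, 6)
print(track_wins(rewards, streak_len=5))  # (13, None)
```

Note that `game` counts from 0 here, just like `range(total_games)` in the question. If the expected answer numbers games from 1, add 1 to the result.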