I have a deep reinforcement learning agent that interacts with a custom environment, and I am logging the reward value every episode using TensorBoard. The curve looks like this:
For some reason, the step jumps from 17 to 80 every time, and I cannot understand why. I don't even know which part of the code I should paste here.
Does anyone have any idea why it does that?
Answer:
It turns out the step number was being incremented elsewhere in the code. I commented out that line and it works fine now.
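For anyone hitting the same symptom, here is a minimal sketch of one way to keep it from happening: take the step from the episode loop itself rather than from a shared counter that other code can modify. The original code was not posted, so everything here is an assumption. `RecordingWriter` and `log_episode_rewards` are hypothetical names, and the stand-in writer only mimics the `add_scalar(tag, scalar_value, global_step)` call of PyTorch's `SummaryWriter` so the example runs without TensorBoard installed.

```python
class RecordingWriter:
    """Hypothetical stand-in for a TensorBoard writer; it only records calls."""

    def __init__(self):
        self.points = []

    def add_scalar(self, tag, scalar_value, global_step=None):
        # Store each logged point so the steps can be inspected afterwards.
        self.points.append((tag, scalar_value, global_step))


def log_episode_rewards(writer, episode_rewards):
    # enumerate() produces the step value, so no other part of the
    # program can advance it between two logging calls.
    for episode, reward in enumerate(episode_rewards):
        writer.add_scalar("reward/episode", reward, global_step=episode)


writer = RecordingWriter()
log_episode_rewards(writer, [1.0, 0.5, 2.0])

# Collect the logged steps to check that they are consecutive.
steps = [step for _, _, step in writer.points]
print(steps)
```

If the logged steps are not consecutive, searching the code for every place that writes to the step variable is usually the quickest way to find the extra increment.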