I have a question regarding the second round:
Will the second round be evaluated using the same agents that we submitted for the first round, or are we allowed to refine our models for the sim2real transfer?
Best regards, Nils
I found another odd behavior of the env.reset() function: according to the OpenAI Gym specification, env.reset() should return the initial observation of a new episode. However, if we call env.reset() after an episode has ended, it returns the last observation of the previous episode instead of the initial observation of the next one. Furthermore, the data format of the observations returned by env.reset() differs from that of the observations returned by env.step(action). Is anyone else seeing the same problems, or is there a misunderstanding on my side? Thanks
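In case it helps anyone reproduce this, here is a minimal sketch of how the two symptoms could be checked. It assumes the classic Gym interface where env.step(action) returns (observation, reward, done, info). The names describe, to_plain, check_reset_contract and StaleResetEnv are made up for illustration; StaleResetEnv is only a stand-in that imitates the behavior described above, not the competition environment. Note that an initial observation can legitimately equal the last one in some environments, so a match is a hint rather than proof.

```python
def describe(obs):
    """Return a coarse signature of an observation's data format."""
    if isinstance(obs, dict):
        return ("dict", tuple(sorted(obs.keys())))
    if isinstance(obs, (list, tuple)):
        return (type(obs).__name__, len(obs))
    shape = getattr(obs, "shape", None)  # e.g. NumPy arrays
    return (type(obs).__name__, tuple(shape) if shape is not None else None)


def to_plain(obs):
    """Convert arrays, tuples and dicts to plain lists/dicts so values can be compared."""
    if hasattr(obs, "tolist"):
        return obs.tolist()
    if isinstance(obs, dict):
        return {key: to_plain(value) for key, value in obs.items()}
    if isinstance(obs, (list, tuple)):
        return [to_plain(value) for value in obs]
    return obs


def check_reset_contract(env, action, max_steps=1000):
    """Play one episode with a fixed action, reset, and compare observations."""
    env.reset()
    step_obs, done = None, False
    for _ in range(max_steps):
        step_obs, reward, done, info = env.step(action)
        if done:
            break
    reset_obs = env.reset()
    return {
        "episode_finished": done,
        # Does reset() use the same data format as step()?
        "same_format": describe(reset_obs) == describe(step_obs),
        # Does reset() hand back the values of the previous episode's last observation?
        "reset_returns_last_obs": to_plain(reset_obs) == to_plain(step_obs),
    }


class StaleResetEnv:
    """Hypothetical stand-in that imitates the reported behavior."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0
        self.last = None

    def reset(self):
        if self.last is not None:
            # Imitated bug: stale observation, and a tuple instead of a list.
            return tuple(self.last)
        self.t = 0
        return [0.0]

    def step(self, action):
        self.t += 1
        self.last = [float(self.t)]
        return self.last, 0.0, self.t >= self.horizon, {}


report = check_reset_contract(StaleResetEnv(), action=None)
print(report)
```

Running check_reset_contract against the real environment with any valid action should show whether the same two flags come out wrong there.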
Hmm, that’s weird. Normally I get a done flag whenever my agent leaves the road. Somehow, though, this bug only appears when I connect multiple agents to the same server. So maybe they interfere with each other there?
Hi, there seems to be a weird bug: sometimes, if my agent leaves the track, it simply gets reset onto the track, but the done flag is not set to True and there is no negative reward either… Has anyone else had the same experience with this new version?
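One way to collect evidence for a bug report would be to log the agent's position and the done flag at every step and look for sudden jumps that are not accompanied by done=True. The following is only a rough sketch under the assumption that a 2D position can be read from the observation or the info dict; find_silent_resets and the max_jump threshold are hypothetical and would need tuning to the simulator's step size.

```python
import math


def find_silent_resets(positions, dones, max_jump):
    """Return step indices where the position jumps although no episode ended.

    positions[i] and dones[i] are assumed to come from the same env.step() call.
    """
    suspects = []
    for i in range(1, len(positions)):
        jump = math.dist(positions[i - 1], positions[i])
        # A jump right after done=True is an ordinary reset, and a jump
        # reported together with done=True is at least signalled.
        if jump > max_jump and not dones[i - 1] and not dones[i]:
            suspects.append(i)
    return suspects


# Toy log: the agent is moved back onto the track at step 3 without done=True.
positions = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0), (5.1, 5.0)]
dones = [False, False, False, False, False]
print(find_silent_resets(positions, dones, max_jump=1.0))
```

Logging the reward alongside each flagged step would also show whether the expected penalty was missing.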