Hi @charel_van_hoof, could you try once with Git Bash on Windows? I'll check, but I think nle is currently not supported on Windows. The ttyplay2 script alone might still work, though.
BSuite gives 10000 as a fair number of episodes needed to converge.
Generally speaking, to have a level playing field among all competitors, while also not making it too easy, some constraint has to be applied. In this case, the number of episodes serves that role.
I know the number 10000 may seem arbitrary, so if enough students feel it should be increased, we'll do it. For now, please try to improve your algorithm to get the highest score within 10000 episodes.
Can you check whether your local scoring cell runs without any errors?
If your local scoring cell gives an error, it's probably because the output format is wrong; check the targets file for the expected format.
If your local scoring cell gives no error, please let me know.
This sounds like a formatting issue with the code on your end; since it's not a general issue affecting all students, I encourage you to find the bug on your own. With the correct format you should get decimal scores for all algorithms in the local scoring code. Look at the targets for an example of the format.
Do let me know if the problem persists after you've checked everything thoroughly.
Yes, you can use the training data of Task 1 for Tasks 2 and 3. Feel free to use all the data at your disposal.
You can even use some unsupervised learning on the test sequences.
It's not important for the submission.
No, out of grid states do not have to be considered.
Can you please provide a link to where it is?
I think your TAs must have communicated that it's supposed to be individual states and not the matrix norm.
If you think about it, the matrix norm isn't even a valid comparison: the two arrays [0,0,1] and [1,0,0] would have the same norm.
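A quick sketch of why a norm-based check fails here (plain Python; the `l2_norm` helper is just for illustration):

```python
import math

def l2_norm(v):
    # Euclidean (L2) norm of a flat array: sqrt of the sum of squares
    return math.sqrt(sum(x * x for x in v))

a = [0, 0, 1]
b = [1, 0, 0]

# Both arrays have norm 1.0, so comparing norms cannot distinguish them...
print(l2_norm(a), l2_norm(b))  # 1.0 1.0

# ...while comparing individual states element-wise can.
print(a == b)  # False
```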
It's an effect of how numpy arrays are printed versus how the image is shown in the diagram. Please rotate/transpose as needed for your visualization.
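As a minimal sketch of the fix (plain Python here; for numpy arrays the `.T` attribute does the same thing), transposing swaps the row/column orientation before plotting:

```python
# A small 2-D grid as numpy would print it: outer list = rows, inner = columns.
grid = [[1, 2, 3],
        [4, 5, 6]]

# Transpose: rows become columns (equivalent to numpy's arr.T).
transposed = [list(col) for col in zip(*grid)]
print(transposed)  # [[1, 4], [2, 5], [3, 6]]
```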
Our full code is in PyTorch. However, I wrote entirely custom code in PyTorch for this competition, as I was completely unfamiliar with rllib and wanted fine-grained control over the entire codebase. My implementation works by subclassing TorchPolicy in rllib and writing the full training code in the learn_on_batch function. This admittedly removes rllib's distributed-learning benefits, but it allowed me to get comparable speed and score with PyTorch. Sorry I haven't released the code yet; I will do that soon.
I'm just curious as to why you think none of your submissions was prepared for the final evaluation? … Full disclosure: we did not try to tune our submissions to the rest of the 10 environments either (though we knew the final evaluation would be done on 20 envs).
Thanks for the clarification; however, this raises further questions. I think sample efficiency and generalization are a trade-off towards the end of training: with one submission we can score high on sample efficiency but poorly on generalization, or we can improve generalization at the cost of sample efficiency.
So the scenarios are:
1. There are two tracks, two env configs while training, and two separate scoring metrics.
2. There is one track, one env config while training, and one joint scoring metric.
Please clarify which of the above is the case.
If there are two tracks (and two env configs while training), the selected submission can either be near the top on sample efficiency but low on generalization, or sit in the middle of both leaderboards. Otherwise, if there is a joint metric, we'd like to test it locally. We'll plan our submissions accordingly.
[Explainer] - EDA of seismic data by geographic axis
Here's my EDA notebook on how the seismic data varies by geographic axis, along with some ideas for training.