Just adding a rule that prohibits the use of future data is not enough in general. It is always possible to implement a method that solves the offline version of the problem using all future data, and then fine-tune the official online method with the predictions of that hidden offline solution used as "ground truth".
If you really want to pose an online tracking problem, you need some mechanism that strictly hides future data, as in the Flatland challenge. But that requires something like Kaggle's kernel competitions.
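To make the loophole above concrete, here is a toy sketch (everything in it is hypothetical, not any team's actual method): an offline smoother that sees the full trajectory produces pseudo "ground truth", and the nominally online (causal) filter is then fitted to reproduce it, so future data leaks in indirectly.

```python
import numpy as np

rng = np.random.default_rng(0)
truth = np.cumsum(rng.normal(size=500))            # hidden true trajectory
obs = truth + rng.normal(scale=0.5, size=500)      # noisy online observations

# Offline estimate: centered moving average, explicitly using future samples.
w = 11
offline = np.convolve(obs, np.ones(w) / w, mode="same")

# "Online" model: a causal FIR filter over the last k observations.
k = 11
X = np.stack(
    [np.concatenate([np.full(i, obs[0]), obs[:len(obs) - i]]) for i in range(k)],
    axis=1,
)  # column i holds the observation delayed by i steps

# "Fine-tune" the causal filter against the offline pseudo ground truth
# (ordinary least squares stands in for the fine-tuning step).
coef, *_ = np.linalg.lstsq(X, offline, rcond=None)
online = X @ coef

print("offline RMSE  :", np.sqrt(np.mean((offline - truth) ** 2)))
print("distilled RMSE:", np.sqrt(np.mean((online - truth) ** 2)))
```

The causal filter never touches future samples at inference time, yet its weights were chosen to imitate an estimator that did, which is exactly why a rule alone cannot prevent this.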
Thank you for your answers. I have a few additional questions about metrics.
Why do you use 2D distance? Don't we also need to predict the aircraft's altitude?
Do you fully check the submission file after each submission? So the score on the leaderboard is final, right?
Can you please describe the evaluation metrics of the competition? What do the score and the secondary score mean?
The link to the competition appeared on the NeurIPS page, but it seems to be hidden now. When will the competition start?
Here is our (ck.ua team) 2nd place solution:
(topic withdrawn by author, will be automatically deleted in 24 hours unless flagged)
I have a few questions about the evaluation process.
- Can you please confirm that our solutions are always evaluated on the same test samples? Right now it looks like the test sequence is at least shuffled.
- How are the maps chosen for the visualization video? I can see that different teams have different maps in their videos.
- I noticed something strange about the score progress during evaluation. The score always starts from a low value and then quickly increases, and at the very end there is always a big jump. For example, I had a score of 91.6% after 248 simulations, but the final score was 92%. Such a significant jump at the end should not be possible. It looks like the scoring algorithm divides the done-agents sum by N+1, where N is the number of finished simulations (a small numeric check is sketched below).
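Here is the numeric check behind my suspicion. The formula and variable names are my assumptions, not the organisers' actual scoring code:

```python
# Suspected off-by-one: the running score shown during evaluation looks like
#   sum(done_agents_per_simulation) / (N + 1)
# instead of / N, so it is systematically too low and "jumps" once the
# divisor is corrected at the end.

n_finished = 248     # simulations finished so far (from my submission)
true_mean = 0.92     # actual mean fraction of done agents (final score)

score_sum = true_mean * n_finished             # hypothetical running sum
shown_with_bug = score_sum / (n_finished + 1)  # divisor N + 1
shown_correct = score_sum / n_finished         # divisor N

print(f"with N+1 divisor: {shown_with_bug:.3f}")  # 0.916 -> the 91.6% I saw
print(f"with N divisor:   {shown_correct:.3f}")   # 0.920 -> the final 92%
```

The numbers match what I observed, which is why I suspect the divisor rather than a real improvement on the last simulations.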
What hardware is used for submission evaluation? How much RAM and how many CPU cores can we use?
I also have a lot of questions about the challenge regarding submissions.
- What hardware is used for submission evaluation?
- How is the score on the leaderboard calculated?
- How will the scores from Round 1 and Round 2 be combined?
- What are the maximum possible world size and train count?
Also, it is time to update the overview text, because a lot of useful information can currently be found only in the discussions.
Currently I can see the message "Round 0: Exploration Round: 3 months" at the top of the challenge description, but Round 0 finished yesterday.
As I understand from the challenge rules, our models will be evaluated on a set of worlds generated with unknown random seeds.
But how will we submit our models? Just an archive of Python files, or a Docker image?
So we should expect a 10000*10000 world size in the next round, right?