Thank you for posting an update.
Can you please explain why our team's public score differs from the leaderboard? We had 90.131, but the table shows 105.827. Did you face reproduction problems?
Upd: It seems that you used the wrong submission in the table.
Can you please clarify which submission will be selected for calculating the private score?
@shivam Thank you.
I think I understand why it happened. I selected a file, which triggered a form submission, but I also clicked the button, which caused a second form submission.
A possible solution is to disable the button once the file is selected, but you surely know better how to fix it.
Even worse, it happened again.
So now we have lost 2 of our daily submissions.
It seems that one of my submissions was sent twice - I don't know how it happened.
The difference is 4 seconds, and the score is the same.
The funniest thing is that our team now has -1 submissions left :))
It would be great if you could remove this junk submission.
That is a really nice competition in a new field for our ck.ua team. It is always interesting to see tasks that have direct application and to build a solution that can help somebody.
I would like to thank the organizers, congratulate the winners, and congratulate all competitors on an interesting experience.
We totally agree that the main mistakes in organizing this competition were the 100 submissions per day and the open final leaderboard. You could easily reduce submissions to 5-10 per day, but you should exclude failed submissions from the count. And, if possible, you could show results for only part of the test aircraft, and reveal the results on the rest of the test aircraft only after the competition finishes. Just check how public/private leaderboards work on Kaggle - that is a really nice solution.
And about the submission count: in the last hour of the competition I already saw one of the slides of our future presentation about this competition, with the title "How we lost the 'overfitting challenge'".
By the way, my separate congratulations to ZAviators (@benoit_figuet and @rmonstein) on the impressive finish. I am looking forward to hearing your cool story about that last day of the competition.
What are you talking about?
Yes, Richard has experience in the field, and this gives him some advantage. But this is a FAIR advantage. He has no access to the test data, no information about how the data was collected, etc. He breaks no rules.
By the same logic, you could accuse people who work as ML engineers, since they already know how to use machine learning techniques, while others have to learn them during the competition.
Of course I also want to win, but if somebody with more experience wins, that's OK. I will just study their solution and learn from it. Please keep cool heads.
My point is that there is no rules violation.
Good luck to all competitors.
I agree that models need to be practical and useful.
It depends on what counts as practical. For example, if your task is anti-spoofing, then it is the right idea to see the whole airplane track.
It is incorrect to add significant details to the rules at the final stage of the competition.
The training data that we have includes "data from the future", and obviously everyone uses it.
It would be practical if the model were limited to the data provided by the organizers in the data archive.
I don't understand how it happened, but when I pushed a single submission today at 11:34 UTC, the system also took one more submission (51) that had already been sent by my teammate at 00:42.
It just so happened that I noticed this only after sending one more submission (the 5th in 24 hours).
Can you please remove submission https://gitlab.aicrowd.com/vitaly_bondar/flatland/issues/51
You can easily verify that it has the same source code as https://gitlab.aicrowd.com/vitaly_bondar/flatland/issues/50
Can you please clarify how the daily submission limit is counted?
What time zone is used for the start of the day?
Do you include failed submissions in this count?
Here is an example:
It shows "Total Execution Time : 52061s", which is more than 14 hours.
And it was scored anyway.
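A quick sanity check on that number (just my own arithmetic on the reported value, not anything from the organizers):

```python
# "Total Execution Time : 52061s" reported in the submission log
total_execution_time_s = 52061

# Convert seconds to hours
hours = total_execution_time_s / 3600
print(f"{hours:.2f} hours")  # about 14.46 hours
```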
I hope that you will review all submissions, not only our team's, to keep the competition fair.
Yes, that perfectly explains the calculations.
The score is the mean percentage of agents who solved their own objective.
Does that mean you calculate
arrived_trains / len(trains) per episode and then take the mean over all episodes?
Or do you compute the global number of arrived trains and divide it by the global count of trains?
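To make the question concrete, here is a minimal sketch (with made-up episode data, not the organizers' actual evaluation code) showing that the two aggregations can give different numbers:

```python
# Hypothetical per-episode results: (arrived_trains, total_trains)
episodes = [(1, 2), (9, 10)]

# Option 1: completion rate per episode, then mean over episodes
per_episode = [arrived / total for arrived, total in episodes]
mean_of_rates = sum(per_episode) / len(per_episode)  # (0.5 + 0.9) / 2 = 0.7

# Option 2: global arrived count divided by global train count
global_rate = sum(a for a, _ in episodes) / sum(t for _, t in episodes)  # 10 / 12

print(mean_of_rates, global_rate)  # the two values differ
```

So for episodes with different train counts, the two definitions are not equivalent, which is why the clarification matters.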
Can you please explain how the time limit works?
Do you count the whole evaluation time, or do you exclude the time used for environment calculations?
What is the time limit? 8 hours?
Is it possible to compete in this challenge as a team?
And how do we set that up?