Now I can access it! Thanks for the quick fix.
Thanks! It should be the same as on other platforms like Kaggle: you can just create a discussion thread to share your approach! Of course it would be most helpful if you kindly shared the code as well, but this competition was very structured, so just sharing the approach may be enough to understand what led you to win :)
Hi @shivam, is there any progress?
Hi @dipam, thanks for hosting such an interesting competition!
It seems the competition has finished. When will the leaderboard be finalized?
@dipam Voted, thanks for considering our opinions seriously.
I’m convinced that it will lead to good results for everyone!
By the way, there are only 2 weeks left, are you planning to extend the deadline or keep it as it is?
The question is how you define the best or most useful images. If it means the best for improving a 10-epoch effnet-b4 (which I suspect is underfitting), the current scheme makes sense.
But in practice, I guess people would decide to add data only after trying to improve the model with the current data and finding that the performance still doesn’t reach the expected level.
So my definition of “useful” here is “useful for improving the performance of a sufficiently well fine-tuned model”. And I suspect the current post-training pipeline doesn’t reach that level, IMHO.
Hi @dipam, is there any update about this?
Or please let me know if you have already decided to stick with the current training pipeline. I’ll then try to optimize my purchase strategy for it.
Thanks, I totally understand the situation. I can imagine it’s much harder to host a competition than just to join as a competitor:)
Anyway, whether the modification would be made or not, I’ll try to do my best.
@dipam Thanks for the comment!
I understand that round 2 tries to make us more focused on the purchase strategies.
My concern is not about how good the final F1 score is, but about the meaning of the best additional data.
In general, increasing the dataset size while your model is underfitting is a common but poor strategy.
The same can be said here: a strategy that chooses “good additional data for an underfitted model” is less practically meaningful than one for an overfitted model.
The easiest way to fix this issue is simply to change the training pipeline so that the trained model overfits the 1,000-sample training dataset.
I believe it would make the competition more useful and everyone can learn more interesting strategies.
Hi @dipam, have you already considered changing the post-training code as mentioned in the comment?
The small number of epochs in particular seems problematic to me.
You can easily check that the trained model is still underfitting the dataset by changing the number of epochs from 10 to 20 and seeing how much your score improves.
That means there is almost no need to feed the model more data, as it’s still “learning” from the given data, so it might not be a good model for evaluating purchased data quality.
In a real situation, I guess the host would never use such a heavily underfitted model to evaluate purchased data. That’s why I think it’s better to change the post-training code, or to allow participants to change it too.
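To make the 10-vs-20-epoch check concrete, here is a minimal sketch in Python. The function names and the 2% threshold are my own assumptions, not part of the competition code, and the scores in the usage note are made-up placeholders rather than measured results.

```python
def relative_gain(score_short: float, score_long: float) -> float:
    """Relative improvement of the longer run over the shorter run."""
    if score_short <= 0:
        raise ValueError("score_short must be positive")
    return (score_long - score_short) / score_short


def looks_underfitted(score_short: float, score_long: float,
                      min_gain: float = 0.02) -> bool:
    """Heuristic: if simply training longer on the SAME data still improves
    the score by more than min_gain (an arbitrary, assumed threshold),
    the shorter run was probably underfitting."""
    return relative_gain(score_short, score_long) > min_gain
```

For example, with placeholder F1 scores of 0.40 after 10 epochs and 0.46 after 20 epochs, `looks_underfitted(0.40, 0.46)` returns `True`, which would suggest the 10-epoch model is not yet a good judge of purchased data quality.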
Thanks, I hope this competition becomes a more interesting and useful one!
Any comments are welcome ;)
Hi, first of all, thanks for launching round 2 of this exciting and interesting challenge!
I’ve just read through the round 2 updates, and was a little surprised by the change to the post-purchase training part.
I know it’s for making us focus more on purchasing strategy, but in my humble opinion it should be jointly optimized with the post purchasing training part. For example, we may want to change model size when the computing budget is small. Or, sometimes we may not want to use the ImageNet pretrained model as it is without any extra finetuning.
I understand what makes this competition unique is the purchasing phase, but I guess what the host wants is a strong classifier for each computing and labeling budget, isn’t it?
To maximize the chance of achieving that, I’d like to suggest allowing participants to modify the post-training code.
Thanks, any opinions are welcome!
Also, could you please make sure you can correctly untar the dataset files?
There are some weird points: the extension is somehow tar (not tar.gz as the description says), and PaxHeader files are included in the image directory.
If you have a reliable way to untar them, please let me know!
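In case it helps others, here is a minimal sketch of one possible workaround using Python's standard `tarfile` module. Opening with mode `"r:*"` lets `tarfile` detect the compression itself, so it does not matter whether the file is really tar or tar.gz. The function name and the rule for spotting PaxHeader entries (any path component starting with `PaxHeader`) are my assumptions, not something confirmed by the hosts.

```python
import tarfile
from pathlib import Path


def extract_skipping_pax_headers(archive_path, dest_dir):
    """Extract regular files from a tar archive, skipping PaxHeader entries.

    Returns the list of member names that were written.
    """
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    # "r:*" = read with transparent compression detection (plain, gz, bz2, xz)
    with tarfile.open(archive_path, mode="r:*") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # directories are created on demand below
            parts = Path(member.name).parts
            if any(part.startswith("PaxHeader") for part in parts):
                continue  # assumed naming of the stray metadata entries
            target = (dest / member.name).resolve()
            if dest not in target.parents:
                continue  # refuse paths that would escape dest_dir
            target.parent.mkdir(parents=True, exist_ok=True)
            source = archive.extractfile(member)
            target.write_bytes(source.read())
            extracted.append(member.name)
    return extracted
```

Usage would look like `extract_skipping_pax_headers("dataset.tar", "data/")`, where both paths are placeholders for wherever the downloaded file and the output directory actually live.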
I see, thanks:)
But you don’t have to consider giving prizes; it would make me uneasy, as my sub was scored 0.33, which is worth 2nd place.
Anyway, can’t wait to try new dataset!
Thanks, I got it. I was stuck on a submission error last night…
So I intended to submit it to round 2, as the timeline says round 2 will start on March 1st, but it hasn’t started?
And let me make sure that there is no prize for 1st-3rd place if it’s below 0.47, right?