Activity
Challenge Categories
Challenges Entered
Improve RAG with Real-World Benchmarks | KDD Cup 2025
Latest submissions
| Status | Submission ID |
|---|---|
| graded | 289892 |
| graded | 289891 |
| failed | 289715 |
Improve RAG with Real-World Benchmarks
Latest submissions
Revolutionise E-Commerce with LLM!
Latest submissions
| Status | Submission ID |
|---|---|
| graded | 270741 |
| graded | 270740 |
| graded | 270655 |
Generating answers using image-linked data
Latest submissions
| Status | Submission ID |
|---|---|
| graded | 289892 |
| graded | 289891 |
| graded | 288840 |
Synthesising answers from image and web sources
Latest submissions
| Status | Submission ID |
|---|---|
| failed | 289190 |
| graded | 289174 |
| graded | 289033 |
| Participant | Rating |
|---|---|
| mincheolyoon | 0 |
| happystat | 0 |
| unna97 | 0 |
| linchia | 0 |
| Karrich | 0 |
| gaozhanfire | 0 |
| pengbo_wang | 0 |
| pp2915 | 0 |
| pengyue_jia3 | 0 |
| GenpengXu | 0 |
| xiaopeng_li | 0 |
| eliot8 | 0 |
| lai_jinxing | 0 |
- NVIDIA-Merlin, Amazon KDD Cup '23: Multilingual Recommendation Challenge
- Team_NVIDIA, Amazon KDD Cup 2024: Multi-Task Online Shopping Challenge for LLMs
- Team_NVIDIA, Meta CRAG - MM Challenge 2025
Meta CRAG - MM Challenge 2025
🚨 Submission Selection Deadline: 23rd June 2025, 12:00 UTC (noon)
5 months ago
@snehananavati @yilun_jin8 @jyotish Before we can submit the form, we need to know why we are choosing two. Could you please explain how the two will be used? For example:
- If we submit two, you will use the one with fewer missing responses.
- If we submit two, you will choose the code that you like best
- If we submit two, you will human evaluate both and choose the one with best score
- If we submit two, you will choose the one which runs the fastest
- etc, etc, etc
Please respond quickly so we have time to pick our two final submissions before selection deadline. Thank you!
🚨 Submission Selection Deadline: 23rd June 2025, 12:00 UTC (noon)
5 months ago
Hi. When we select two, what do you do with the two? Will you human evaluate both of the two and pick the one that has the best score? (i.e. of the two selected per task, which one of the two will eventually be used as our team's final submission per task?)
Submissions stuck
5 months ago
This means that your submission is in the queue and hasn't started yet. There are about 100 submissions to evaluate before yours.
Why did Submission 288508 Fail?
5 months ago
Hello. Can you tell me why my submission 288508 failed? Please add a comment to the submission page. And can you tell me why my teammate's submission 288507 failed? Please add a comment to his submission page. Thank you!
Why did Submission 287740 Fail?
5 months ago
Hello, why did submission 287740 fail? It finished 68% of "generate predictions".
Why did Submission 287602 Fail?
5 months ago
And can you tell me why submission 287646 failed? Thank you.
Why did Submission 287602 Fail?
5 months ago
Hi, can an admin tell me why my submission 287602 failed with "exit code 1" during "generate predictions"? It made it to 94%, so I thought it would be successful. Thanks!
Can you help me figure out why Submission #287175 failed?
5 months ago
What does the evaluation page say?
Why Did Submission 287138 Fail?
5 months ago
Hello. Can admins tell me why submission 287138 failed? It received an "Evaluation failed with exit code 1" halfway through "generating predictions". Thank you.
Has Private Data Changed in Phase 2?
5 months ago
I am confused. On day 1 of phase 2, wasn't the private web search using v0.5, and then recently it was changed to private web search v0.6 (with 10-20% more chunks)? Wasn't that the reason that private web search started returning None and the disk space on the server started running out?
So in other words, didn't the private web search change during phase 2? And won't solutions submitted during day 1 with v0.5 perform differently on the leaderboard than the same solutions submitted today with v0.6?
i.e. if nothing changed with the private data, why are we re-running all submissions to the leaderboard?
Question about ego sample TAB in multi-turn-QA's Leaderboard
5 months ago
Yes, this is normal, because task 3 is all the same type of image (I forget whether it is all ego or all non-ego). Therefore there is only 1 LB.
Why Submission #286793 failed?
5 months ago
Does the submission page have more info? If we exceed the 7.5 hour time limit we get a "Gym server stopped unexpectly". Maybe this is your error.
Error - crag batch iteration
5 months ago
There are two queries in the dataset without an answer. If you use the latest version of the crag batch iterator here, it will fix that.
The bad ids are:
SESSIONS_TO_SKIP = ["04d98259-27af-41b1-a7be-5798fd1b8e95", "695b4b5c-7c65-4f7b-8968-50fe10482a16"]
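For anyone who cannot update the iterator right away, the idea of skipping those two sessions can be sketched roughly as below. This is only an illustration and not the actual crag batch iterator: the function name `iter_batches_skipping` and the assumption that each record is a dict with a `"session_id"` key are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator, List, Set

# The two session ids reported above as having no answer.
SESSIONS_TO_SKIP: Set[str] = {
    "04d98259-27af-41b1-a7be-5798fd1b8e95",
    "695b4b5c-7c65-4f7b-8968-50fe10482a16",
}


def iter_batches_skipping(
    records: Iterable[Dict[str, Any]],
    batch_size: int,
    skip_ids: Set[str] = SESSIONS_TO_SKIP,
) -> Iterator[List[Dict[str, Any]]]:
    """Yield batches of records, dropping any record whose session id is in skip_ids.

    Assumption: every record is a dict with a "session_id" key (hypothetical layout).
    """
    batch: List[Dict[str, Any]] = []
    for record in records:
        if record["session_id"] in skip_ids:
            continue  # skip sessions known to have no answer
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # emit the final, possibly smaller, batch
        yield batch
```

Filtering before batching keeps every yielded batch non-empty, so downstream code never sees one of the bad sessions.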
Why Does Private Web Search v0.6 search_pipeline() return None?
5 months ago
I have noticed that validation web search v0.6 and public test web search v0.6 never return None when we call search_pipeline() locally. This makes sense because a RAG search uses cosine similarity, and it can always find and return k chunks.
But during submission, the private web search v0.6 search_pipeline() returns None. The private web search v0.5 did not return None.
Today, AIcrowd is re-running all our submissions. All of my re-run submissions are failing because of these None (but worked fine with v0.5). Is something going wrong with private web search v0.6 search_pipeline()? What would cause it to return None? And is it only returning a few None or is every call returning None?
Why Submission #286732 failed?
5 months ago
You must wait. Your evaluation is in a queue and has not started yet. Afterward it will either say successful and show scores, or it will say failed. But it has not started yet.
Status was changed after submission
5 months ago
This is because AIcrowd is re-running our submissions. Here is a quote from @yilun_jin8:
You don't need to re-submit by yourself. We have queued all previous submissions for re-evaluation, and the re-evals will happen automatically.
The re-runs are failing because AIcrowd's function search_pipeline() is returning None values now, and this causes code to fail. (So instead of returning a Python list with k items, it is just returning a single None value. But top k RAG should never return nothing.)
TypeError: 'NoneType' object is not a mapping
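One way to keep a submission from crashing on this is to wrap the search call defensively. The sketch below is only an assumption about how that might look: `safe_search` is a hypothetical helper, and it assumes the search function is a callable that takes a query plus a `k` keyword argument and normally returns a list of dicts.

```python
from typing import Any, Callable, Dict, List, Optional

# Assumed shape of one search hit (hypothetical): a plain dict.
SearchResult = Dict[str, Any]


def safe_search(
    search_fn: Callable[..., Optional[List[Optional[SearchResult]]]],
    query: Any,
    k: int = 5,
) -> List[SearchResult]:
    """Call search_fn(query, k=k) and always return a list of dicts.

    Handles two failure shapes: the whole result being None,
    and individual entries inside the result being None.
    """
    results = search_fn(query, k=k)
    if results is None:
        return []  # treat "no result" as an empty retrieval
    return [r for r in results if r is not None]
```

With a wrapper like this, code that later unpacks each hit (for example with `**hit`) only ever sees real dicts, and the model can fall back to answering without retrieved context when the list is empty.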
Has Private Data Changed in Phase 2?
5 months ago
Based on comments by @yilun_jin8 here, here, and here (and displayed in the quote below), it seems that the private LB test data has been updated. Does this mean that everyone needs to resubmit all their submissions again to see the improvement of the new v0.6 private web search databases?
Can admins please answer the 3 bullet points in my previous post? And in the future, can admins please inform all participants when changes are made to private data and/or submission processes? Thank you!
Because the size of the search indices expanded dramatically after the latest update from Meta, there were lower than expected storage space on the evaluation nodes for participants to store their models. Therefore, you may see that your submission failed due to limited storage.
We have addressed the problem by increasing the disk space on the node. There will be at least 250GB for participants' code and models.
We have also re-queued all recent submissions to account for this, and also for the latest updates in the search index.
Why Submission #286586 failed?
5 months ago
@yilun_jin8 Does this mean that the private web search databases have recently been updated too? For example, did you add 10-20% more chunks to the private web search databases? (i.e. is there a private web search index v0.6?)
Can admins tell us when changes are made to private data? (Admins tell us about public changes, but it is also important for us to know about private changes too). Please answer my discussion post here. Thank you!
Has Private Data Changed in Phase 2?
5 months ago
Hello admins, can you tell us if anything about private data has changed since day 1 of phase 2? This is important to know for reproducibility and experiment evaluation.
- On day 1, were private nonegocentric images being resized incorrectly and then later fixed?
- On day 1, were private web search indexes corrupted and then later fixed?
- Since day 1, have 10-20% more chunks been added to the private web search corpus?
The above 3 occurred on public data, so I'm wondering if they also affected private data. Additionally, have there been any other changes to private data and/or the submission process since day 1 of phase 2?

Meta CRAG Challenge 2025 Winners Announcement
4 months ago
Do winners have an opportunity to submit a paper and make a presentation at the KDD conference in August? If so, what are the deadlines and procedures?