Chris Deotte (Chris_Deotte)

12 Followers · 0 Following

Organization: Nvidia
Location: US
Badges: 2 · 2 · 1

Activity: year-long contribution heatmap (May through May)

Challenges Entered

Improve RAG with Real-World Benchmarks | KDD Cup 2025
Latest submissions: graded 284893, graded 284834, graded 282712

Improve RAG with Real-World Benchmarks
Latest submissions: no submissions made in this challenge.

Latest submissions: graded 270741, graded 270740, graded 270655

Latest submissions: graded 235811, graded 235350, graded 235349

Latest submissions: graded 235349, graded 235348, graded 235347

Latest submissions: graded 235350, graded 235166, graded 235125

Generating answers using image-linked data
Latest submissions: graded 282712, failed 281820

Followers (participant · rating): mincheolyoon · 0, happystat · 0, unna97 · 0, linchia · 0, Karrich · 0, gaozhanfire · 0, pengbo_wang · 0, pp2915 · 0, pengyue_jia3 · 0, GenpengXu · 0, xiaopeng_li · 0, eliot8 · 0

Teams:
  • NVIDIA-Merlin (Amazon KDD Cup '23: Multilingual Recommendation Challenge)
  • Team_NVIDIA (Amazon KDD Cup 2024: Multi-Task Online Shopping Challenge for LLMs)

Meta CRAG - MM Challenge 2025

When is Deadline to Team up?

4 days ago

@yilun_jin8 @jyotish Hi, do either of you know the answers to these 2 questions?

When is Deadline to Team up?

9 days ago

I have two questions about teaming up:

  1. When is the deadline to team up? (In the “timeline” section of the website it says May 28th, and in the “participation and submission” section it says May 21st.)
  2. Can participants team up if their combined phase 2 submissions exceed 6? (The phase 2 submission limit is “each team can make 6 total submissions across all three tracks”. If participant A has 6 phase 2 submissions and participant B has 6 phase 2 submissions, can they still team up, given that after teaming up their team would have 12 phase 2 submissions?)

During Submission How Do We Download Web Search URL?

14 days ago

One suggestion is that participants’ code cannot communicate directly with the internet (using wget, etc.). Instead, it calls an API provided by AIcrowd which fetches webpages. This ensures that participants’ code only receives information from the internet and cannot send information (i.e. the hidden test questions) out to external websites.

If this is how it currently works, how do we call the API to fetch webpages?
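
For illustration, such a proxied fetch could look like the sketch below. Everything here is hypothetical: the endpoint, payload shape, and helper name are made up, since no such AIcrowd API is documented.

    import requests

    # Hypothetical proxied-fetch API per the suggestion above. The endpoint
    # and response shape are invented for illustration; participant code
    # would talk only to the evaluator-side proxy, never the open internet.
    PROXY_ENDPOINT = "https://evaluator.internal/fetch"  # hypothetical

    def fetch_via_proxy(url: str) -> str:
        resp = requests.post(PROXY_ENDPOINT, json={"url": url}, timeout=30)
        resp.raise_for_status()
        return resp.json()["content"]  # hypothetical response field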

During Submission How Do We Download Web Search URL?

15 days ago

The web search API docs here say that we need to download the result URLs ourselves.

Note: The Search APIs only return urls for images and webpages, instead of full contents. To get the full webpage contents and images, you will have to download it yourself. During the challenge, participants can assume that the connection to these urls are available.

During submission, do we need to do this? And how do we do it, given that internet access is turned off? Can we use wget on these websites?

Doesn’t this pose a risk that, if a participant owns these websites, they can transfer all the test questions to their URL (using an HTTP GET or POST) during submission and receive all the hidden test questions (then hardcode the answers into future submissions)?
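
Per the quoted docs, fetching a result URL locally is just an ordinary HTTP request, as in the minimal sketch below (the URL is a placeholder; whether such a request is even possible inside the submission sandbox is exactly the open question above):

    import requests

    # Minimal sketch of downloading a search-result URL, per the quoted docs.
    # The URL is a placeholder; whether this works inside the submission
    # sandbox (where internet is off) is the open question.
    url = "https://example.com/some-search-result"
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    html = resp.text  # full webpage contents, to be parsed downstream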

Questions about the leaderboard

19 days ago

Thanks for improving the leaderboard. There are still 5 things wrong with it. Can you please fix the following? Thanks.

  1. Multi-source Augmentation (task 2) ranking is being determined by “Ego Samples” when ranking should be determined by “All Samples”. Furthermore, when we click to see the multi-source augmentation LB, we should see “All Samples” first by default.

  2. Multi-source Augmentation (task 2) ranking is being sorted by “Accuracy” when it should be determined by “Truthfulness”.

  3. Multi-turn QA (task 3) ranking is being determined by “Ego Samples” when ranking should be determined by “All Samples”. Furthermore, when we click to see the multi-turn QA LB, we should see “All Samples” first by default.

  4. Multi-turn QA (task 3) “Ego Samples” is displaying all scores as NaN.

  5. The top score on Single Source Augmentation (task 1) incorrectly computes truthfulness as 0.889 when their hallucinations are 0.219 and accuracy is 0.108 (i.e. their truthfulness should be -0.111; other teams’ truthfulness scores were updated yesterday, but this score was not).
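
For reference, the arithmetic in item 5, assuming the metric is truthfulness = accuracy - hallucination rate (which is the premise of that item):

    # Assuming truthfulness = accuracy - hallucination rate (item 5's premise)
    accuracy = 0.108
    hallucination = 0.219
    print(accuracy - hallucination)  # -0.111, not the 0.889 shown on the LB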

Thanks for fixing these 5 issues!

Does "search_pipeline" source change during LB submission

22 days ago

I have noticed that you just (14 hours ago) updated the HuggingFace web search vector database from 113k entries to 647k entries. Is the new database similar to the LB database?

For us to tune our models during local validation, we need a local validation database similar to what our models will see during LB submission. Is the current (newly updated) web search database similar to the LB database? And is the image search validation database similar to the LB image search database?

=========
Let me clarify my question. (1) For validation, we have 647k entries in the web search database to help us answer 1548 validation queries, i.e. about 418 database entries per validation question. Is this the same ratio that our models will see during LB submission web search?

(2) Furthermore, a certain percentage of validation queries have their answer contained inside the web search vector database (with the rest of the vector database being noise). During LB submission, does the same percentage of answers and noise exist in the LB vector database?

And lastly, can you answer these 2 questions for image search? Thank you!
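
The 418 figure in (1) is just the entry-to-query ratio:

    # ~418 web-search entries per validation query
    print(647_000 / 1_548)  # ~418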

Does "search_pipeline" source change during LB submission

29 days ago

In the old evaluation script, the agent defined the search pipeline as "crag-mm-2025/image-search-index-validation", which means that the same vector database is used for both local validation and LB submission.

I see the new starter kit changed this. My question is: does our submission use a different search pipeline, or does submission also use "crag-mm-2025/image-search-index-validation"?
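
For concreteness, the old behavior amounted to something like the sketch below. The loader name is a hypothetical stand-in, not the starter kit’s actual API; only the index id string comes from the old evaluation script.

    # Hypothetical stand-in for the starter kit's real index loader.
    def load_search_index(index_id: str):
        ...  # illustration only

    # Old behavior: local validation and LB submission shared this index.
    # The question is whether submissions now load a different (hidden) one.
    search_pipeline = load_search_index("crag-mm-2025/image-search-index-validation")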

How to use private model

About 1 month ago

How to use private model

About 1 month ago

ChatGPT says we can add AIcrowd as a collaborator under the model’s settings. What is AIcrowd’s HF username?

Where is the Starter Kit for submissions?

About 2 months ago

Hi everyone, I see that many teams have already submitted to the leaderboard. Where can I find the “For up to date instructions for your challenge, please refer to the starter kit provided by challenge organisers.” and the “git clone ”?

Amazon KDD Cup 2024: Multi-Task Online Shopping Challenge for LLMs

Note for our final evaluation

10 months ago

@yilun_jin, sometimes the exact same code will succeed one time and fail another time. For example, we submitted the exact same code [here] and [here]; the first succeeded and the second failed. During the re-run, what happens if code fails that previously succeeded? Will the admins run it a second time?

Also, can you tell us why the second link above failed?

When we select our final 2 submissions for each track, should we just select our best-scoring submission twice in case it fails the first time it is re-run?

All Submissions Are Failing

11 months ago

Our team’s last 6 submissions failed. And when I look at the list of submissions from the other teams in the past 4 hours, all other teams’ submissions failed too. Is there a problem with the AIcrowd server?

Here are the links to our team’s last two failures: [here] and [here].

Can an admin please investigate? Thank you.

Push gitlab and cannot find issue

11 months ago

The same thing has just happened to me. I have created 5 new tags. They all appear in my GitLab, but none appear in my issues.

They are tags submission-200, submission-202, submission-203, submission-204, and submission-205. Some are duplicates of each other because I tried submitting the same thing twice without success.

All Submissions "Waiting In Queue" for 12 Hours

11 months ago

FYI, all submissions (from all teams) have been “waiting in queue” for the past 12 hours. Perhaps an admin can investigate. Thanks.

Submission stuck on "evaluation initiated"

11 months ago

The following two submissions, [here] and [here], are stuck with the label “evaluation initiated” even though they have failed.

Can an admin switch the GitLab label to failed? As is, they are using up 2 submission quota slots. Thanks.

Submission Failed - Please Tell Us When Submission Works Again

11 months ago

Yes, this is not fixed. I just submitted and got:
Submission failed : Failed to communicate with the grader. Please resubmit again in a few hours if this issue persists..
The GitLab issue is [here].

For the past 2 days, no team has been able to submit to track 5.

Please fix this issue and let us know when it is fixed. Thank you.

Submissions fail

11 months ago

I am also seeing weird submission behavior today. I posted a discussion [here] describing the errors I have been seeing.

Submission Failed - Please Tell Us When Submission Works Again

11 months ago

Hi, for the past 4 hours I have been receiving “Submission failed : Failed to communicate with the grader. Please resubmit again in a few hours if this issue persists..” when submitting to track 5. An example GitLab issue (for admins to review) is [here].

I have tried 3 times and received 3 “failed” submissions. I do not want to try any more because I do not want to use up my failed-submission quota. Can an admin tell us when submissions are working for track 5 again? Thanks.

Track 2 LB Doesn't Show Retrieval Score

11 months ago

Hi, can admins @yilun_jin fix the track 2 leaderboard webpage to show each team’s retrieval score? Thank you.

Phase 2 launching!

12 months ago

I notice that the AIcrowd website says “Round 2: 21 days left”, which implies that phase 2 ends on June 15th. Is this the correct end of phase 2?

Earned a BA in mathematics, then worked as a graphic artist, photographer, carpenter, and teacher. Earned a PhD in computational science and mathematics with a thesis on optimizing parallel processing. Now work as a data scientist and researcher.