
Challenges Entered
Improve RAG with Real-World Benchmarks | KDD Cup 2025
Latest submissions
| graded | 289797 |
| graded | 289788 |
| graded | 289778 |
Improve RAG with Real-World Benchmarks
Latest submissions
| graded | 267130 |
| graded | 267129 |
| graded | 267099 |
Amazon KDD Cup 2022
Latest submissions
Testing RAG Systems with Limited Web Pages
Latest submissions
| graded | 266952 |
| graded | 266951 |
| graded | 266273 |
Enhance RAG systems With Multiple Web Sources & Mock API
Latest submissions
| graded | 267130 |
| graded | 267129 |
| failed | 266263 |
Generating answers using image-linked data
Latest submissions
| graded | 289797 |
| graded | 289693 |
| graded | 289626 |
| Participant | Rating |
|---|---|
| chenghao_shaun | 0 |
| shizueyy | 0 |
| dako | 0 |
| graphway | 0 |
Meta CRAG - MM Challenge 2025
Why did 289384, 289471 fail?
6 months ago: It has already reached 100%, and 289697 shows "Step has exceeded its deadline".
Why did 289428 fail?
6 months ago: ConnectionError: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/datasets/crag-mm-2025/crag-mm-single-turn-debug-private/revision/b5ff0aaa05fab0256d77682b4b7da582c0660a6b (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7f7f00af3e50>: Failed to resolve 'huggingface.co' ([Errno -3] Temporary failure in name resolution)"))"), '(Request ID: 7c73f288-c699-438b-9794-be08cad15999)') Check the submission page for more details.
Suggestion: Make Evaluation Prompts More Flexible
6 months ago: Moreover, I believe the evaluation prompt should be made public. If someone wants to "hack" the prompt, they don't actually need to know its exact content; keeping it secret only widens the gap between local testing and server-side evaluation results.
Suggestion: Make Evaluation Prompts More Flexible
6 months ago: I think the current evaluation prompt is too strict, causing everyone to respond with "I don't know" frequently just to ensure a score > 0. In reality, many answers could be considered partially correct; at least, human evaluators would take this into account. However, under the current setup, the top 10 models don't attempt strategies that give partially correct answers, so they might actually perform worse in human evaluation than strategies scoring below 0. Yet those strategies never even reach human review. I suggest the organizers relax the evaluation prompt to at least allow for some score differentiation.
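The incentive described in the post above can be illustrated with a small worked example. This is only a sketch: the per-answer scores (+1 correct, 0 for "I don't know", -1 incorrect) are an assumption consistent with the post's remark about scores below 0, and both systems and their outcome counts are hypothetical.

```python
# Assumed per-answer scores. The post implies wrong answers are penalized
# but does not state the exact values used by the evaluator.
SCORES = {"correct": 1.0, "missing": 0.0, "incorrect": -1.0}


def mean_score(outcomes):
    """Average score over a list of outcome labels."""
    return sum(SCORES[o] for o in outcomes) / len(outcomes)


# Hypothetical system A: always attempts an answer, judged correct 40% of the time.
attempt_all = ["correct"] * 40 + ["incorrect"] * 60

# Hypothetical system B: answers only when confident, otherwise says "I don't know".
abstain_often = ["correct"] * 15 + ["incorrect"] * 5 + ["missing"] * 80

print(mean_score(attempt_all))    # -0.2
print(mean_score(abstain_often))  # 0.1
```

Under these assumed numbers, system B ranks higher even though system A gives more than twice as many correct answers. A strict judge that marks partially correct answers as incorrect pushes systems further toward abstaining.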
Why failed Submission #285113
7 months ago: Evaluation failed with exit code 1. I hope I can take a look at the error message.
📢 Dataset Release: CRAG-MM v0.1.1
8 months ago: In the current CragImageKG file (…/cragmm_search/image_search_mock_api/image_kg.py, cragmm-search-pipeline==0.2.10), the field used in the get_image_url function should be img_url; otherwise it will cause an error.
📢 Dataset Release: CRAG-MM v0.1.1
8 months ago: The current rag-agent does not differentiate between task1 and task2. How should UnifiedSearchPipeline be used specifically for task1?
Meta Comprehensive RAG Benchmark: KDD Cup 2-9d1937
How exactly is the number of submissions counted ten times a week?
Over 1 year ago: From my testing, if the error is reported in the build environment, the submission is not deducted from the quota, but it is still recorded.
🚨 IMP: Phase 2 Announcement
Over 1 year ago: Same problem. I also emailed help@aicrowd.com but got no response.
Has phase-2 started?
Over 1 year ago: I don't know. I saw some successful commits, so I tried to commit and found that it got scores but didn't update the leaderboard.
Has phase-2 started?
Over 1 year ago: Hi bro, I have the same question and have not received any message.
Submission failed
Over 1 year ago: Submission failed: You have exceeded the allowed number of parallel submissions. Please wait until your other submission(s) are graded.
I had no other submissions, but it still failed.
Expect to return a message stating whether it was a timeout problem
Over 1 year ago: This doesn't require returning logs, and it would help us troubleshoot some issues.
📢 Announcements: Phase 1 Extension, New Private Test Set, Batch Prediction Interface, and Updated Baselines
Over 1 year ago: According to the baseline, does each query now have only 10 s to answer?
- Response Time: Ensure that your model processes and responds to each query within 10 seconds.
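One way to stay inside a per-query limit like the one quoted above is to wrap the model call in a time budget and fall back to an abstention. The sketch below is illustrative only: `answer_with_budget`, `slow_model`, and the fallback string are hypothetical names, not part of the challenge baseline, and the real evaluator may enforce its timeout differently.

```python
import concurrent.futures
import time

FALLBACK = "I don't know"  # assumed abstention answer


def answer_with_budget(generate, query, budget_s=10.0):
    """Run generate(query); return FALLBACK if it exceeds budget_s seconds.

    The worker thread is not killed on timeout. It keeps running in the
    background, so this only bounds how long the caller waits.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(generate, query)
    try:
        return future.result(timeout=budget_s)
    except concurrent.futures.TimeoutError:
        return FALLBACK
    finally:
        # Do not block on the still-running worker.
        pool.shutdown(wait=False)


def slow_model(query):
    """Hypothetical stand-in for a model call that is too slow."""
    time.sleep(0.3)
    return "late answer"


print(answer_with_budget(slow_model, "q", budget_s=0.05))
print(answer_with_budget(lambda q: q.upper(), "hi", budget_s=1.0))
```

In practice the budget would be set somewhat below 10 seconds to leave headroom for overhead outside the model call.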
📢 Announcement: Addition of `query_time` to the `generate_answer` Interface, and increased Timeouts!
Over 1 year ago: What is the format of query_time?
Meta KDD Cup 24 - CRAG - Retrieval Summarization
About Test Set Leakage in Round 1
Over 1 year ago: In fact, the test set for round1 is the data set given to us, so there is no leakage problem.
Why did 289384, 289471 fail?
6 months ago: Will you consider resubmitting 289697? @yilun_jin8 @jyotish