4 Follower
0 Following
jiazunchen

Organization

Peking University

Location

CN

Badges

0
0
0

Challenges Entered

Improve RAG with Real-World Benchmarks | KDD Cup 2025

Latest submissions

graded 289797
graded 289788
graded 289778

Improve RAG with Real-World Benchmarks

Latest submissions

graded 267130
graded 267129
graded 267099

Latest submissions

No submissions made in this challenge.

Testing RAG Systems with Limited Web Pages

Latest submissions

graded 266952
graded 266951
graded 266273

Enhance RAG systems With Multiple Web Sources & Mock API

Latest submissions

graded 267130
graded 267129
failed 266263

Generating answers using image-linked data

Latest submissions

graded 289797
graded 289693
graded 289626
Participant Rating
chenghao_shaun 0
shizueyy 0
dako 0
graphway 0
  • db3 Meta Comprehensive RAG Benchmark: KDD Cup 2024
    View
  • db3 Meta CRAG - MM Challenge 2025
    View

Meta CRAG - MM Challenge 2025

Why did 289384 and 289471 fail?

4 days ago

Will you consider resubmitting 289697? @yilun_jin8 @jyotish

Why did 289384 and 289471 fail?

6 days ago

It had already reached 100%, and 289697 shows “Step has exceeded its deadline”.

Why did 289428 fail?

7 days ago

ConnectionError: (MaxRetryError('HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/datasets/crag-mm-2025/crag-mm-single-turn-debug-private/revision/b5ff0aaa05fab0256d77682b4b7da582c0660a6b (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7f7f00af3e50>: Failed to resolve 'huggingface.co' ([Errno -3] Temporary failure in name resolution)"))'), '(Request ID: 7c73f288-c699-438b-9794-be08cad15999)') Check the submission page for more details.

Why did Submission 288711 fail?

9 days ago

Why did Submission 288711 fail? Thank you.

Why did Submission #287803 Fail?

15 days ago

(topic deleted by author)

Important Update on Missing/Refusal Rate

16 days ago

What specifically is the high missing rate?

Suggestion: Make Evaluation Prompts More Flexible

26 days ago

Moreover, I believe the evaluation prompt should be made public. If someone wants to ‘hack’ the prompt, they don’t actually need to know its exact content—keeping it secret only widens the gap between local testing and server-side evaluation results.

Suggestion: Make Evaluation Prompts More Flexible

26 days ago

I think the current evaluation prompt is too strict, causing everyone to respond with ‘I don’t know’ frequently just to keep the score above 0. In reality, many answers could be considered partially correct; at least, human evaluators would take this into account. However, under the current setup, the top 10 models don’t attempt partially correct answers, so they might actually perform worse in human evaluation than strategies scoring below 0. Yet those lower-scoring strategies never even reach human review. I suggest the organizers relax the evaluation prompt to at least allow for some score differentiation.
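
To make the incentive described above concrete, here is a minimal sketch of the expected-score argument. It assumes a scoring scheme of +1 for a correct answer, 0 for ‘I don’t know’, and -1 for an incorrect one; the post does not state these values, and the function names and the `partial_credit` parameter are hypothetical, not part of the challenge's evaluator.

```python
# Assumed per-answer scores (not stated in the post):
# correct answer, "I don't know" (missing), incorrect answer.
CORRECT, MISSING, INCORRECT = 1.0, 0.0, -1.0


def expected_score(p_correct, p_partial, partial_credit=INCORRECT):
    """Expected score of attempting an answer instead of abstaining.

    p_correct: probability the answer is judged fully correct.
    p_partial: probability the answer is only partially correct.
    partial_credit: score the grader gives a partially correct answer.
        A strict grader treats it as incorrect (the default here).
    """
    p_incorrect = 1.0 - p_correct - p_partial
    if min(p_correct, p_partial, p_incorrect) < -1e-12:
        raise ValueError("probabilities must be non-negative and sum to at most 1")
    return (p_correct * CORRECT
            + p_partial * partial_credit
            + p_incorrect * INCORRECT)


def should_answer(p_correct, p_partial, partial_credit=INCORRECT):
    """Attempt an answer only if it beats the score for abstaining."""
    return expected_score(p_correct, p_partial, partial_credit) > MISSING


if __name__ == "__main__":
    # Same hypothetical question under a strict and a more lenient grader.
    print("strict grader:", should_answer(0.4, 0.3))
    print("lenient grader:", should_answer(0.4, 0.3, partial_credit=0.5))
```

Under these assumed numbers, a question with a 40% chance of a fully correct answer and a 30% chance of a partially correct one is better skipped when partial answers are graded as wrong, and worth attempting once partial answers earn some credit. That is the score differentiation the suggestion asks for.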

Why did Submission #285113 fail?

About 1 month ago

Evaluation failed with exit code 1. I would like to be able to see the error message.

📢 Dataset Release: CRAG-MM v0.1.1 🚀

2 months ago

In the current CragImageKG file (…/cragmm_search/image_search_mock_api/image_kg.py, cragmm-search-pipeline==0.2.10), the field used in the get_image_url function should be img_url; otherwise it causes an error.

📢 Dataset Release: CRAG-MM v0.1.1 🚀

2 months ago

The current rag-agent does not differentiate between task1 and task2. How should UnifiedSearchPipeline be used specifically for task1?

Meta Comprehensive RAG Benchmark: KDD Cup 2-9d1937

How exactly is the limit of ten submissions per week counted?

About 1 year ago

From my testing, if the error is reported while building the environment, the submission is not deducted from the quota, but it is still recorded.

🚨 IMP: Phase 2 Announcement

About 1 year ago

Same problem. I also emailed help@aicrowd.com but got no response.

Has phase-2 started?

About 1 year ago

It’s been submitted more than six times

Has phase-2 started?

About 1 year ago

(post deleted by author)

Has phase-2 started?

About 1 year ago

I don’t know. I saw some successful commits, so I tried committing too; it got a score, but the leaderboard didn’t update.

Has phase-2 started?

About 1 year ago

Hi bro, I have the same question and have not received any message.

Submission failed

About 1 year ago

Submission failed: You have exceeded the allowed number of parallel submissions. Please wait until your other submission(s) are graded.

I have no other submissions, but it still failed.

Meta KDD Cup 24 - CRAG - Retrieval Summarization

About Test Set Leakage in Round 1

About 1 year ago

In fact, the test set for Round 1 is the dataset given to us, so there is no leakage problem.

jiazunchen has not provided any information yet.