4 Follower
0 Following
jiazunchen

Organization

Peking University

Location

CN

Badges

0
0
0

Challenges Entered

Improve RAG with Real-World Benchmarks | KDD Cup 2025

Latest submissions

graded 289797
graded 289788
graded 289778

Improve RAG with Real-World Benchmarks

Latest submissions

graded 267130
graded 267129
graded 267099

Latest submissions

No submissions made in this challenge.

Testing RAG Systems with Limited Web Pages

Latest submissions

graded 266952
graded 266951
graded 266273

Enhance RAG systems With Multiple Web Sources & Mock API

Latest submissions

graded 267130
graded 267129
failed 266263

Generating answers using image-linked data

Latest submissions

graded 289797
graded 289693
graded 289626
Participant Rating
chenghao_shaun 0
shizueyy 0
dako 0
graphway 0
Participant Rating
  • db3 Meta Comprehensive RAG Benchmark: KDD Cup 2024
    View
  • db3 Meta CRAG - MM Challenge 2025
    View

Meta CRAG - MM Challenge 2025

Why did 289384, 289471 fail?

6 months ago

Will you consider resubmitting 289697? @yilun_jin8 @jyotish

Why did 289384, 289471 fail?

6 months ago

It has already reached 100%, and 289697 shows "Step has exceeded its deadline".

Why did 289428 fail?

6 months ago

ConnectionError: (MaxRetryError('HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/datasets/crag-mm-2025/crag-mm-single-turn-debug-private/revision/b5ff0aaa05fab0256d77682b4b7da582c0660a6b (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7f7f00af3e50>: Failed to resolve 'huggingface.co' ([Errno -3] Temporary failure in name resolution)"))'), '(Request ID: 7c73f288-c699-438b-9794-be08cad15999)') Check the submission page for more details.

Why did Submission 288711 fail?

6 months ago

Why did Submission 288711 fail? Thank you.

Important Update on Missing/Refusal Rate

6 months ago

What specifically is the high missing rate?

Suggestion: Make Evaluation Prompts More Flexible

6 months ago

Moreover, I believe the evaluation prompt should be made public. If someone wants to 'hack' the prompt, they don't actually need to know its exact content; keeping it secret only widens the gap between local testing and server-side evaluation results.

Suggestion: Make Evaluation Prompts More Flexible

6 months ago

I think the current evaluation prompt is too strict, causing everyone to respond with 'I don't know' frequently just to ensure a score > 0. In reality, many answers could be considered partially correct; at least, human evaluators would take this into account. However, under the current setup, the top 10 models don't attempt partially correct answers, and their abstaining strategies might actually perform worse in human evaluation than strategies scoring below 0. Yet those lower-scoring strategies never even reach human review. I suggest the organizers relax the evaluation prompt to at least allow for some score differentiation.
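To make the incentive concrete, here is a minimal sketch. It assumes a CRAG-style scheme in which a correct answer scores +1, "I don't know" scores 0, and an incorrect answer scores -1. The function name, the probabilities, and the 0.5 partial-credit value are hypothetical illustrations, not the organizers' evaluator.

```python
def expected_score(p_correct, p_partial, partial_credit):
    """Expected per-question score for a model that always attempts an answer.

    Assumed scoring: +1 for correct, -1 for incorrect, and `partial_credit`
    for a partially correct answer (hypothetical parameter).
    """
    p_wrong = 1.0 - p_correct - p_partial
    return p_correct * 1.0 + p_partial * partial_credit - p_wrong * 1.0


ABSTAIN_SCORE = 0.0  # assumed score for always answering "I don't know"

# Strict judge: partially correct answers are graded as incorrect (-1).
strict = expected_score(p_correct=0.3, p_partial=0.3, partial_credit=-1.0)

# Relaxed judge: partially correct answers earn some credit (0.5 here).
relaxed = expected_score(p_correct=0.3, p_partial=0.3, partial_credit=0.5)

print(f"strict:  {strict:+.2f} vs abstain {ABSTAIN_SCORE:+.2f}")
print(f"relaxed: {relaxed:+.2f} vs abstain {ABSTAIN_SCORE:+.2f}")
```

With these made-up numbers, attempting an answer loses to abstaining under the strict judge and beats it under the relaxed one, which is the score differentiation the post asks for.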

Why did Submission #285113 fail?

7 months ago

Evaluation failed with exit code 1. I would like to see the error message.

📢 Dataset Release: CRAG-MM v0.1.1 🚀

8 months ago

In the current CragImageKG file (…/cragmm_search/image_search_mock_api/image_kg.py, cragmm-search-pipeline==0.2.10), the field in the get_image_url function should be img_url; otherwise it will cause an error.
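Until the package is fixed, a defensive lookup can work around this kind of field-name mismatch. The sketch below is a standalone helper, not the code in cragmm-search-pipeline: the record shape (a plain dict) and the fallback key `image_url` are assumptions made for illustration; only `img_url` comes from the report above.

```python
def get_image_url(record: dict) -> str:
    """Return the image URL from a record, preferring the `img_url` field.

    `image_url` is a guessed alternative name (assumption); adjust the tuple
    to whatever keys the data actually uses.
    """
    for key in ("img_url", "image_url"):
        if key in record:
            return record[key]
    raise KeyError(f"no image URL field found; keys present: {sorted(record)}")


# Usage with a hypothetical record:
print(get_image_url({"img_url": "https://example.com/a.jpg"}))
```

Raising a KeyError that lists the keys present makes the mismatch obvious in the logs instead of failing somewhere downstream.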

📢 Dataset Release: CRAG-MM v0.1.1 🚀

8 months ago

The current rag-agent does not differentiate between task1 and task2. How should UnifiedSearchPipeline be used specifically for task1?

Meta Comprehensive RAG Benchmark: KDD Cup 2-9d1937

How exactly is the limit of ten submissions per week counted?

Over 1 year ago

In my testing, if the error is reported in the build environment, the submission is not deducted from the quota, but it is still recorded.

🚨 IMP: Phase 2 Announcement

Over 1 year ago

Same problem. I also emailed help@aicrowd.com but got no response.

Has phase-2 started?

Over 1 year ago

It's been submitted more than six times.

Has phase-2 started?

Over 1 year ago

I don't know. I saw some successful submissions, so I tried submitting too; it got scores, but the leaderboard was not updated.

Has phase-2 started?

Over 1 year ago

Hi bro, I have the same question and have not received any message.

Submission failed

Over 1 year ago

Submission failed: You have exceeded the allowed number of parallel submissions. Please wait until your other submission(s) are graded.

I have no other submissions, but it still failed.

Expect a returned message stating whether it was a timeout problem

Over 1 year ago

This doesn't require returning logs, and it would help us troubleshoot some issues.

📢 Announcements: Phase 1 Extension, New Private Test Set, Batch Prediction Interface, and Updated Baselines

Over 1 year ago

According to the baseline, does each query now have only 10 seconds to answer?
- Response Time: Ensure that your model processes and responds to each query within 10 seconds.
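One way to stay inside such a limit locally is to wrap generation in a time budget and fall back to "I don't know" when it runs out. This is a minimal sketch under assumptions: `answer_with_budget` and `generate` are hypothetical names, and nothing here reflects how the official evaluator enforces the limit. Note that the worker thread is not killed on timeout; it simply finishes in the background.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def answer_with_budget(generate, query, budget_s=10.0, fallback="I don't know"):
    """Run generate(query) and return `fallback` if it exceeds `budget_s` seconds."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(generate, query)
    try:
        return future.result(timeout=budget_s)
    except FutureTimeout:
        return fallback
    finally:
        # Do not block on the worker; a timed-out call keeps running in the background.
        pool.shutdown(wait=False)


def slow_model(query):
    """Stand-in for a model call that takes too long (hypothetical)."""
    time.sleep(0.3)
    return "late answer"


print(answer_with_budget(str.upper, "fast query", budget_s=1.0))
print(answer_with_budget(slow_model, "slow query", budget_s=0.05))
```

In practice the budget should be set below 10 seconds to leave headroom for retrieval and other overhead around the model call.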

Meta KDD Cup 24 - CRAG - Retrieval Summarization

About Test Set Leakage in Round 1

Over 1 year ago

In fact, the test set for Round 1 is the dataset given to us, so there is no leakage problem.

jiazunchen has not provided any information yet.