yikuan_xia




Challenges Entered

  • Improve RAG with Real-World Benchmarks | KDD Cup 2025 (no submissions made)
  • Improve RAG with Real-World Benchmarks (no submissions made)
  • Testing RAG Systems with Limited Web Pages (no submissions made)
  • Synthesising answers from image and web sources (no submissions made)
Participant Rating

  • db3 - Meta Comprehensive RAG Benchmark: KDD Cup 2024
  • db3 - Meta CRAG - MM Challenge 2025

Meta CRAG - MM Challenge 2025

Image and web search API updates and feedback

About 1 month ago

The quality of the new search_pipeline for web is far worse than the previous version.
1. Why do the current public-test set and validation set have a similar number of queries, while the corresponding web search pipelines have different numbers of entries (public-test web search ~130k, current validation web search ~70k)? This suggests that much of the needed information cannot be found in the current validation web search pipeline.
2. I randomly checked 10 entities in the ground-truth validation dataset, and 5 of them could not be found in the search pipeline (the entities are: millipedes, Gatorade, the Suits TV show, St. Patrick's Cathedral, Hugo Boss). With the last round's validation pipeline and the claimed settings, we could find almost all related information in the pipeline. There must be a problem with the current version of the web search pipeline. Besides, the end-to-end performance of our local baseline is much worse with the current search pipeline than with the pipeline provided in phase 1. Can you check whether there is a problem with the web search index in the current version?
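The spot check described above can be scripted. A minimal sketch, assuming the search pipeline is wrapped as a callable mapping a query string to a list of results (this interface is an assumption for illustration, not the challenge's actual API):

```python
def entity_coverage(entities, search_fn, top_k=10):
    """Return the fraction of entity names for which search_fn yields
    at least one hit among its top_k results.

    search_fn is any callable mapping a query string to a list of
    results -- e.g. a thin wrapper around the challenge's web
    search_pipeline (hypothetical interface).
    """
    hits = sum(1 for name in entities if len(search_fn(name)[:top_k]) > 0)
    return hits / len(entities)

# Example with a stand-in search function: pretend only half the
# entities are indexed.
indexed = {"millipedes", "gatorade"}
fake_search = lambda q: ["doc"] if q in indexed else []
ratio = entity_coverage(
    ["millipedes", "gatorade", "hugo boss", "suits"], fake_search
)  # ratio == 0.5
```

Running this over a larger random sample of ground-truth entities against both the phase-1 and current pipelines would quantify the coverage drop the post reports.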

Starter-Kit Update: `WebSearchResult` helper for full-page retrieval

About 2 months ago

What about the image in the image API output? Are such features available for images?

📢 Update of the search API (v0.4.0) and indices (v0.4)

2 months ago

Can you clarify whether we can use the image in the image retrieval output, or only the text descriptions of the retrieved images? If we can use the image, how can we access it during evaluation? This is crucial because it determines the input information of the RAG system.

📢 Update of the search API (v0.4.0) and indices (v0.4)

2 months ago

Which attribute should we access to get the retrieved image? Is there an example of how we can access it? (Not the image_url, but the raw image in the retrieval results.)

📢 Update of the search API (v0.4.0) and indices (v0.4)

2 months ago

Can we get access to the retrieved image using the image_url on the evaluation server?

Does "search_pipeline" source change during LB submission

3 months ago

Which search pipeline should we use while developing our agents? For the crag-mm-2025/crag-mm-single-turn-public QAs, should we use crag-mm-2025/image-search-index-validation or crag-mm-2025/image-search-index-public during development?

The cragmm_search file is missing

3 months ago

When will the training data be released?

Question about image retrieval format and retrieval rules?

3 months ago

Q1: The current image retrieval results only contain the URL of the retrieved image; will the image itself be available in the contest?

Meta Comprehensive RAG Benchmark: KDD Cup 2024

🚨 IMP: Phase 2 Announcement

About 1 year ago

What does the API update mean? Is the test environment updated to the new API?

LFS file issues during submission

About 1 year ago

Our last submission successfully went through the Docker build process, but an error occurred during the inference stage when attempting to load the large model:

  File "/src/models/dummy_model.py", line 86, in __init__
    self.m = LlamaForCausalLM.from_pretrained(model, device_map="balanced",
  File "/home/aicrowd/.conda/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3531, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/aicrowd/.conda/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3938, in _load_pretrained_model
    state_dict = load_state_dict(shard_file, is_quantized=is_quantized)
  File "/home/aicrowd/.conda/lib/python3.8/site-packages/transformers/modeling_utils.py", line 542, in load_state_dict
    raise OSError(
OSError: You seem to have cloned a repository without having git-lfs installed. Please install git-lfs and run git lfs install followed by git lfs pull in the folder you cloned.

On the web page of our repository, the model files appear correctly uploaded, with an LFS tag shown next to them. The log message suggests installing git-lfs, so we made another attempt.

We added "git-lfs" to the original "apt.txt", but this time the submission did not get through the Docker build stage.

How can I fix this issue?

Another question: what is the submission limit now? Is it 6 times a week, and does that include failed submissions?

yikuan_xia has not provided any information yet.