Warm-Up Round: Completed | Round 1: Completed | Round 2: Completed | Weight: 1.0
โ€ผ๏ธSelect your final submissions for evaluation by 23 June 2025, 12:00 U

 

๐Ÿ™‹โ€โ™€๏ธ New to the challenge? ๐Ÿค” Want to make your first submission? 

โš™๏ธ Access the Starter-Kit here   ๐Ÿ“š Check out the official CRAG-MM v0.1.1 Dataset here.
 

๐Ÿ’ฌ Join the conversation on Discord โ€“ connect with other participants, get support, and stay updated. Jump in and introduce yourself ๐Ÿ‘‰  https://discord.gg/YWDQQa8byx

An MM-RAG QA system takes as input an image I and a question Q, and outputs an answer A; the answer is generated by MM-LLMs according to information retrieved from external sources, combined with knowledge internalized in the model. A multi-turn MM-RAG QA system additionally takes questions and answers from previous turns as context to answer new questions. The answer should provide useful information to answer the question, without adding any hallucination.
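The input/output contract described above can be sketched as a small retrieve-then-generate function. This is a minimal illustration, not the challenge's starter-kit interface: the names `mm_rag_answer`, `Retriever`, `Generator`, and both stub functions are hypothetical, and falling back to "I don't know" when nothing is retrieved is an assumed design choice for avoiding hallucination.

```python
from typing import Callable

# Hypothetical type aliases: a retriever maps (image, question) to evidence
# strings, and a generator maps (image, question, evidence) to an answer.
Retriever = Callable[[bytes, str], list[str]]
Generator = Callable[[bytes, str, list[str]], str]


def mm_rag_answer(image: bytes, question: str,
                  retrieve: Retriever, generate: Generator) -> str:
    """Answer question Q about image I using retrieved evidence."""
    evidence = retrieve(image, question)
    return generate(image, question, evidence)


def stub_retrieve(image: bytes, question: str) -> list[str]:
    # Stand-in for a call to an external source; returns placeholder text.
    return ["placeholder fact"]


def stub_generate(image: bytes, question: str, evidence: list[str]) -> str:
    # Stand-in for an MM-LLM call. Assumption: decline to answer rather
    # than guess when no evidence was retrieved.
    if not evidence:
        return "I don't know"
    return f"Based on retrieved evidence: {evidence[0]}"


answer = mm_rag_answer(b"fake-image-bytes", "What is shown here?",
                       stub_retrieve, stub_generate)
print(answer)
```

A real system would replace the two stubs with an actual retrieval call and an actual model call; the function boundary stays the same.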

Task #1: Single-source Augmentation

In Task #1, we provide an image mock API to access information from an underlying image-based mock KG. The mock KG is indexed by the image and stores structured data associated with the image; answers to the questions may or may not exist in the mock KG.

The mock API takes an image as input and returns similar images from the mock KG along with structured data associated with each image to support answer generation.
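To make the lookup concrete, here is a toy stand-in for an image-indexed KG search. It is a sketch under stated assumptions, not the challenge's mock API: `KGEntry`, `mock_image_search`, the three-dimensional "embeddings", and the sample entries are all invented for illustration, and cosine similarity is an assumed way of deciding which images are "similar".

```python
import math
from dataclasses import dataclass


@dataclass
class KGEntry:
    image_id: str
    embedding: list[float]       # toy stand-in for an image embedding
    attributes: dict[str, str]   # structured data stored with the image


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mock_image_search(query_embedding: list[float],
                      kg: list[KGEntry], k: int = 2) -> list[dict]:
    """Return the k most similar KG images with their structured data."""
    scored = [(cosine(query_embedding, e.embedding), e) for e in kg]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [
        {"image_id": e.image_id, "score": s, "attributes": e.attributes}
        for s, e in scored[:k]
    ]


# Hypothetical toy KG; real entries and fields will differ.
TOY_KG = [
    KGEntry("img-001", [1.0, 0.0, 0.0], {"name": "Eiffel Tower", "city": "Paris"}),
    KGEntry("img-002", [0.0, 1.0, 0.0], {"name": "Golden Gate Bridge", "city": "San Francisco"}),
    KGEntry("img-003", [0.9, 0.1, 0.0], {"name": "Tokyo Tower", "city": "Tokyo"}),
]

results = mock_image_search([1.0, 0.0, 0.0], TOY_KG, k=2)
for r in results:
    print(r["image_id"], round(r["score"], 3), r["attributes"])
```

Because the answer may not exist in the KG, a system built on such a lookup still has to decide whether the returned attributes actually address the question before using them.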

This task aims to test the answer generation capability of MM-RAG systems.

To learn more about the Meta CRAG-MM challenge, please see: https://www.aicrowd.com/challenges/meta-crag-mm-challenge-2025

Task #3: Multi-turn QA

Unlike Task #1 and Task #2, Task #3 evaluates the system's ability to conduct multi-turn conversations. Each conversation consists of 2 to 6 turns; later questions may or may not require the image to answer.

This task focuses on testing context understanding to ensure smooth multi-turn interactions.
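One simple way to carry earlier turns into a new question is to prepend them to the prompt. The sketch below assumes a plain-text "Q:/A:" prompt format; `Turn`, `build_prompt`, and the sample conversation are hypothetical and are not part of the challenge's interface. The default of five prior turns follows from the stated maximum of six turns per conversation.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    question: str
    answer: str


def build_prompt(history: list[Turn], new_question: str,
                 max_turns: int = 5) -> str:
    """Format the most recent turns plus the new question as one prompt."""
    # Guard against max_turns == 0, since history[-0:] would keep everything.
    recent = history[-max_turns:] if max_turns > 0 else []
    lines = []
    for turn in recent:
        lines.append(f"Q: {turn.question}")
        lines.append(f"A: {turn.answer}")
    lines.append(f"Q: {new_question}")
    lines.append("A:")
    return "\n".join(lines)


# Hypothetical conversation; the answers are placeholders.
history = [
    Turn("What kind of vehicle is in the picture?", "placeholder answer 1"),
    Turn("Who makes it?", "placeholder answer 2"),
]
prompt = build_prompt(history, "Where is that company based?")
print(prompt)
```

The last question here can only be resolved through the earlier turns, which is the kind of context dependence the task description points to; whether the image is also passed to the model on later turns is a separate decision.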
