4 Followers
0 Following
snehananavati
Sneha Nanavati

Organization: AIcrowd

Location: IN

Badges: 20 · 21 · 21

Activity

(Contribution heatmap covering May through the following May, by day of week)


Challenges Entered

(Unless a latest submission is noted in parentheses, no submissions were made in that challenge.)

  • Create Context-Aware, Dynamic, and Immersive In-Game Dialogue
  • Improve RAG with Real-World Benchmarks | KDD Cup 2025
  • Automating Building Data Classification (latest submission: failed #280150)
  • Generate Synchronised & Contextually Accurate Videos
  • Improve RAG with Real-World Benchmarks
  • Unnamed challenge entry (latest submissions: failed #247893, graded #247892)
  • Multi-Agent Dynamics & Mixed-Motive Cooperation
  • Small Object Detection and Classification (latest submission: failed #235496)
  • Understand semantic segmentation and monocular depth estimation from downward-facing drone images
  • A benchmark for image-based food recognition
  • What data should you label to get the most value for your money?
  • Interactive embodied agents for Human-AI collaboration
  • Behavioral Representation Learning from Animal Poses
  • Airborne Object Tracking Challenge
  • ASCII-rendered single-player dungeon crawl game
  • 5 Puzzles 21 Days. Can you solve it all?
  • Measure sample efficiency and generalization in reinforcement learning using procedurally generated environments
  • 5 Puzzles 21 Days. Can you solve it all?
  • Self-driving RL on DeepRacer cars - From simulation to real world
  • 3D Seismic Image Interpretation by Machine Learning
  • 5 Puzzles 21 Days. Can you solve it all?
  • 5 Puzzles 21 Days. Can you solve it all?
  • 5 Puzzles 21 Days. Can you solve it all?
  • Multi-Agent Reinforcement Learning on Trains
  • A dataset and open-ended challenge for music recommendation research
  • A benchmark for image-based food recognition
  • Sample-efficient reinforcement learning in Minecraft
  • 5 Puzzles, 3 Weeks. Can you solve them all? 😉
  • Multi-agent RL in game environment. Train your Derklings, creatures with a neural network brain, to fight for you!
  • Predicting smell of molecular compounds
  • 5 Problems 21 Days. Can you solve it all?
  • 5 Puzzles 21 Days. Can you solve it all?
  • 5 Puzzles, 3 Weeks | Can you solve them all?
  • Grouping/Sorting players into their respective teams
  • 5 Problems 15 Days. Can you solve it all?
  • 5 Problems 15 Days. Can you solve it all?
  • 5 PROBLEMS 3 WEEKS. CAN YOU SOLVE THEM ALL?
  • Remove Smoke from Image
  • Classify Rotation of F1 Cars
  • Can you classify Research Papers into different categories?
  • Can you dock a spacecraft to ISS?
  • Multi-Agent Reinforcement Learning on Trains
  • Multi-Class Object Detection on Road Scene Images
  • Localization, SLAM, Place Recognition, Visual Navigation, Loop Closure Detection
  • Detect Mask From Faces
  • Identify Words from silent video inputs
  • A Challenge on Continual Learning using Real-World Imagery
  • Unnamed challenge entry (latest submission: graded #200977)
  • Music source separation of an audio signal into separate tracks for vocals, bass, drums, and other
  • Make Informed Decisions with Shopping Knowledge
  • Generate Videos with Temporal and Semantic Audio Sync
  • Create Videos with Spatially Aligned Stereo Audio
  • Build Context-Aware Conversational NPC Agents
  • Task-Oriented Conversational AI for NPC Agents
Participant Ratings

  • vrv: 0
  • cadabullos: 0
  • cavalier_anonyme: 0
  • ReachAMY: 0

Meta CRAG-MM Challenge 2025

🏆 Winner Spotlight Series: md_dh [Meta KDD Cup 2024]

Yesterday

In this spotlight, we explore the modular, logic-aware system from Mitchell DeHaven (username md_dh), who secured 3rd place in Task 1.

:hammer_and_wrench: Mitchell’s Pipeline – MARAGS System Overview

Mitchell's custom-built Multi-Adapter Retrieval-Augmented Generation System (MARAGS) included the following components; a rough, unofficial sketch of the chunking and reranking steps follows the list:

  • Document chunking via BeautifulSoup (<2000 characters per segment)
  • Cross-encoder reranking of segments by query relevance
  • Modular LoRA adapters fine-tuned per subtask
  • CoT prompting for complex reasoning
  • Evaluation through code execution of API responses
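For readers who want to experiment with a similar pipeline, here is a minimal sketch of the chunking and reranking steps. It assumes the sentence-transformers library and the cross-encoder/ms-marco-MiniLM-L-6-v2 reranker; the write-up only says a cross-encoder was used, so the specific model is an assumption.

```python
# Unofficial sketch of the first two MARAGS steps: chunk HTML with
# BeautifulSoup (< 2000 characters per segment) and rerank chunks with a
# cross-encoder. The reranker model name is an assumption.
from bs4 import BeautifulSoup
from sentence_transformers import CrossEncoder

MAX_CHARS = 2000  # per the write-up: segments under ~2000 characters


def chunk_html(html: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Strip markup, then greedily pack paragraphs into <= max_chars chunks."""
    soup = BeautifulSoup(html, "html.parser")
    paragraphs = [p.strip() for p in soup.get_text("\n").split("\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        if current and len(current) + len(para) + 1 > max_chars:
            chunks.append(current)
            current = ""
        current = (current + "\n" + para).strip()
    if current:
        chunks.append(current)
    return chunks


def rerank(query: str, chunks: list[str], top_k: int = 5) -> list[str]:
    """Score (query, chunk) pairs with a cross-encoder and keep the best few."""
    reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # assumed model
    scores = reranker.predict([(query, chunk) for chunk in chunks])
    ranked = sorted(zip(chunks, scores), key=lambda pair: pair[1], reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]
```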

:small_blue_diamond: Hallucination Control and Answer Reliability

  • Contexts were pre-filtered for “hittability”—whether they included the actual answer
  • If uncertain, the model was explicitly prompted to output: “I don’t know”
  • API call responses were tested via eval() for execution correctness
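An execution check along these lines could look roughly like the sketch below. The whitelisted API function and the restricted namespace are hypothetical stand-ins; the actual CRAG API surface is not shown in the post.

```python
# Hypothetical sketch of validating a model-emitted API call by executing it.
# `get_ticker_price` and the whitelist are stand-ins, not the real CRAG API.
def get_ticker_price(symbol: str) -> float:
    prices = {"META": 512.3, "AAPL": 227.8}  # toy data
    return prices.get(symbol.upper(), float("nan"))


ALLOWED_CALLS = {"get_ticker_price": get_ticker_price}  # whitelist of callables


def run_model_call(call_str: str):
    """Evaluate a generated call such as 'get_ticker_price("META")' inside a
    restricted namespace; any exception marks the call as invalid."""
    try:
        # Empty __builtins__ so the string can only reach whitelisted names.
        return True, eval(call_str, {"__builtins__": {}}, ALLOWED_CALLS)
    except Exception as exc:
        return False, exc


ok, result = run_model_call('get_ticker_price("META")')
print(ok, result)  # True 512.3
```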

:bulb: What Carries Over to 2025?

Mitchell’s strategy remains highly relevant for this year’s MM-RAG focus:

  • Modular tuning scales to multi-modal pipelines via adapter switching
  • Hittability filtering helps reduce noise across web, image, and KG fusion
  • Evaluation via execution mirrors this year’s emphasis on verifiability and trust
  • Chain-of-Thought prompting supports visual reasoning and multi-hop QA

Mitchell DeHaven is a machine learning engineer at Darkhive, with prior experience in NLP and speech systems at USC’s ISI.

Stay tuned for more insights from past CRAG standouts—and good luck with your submissions!

Read other winning strategies here: db3 and dRAGonRAnGers

🏆 Learning from Team dRAGonRAnGers’ Strategy [Meta CRAG 2024]

Yesterday

Winner Spotlight Series: Day 2 – dRAGonRAnGers
2nd Place in Task 1 | 3rd Place in Task 2 & 3

As we look ahead to the Meta CRAG-MM Challenge 2025, it’s worth revisiting the inventive strategies that emerged last year. In this edition of the Winner Spotlight series, we highlight Team dRAGonRAnGers from POSTECH, who earned podium finishes across all three tasks with their pragmatic and efficiency-driven approach to RAG. You can also read the complete technical report over here.

Their work is a lesson in thoughtful engineering—optimising for real-world constraints without compromising answer quality.


:mag: Challenge Recap: A Demanding Test of RAG

The 2024 CRAG Challenge pushed participants to develop Retrieval-Augmented Generation systems that could reason over web documents and structured graphs with minimal hallucinations. Success depended not just on accurate retrieval, but also on balancing cost, latency, and model robustness.


:bulb: Core Insight: Trust the Model—But Verify

At the heart of the dRAGonRAnGers’ approach was an elegant refinement of the RAG pipeline aimed at:

  • Avoiding unnecessary retrievals when the LLM already had a high-confidence response
  • Preventing hallucinations by validating outputs through self-reflection

Their strategy revolved around a two-stage enhancement process:


:small_blue_diamond: Step 1: Retrieval Bypass via LLM Confidence

Rather than treating retrieval as mandatory, the team built a mechanism to assess the confidence of the LLM’s internal knowledge (likely using fine-tuned LLaMA variants). When confidence crossed a defined threshold, the system skipped retrieval entirely, saving compute and latency.

This adaptive gating proved particularly effective for factoid or frequently seen questions—an increasingly relevant optimisation for production-grade QA systems.
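The team's code is not reproduced here, but confidence gating of this kind can be approximated by asking the model to rate its own certainty before deciding whether to retrieve. In the sketch below, the prompt wording, the 0.8 threshold, and the generic llm/retrieve callables are all assumptions.

```python
# Illustrative confidence-gated retrieval, not the team's code.
# `llm` is any text-in/text-out callable; `retrieve` returns context passages.
from typing import Callable

CONFIDENCE_THRESHOLD = 0.8  # assumed value; tune on a validation split


def answer(question: str,
           llm: Callable[[str], str],
           retrieve: Callable[[str], list[str]]) -> str:
    # 1) Ask the model to self-rate its confidence on a 0-1 scale.
    probe = (f"Question: {question}\n"
             "How confident are you (0 to 1) that you can answer correctly "
             "from memory alone? Reply with a number only.")
    try:
        confidence = float(llm(probe).strip())
    except ValueError:
        confidence = 0.0  # unparsable reply -> treat as low confidence

    # 2) Skip retrieval entirely when the model is confident enough.
    if confidence >= CONFIDENCE_THRESHOLD:
        return llm(f"Answer concisely: {question}")

    # 3) Otherwise fall back to the usual retrieve-then-generate path.
    context = "\n\n".join(retrieve(question))
    return llm(f"Context:\n{context}\n\nAnswer concisely: {question}")
```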


:small_blue_diamond: Step 2: Post-Generation Answer Verification

Even when retrieval was bypassed or ambiguous data was returned, the team added a verification layer: a second pass through the LLM to judge the trustworthiness of the output.

This form of self-consistency checking acted as a safeguard, filtering out hallucinations and improving answer reliability.
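A second-pass verification step could be sketched as follows; the judge prompt and the abstention string are illustrative, not the team's exact implementation.

```python
# Illustrative second-pass verification; prompt wording is assumed.
from typing import Callable


def verify_or_abstain(question: str, draft: str,
                      llm: Callable[[str], str]) -> str:
    """Have the model judge its own draft; abstain unless it is endorsed."""
    judge_prompt = (
        "You are checking an answer for factual reliability.\n"
        f"Question: {question}\nProposed answer: {draft}\n"
        "Reply with exactly one word: TRUSTWORTHY or UNSURE."
    )
    verdict = llm(judge_prompt).strip().upper()
    return draft if verdict.startswith("TRUSTWORTHY") else "I don't know"
```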


:chart_with_upwards_trend: Outcome: Efficient, Accurate, Scalable

The combination of selective retrieval and post-hoc verification resulted in:

  • Lowered system load without sacrificing accuracy
  • Fewer hallucinations, particularly in borderline or low-signal queries
  • Improved responsiveness for multi-turn and interactive scenarios

In a challenge that increasingly reflects real-world constraints, their system offered a compelling balance between precision and pragmatism.


:busts_in_silhouette: Meet the Minds Behind dRAGonRAnGers

The team hails from the Data Systems Lab at POSTECH, blending deep academic research with a drive to tackle applied AI problems.

Their participation was driven by a shared goal: explore the real-world trade-offs of building reliable, cost-efficient RAG systems.


:arrows_counterclockwise: Lessons for 2025: Efficiency Is a Competitive Advantage

While the CRAG-MM Challenge 2025 introduces multi-modal and multi-turn elements, the principles behind dRAGonRAnGers’ design carry forward:

  • Retrieval Gating: In image-heavy queries, selectively triggering retrieval (e.g., only when OCR or visual tags lack clarity) could save valuable inference time.
  • Answer Verification: With more complex inputs (e.g., image-KG hybrids), validating generated answers before surfacing them remains crucial.
  • Resource-Aware Design: Their cost-conscious pipeline offers a strong blueprint for systems facing real-time or on-device constraints.

Stay tuned for more Winner Spotlights—and best of luck as you shape your own strategy for this year’s challenge.

Read other winning strategies here: db3 and md_dh

🏆 Behind the Winning Strategy of Team db3 [Meta CRAG 2024]

Yesterday

As we gear up for the new round of Meta CRAG MM Challenge 2025, let’s revisit the standout approaches from last year’s competition. In this Winner Spotlight, we dive into the strategy behind Team db3, who took the top spot across all three tasks in the Meta KDD Cup 2024 – CRAG Challenge. You can also read the complete technical report over here.

This deep dive is designed to inform and inspire participants aiming to push boundaries in retrieval-augmented generation (RAG) this year.


:mag: Challenge Overview: What Was CRAG 2024 All About?

The 2024 CRAG challenge focused on building RAG systems capable of sourcing relevant knowledge from web documents and mock knowledge graphs to answer complex queries. It tested not just retrieval and generation quality but also robustness and hallucination control.

Team db3, comprising Jiazun Chen and Yikuan Xia from Peking University, achieved:

  • :1st_place_medal: 1st in Task 1 – Retrieval Summarisation (28.4%)
  • :1st_place_medal: 1st in Task 2 – Knowledge Graph + Web Retrieval (42.7%)
  • :1st_place_medal: 1st in Task 3 – End-to-End RAG (47.8%)

:small_blue_diamond: Task 1: Retrieval Summarisation

Team db3 engineered a layered retrieval-generation pipeline (a rough sketch follows the list):

  • Parse HTML with BeautifulSoup
  • Chunk text using LangChain into retrievable segments
  • Retrieve with the bge-base-en-v1.5 model
  • Rerank results using a custom relevance model
  • Add dynamic fallback: prompt the model to say “I don’t know” when uncertain
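An unofficial approximation of that retrieval path is sketched below. The chunk size and top-k are assumptions, and reranking with the custom relevance model is omitted; the embedding model name follows the report (bge-base-en-v1.5, published on Hugging Face as BAAI/bge-base-en-v1.5).

```python
# Unofficial sketch of the Task 1 retrieval path (parse -> chunk -> embed -> rank).
# Chunk size and top-k are assumptions; the custom reranking step is omitted.
# On older LangChain versions the splitter lives in langchain.text_splitter
# instead of langchain_text_splitters.
from bs4 import BeautifulSoup
from langchain_text_splitters import RecursiveCharacterTextSplitter
from sentence_transformers import SentenceTransformer, util


def retrieve_segments(html_pages: list[str], query: str, top_k: int = 5) -> list[str]:
    # 1) Parse HTML down to plain text with BeautifulSoup.
    texts = [BeautifulSoup(page, "html.parser").get_text(" ") for page in html_pages]

    # 2) Chunk the text into retrievable segments with LangChain.
    splitter = RecursiveCharacterTextSplitter(chunk_size=700, chunk_overlap=100)
    segments = [seg for text in texts for seg in splitter.split_text(text)]

    # 3) Embed with bge-base-en-v1.5 and rank segments by cosine similarity.
    encoder = SentenceTransformer("BAAI/bge-base-en-v1.5")
    seg_emb = encoder.encode(segments, normalize_embeddings=True)
    query_emb = encoder.encode([query], normalize_embeddings=True)
    scores = util.cos_sim(query_emb, seg_emb)[0]
    best = scores.argsort(descending=True)[:top_k].tolist()
    return [segments[i] for i in best]
```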

:small_blue_diamond: Tasks 2 & 3: Knowledge Graph + Web Integration

Their architecture evolved with more complex inputs and integrations:

  • Combine structured data (mock KGs) and unstructured web pages
  • Implement a Parent-Child Chunk Retriever for fine-grained retrieval
  • Use a tuned LLM to orchestrate API calls via a controlled, regularised set
  • Perform heavy reranking to ensure only the most relevant data reached the generator
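The Parent-Child Chunk Retriever idea, searching over small child chunks but handing the generator their larger parent passages, can be illustrated with the following sketch; the data structures, chunk size, and model choice are assumptions, not db3's implementation.

```python
# Sketch of a parent-child retriever: match the query against small "child"
# chunks, but return the larger "parent" passage for generation.
from sentence_transformers import SentenceTransformer, util

_ENCODER = SentenceTransformer("BAAI/bge-base-en-v1.5")  # assumed encoder


def build_index(parents: list[str], child_size: int = 200):
    """Split each parent passage into fixed-size children, keeping the mapping."""
    children, parent_of = [], []
    for pid, parent in enumerate(parents):
        for start in range(0, len(parent), child_size):
            children.append(parent[start:start + child_size])
            parent_of.append(pid)
    return children, parent_of, _ENCODER.encode(children, normalize_embeddings=True)


def retrieve_parents(query: str, index, parents: list[str], top_k: int = 3) -> list[str]:
    children, parent_of, child_emb = index
    query_emb = _ENCODER.encode([query], normalize_embeddings=True)
    order = util.cos_sim(query_emb, child_emb)[0].argsort(descending=True).tolist()
    seen, picked = set(), []
    for idx in order:          # fine-grained match ...
        pid = parent_of[idx]
        if pid not in seen:    # ... coarse-grained context for the generator
            seen.add(pid)
            picked.append(parents[pid])
        if len(picked) == top_k:
            break
    return picked
```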

:small_blue_diamond: Hallucination Mitigation

To keep outputs grounded and reliable, the team:

  • Fine-tuned the model to rely only on retrieved evidence
  • Added constraints to reduce overconfident generations
  • Used Python-based calculators for numerical reasoning tasks
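The report does not show the calculator itself; one safe way to build a Python-based calculator tool is to evaluate only whitelisted arithmetic AST nodes, as in this sketch (the whitelist approach is an assumption).

```python
# Sketch of a restricted Python calculator tool: only whitelisted arithmetic
# AST nodes are evaluated, so arbitrary model output cannot run as code.
import ast
import operator

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
        ast.Div: operator.truediv, ast.Pow: operator.pow, ast.USub: operator.neg}


def calculate(expression: str) -> float:
    """Evaluate an arithmetic expression string emitted by the model."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("Disallowed expression")
    return _eval(ast.parse(expression, mode="eval"))


print(calculate("(47.8 - 42.7) / 42.7"))  # e.g. a relative score difference
```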

:busts_in_silhouette: Meet the Team

Jiazun Chen and Yikuan Xia are third-year PhD candidates at Peking University, advised by Professor Gao Jun.

Their research focuses on:

  • Community search in massive graph datasets
  • Graph alignment for cross-domain analysis
  • Table data fusion across heterogeneous sources

:arrows_counterclockwise: What Carries Over from 2024 to 2025?

While the Meta CRAG-MM Challenge 2025 takes a leap into multi-modal and multi-turn territory, several principles from db3’s approach remain highly applicable:

  • Structured + Unstructured Retrieval
    db3’s integration of knowledge graphs and web data directly informs Task 2 of CRAG-MM, which fuses image-KG with web search.

  • Hallucination Mitigation
    Their use of grounded generation and standardised fallback (“I don’t know”) is vital in MM-RAG, where conciseness and truthfulness are tightly evaluated.

  • Reranking and Retrieval Granularity
    Techniques like Parent-Child Chunk Retrieval can be adapted to visual-context-aware retrieval in 2025.

  • LLM-as-Controller
    db3’s LLM-mediated API selection prefigures the multi-turn query orchestration required in this year’s task 3.

:jigsaw: In short: while the modality has evolved, the core disciplines—retrieval quality, grounding, and structured reasoning—remain front and centre. Studying the 2024 winning strategy is still a powerful head start for 2025.


Stay tuned for the next Winner Spotlight—and good luck with your submissions.

Read other winning strategies here: dRAGonRAnGers and md_dh

💬 Feedback & Suggestions

About 1 month ago

Hi @bunnyveil
This issue is now fixed! You should be able to invite new team members now.

The cragmm_search file is missing

About 1 month ago

Hi,

Please check out this post for dataset details

Commonsense Persona-Grounded Dialogue Challenge

CPDC Winner Spotlight: 💡 Strategies to Improve Solutions For Task 2

3 days ago

Hope you’re enjoying Round 1 of the CPDC 2025 Challenge! As you prepare for the upcoming round, we’re excited to share a spotlight on the winning strategies from CPDC 2023. These highlights offer practical insights and implementation tips to help strengthen your approach.

:bulb: The solutions for Task 1 of CPDC 2023 (a dialogue generation task) are most closely related to Task 2 of CPDC 2025, which also focuses on persona-consistent dialogue generation.


:one: Task 1 Winner
:1st_place_medal: First Place: Kaihua Ni
:bulb: Key Insight: Combining LLM Fine-tuning with Advanced Prompt Engineering
Username: @ni_kai_hua

Background: AI graduate from the University of Leeds with experience at Augmentum and CareerBuilder. Specialises in AI, deep learning, and language dynamics.

Winning Strategy:

Two-Pronged Approach:

  • Fine-tuned an LLM to emulate specific individuals
  • Engineered precise, persona-aligned prompts to guide output generation

Key Methods:

Fine-Tuning with Transfer Learning:

  • Used curated datasets (dialogues, writings) aligned with target personas
  • Adapted models to reflect individual styles and semantics

Advanced Prompt Engineering:

  • Defined clear conversational goals
  • Subtly incorporated persona traits
  • Maintained coherence across multiple dialogue turns

Dialogue Coherence:

  • Applied attention window tuning and context control

Custom Evaluation Loop:

  • Built bespoke evaluation metrics aligned with CPDC scoring
  • Iterative refinement based on metrics

Ethical Safeguards:

  • Embedded privacy protections
  • Prevented harmful/inappropriate content
  • Ensured ethical persona emulation

Insight: Demonstrated how LLMs can generate nuanced, human-like dialogue without compromising integrity

:computer: Implementation Tips
Want to apply Kaihua’s approach to your solution? Here are some practical steps:

For the fine-tuning component:

  • Start with a smaller, more efficient LLM as your base model
  • Create a curated dataset that specifically represents your target personas
  • Focus on preserving stylistic elements in your training data, not just semantic content
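As a concrete starting point for the fine-tuning tips above, here is a minimal LoRA sketch using transformers and peft. The base model, LoRA hyperparameters, and toy persona example are illustrative assumptions, not Kaihua's actual setup.

```python
# Minimal LoRA fine-tuning sketch with transformers + peft. The base model,
# hyperparameters, and toy persona data are illustrative assumptions only.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

BASE_MODEL = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # assumed small base model

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# Attach a small LoRA adapter so only a fraction of the weights are trained.
lora_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                         target_modules=["q_proj", "v_proj"],
                         task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Curated persona data: keep stylistic markers, not just the facts.
examples = [
    "User: Nice weather today.\n"
    "Captain (retired sailor, wry humour): Aye, calm seas fool the careless.",
]
batch = tokenizer(examples, return_tensors="pt", padding=True, truncation=True)
# From here, pass `batch` (with labels = input_ids) to transformers' Trainer
# or a simple optimizer loop; the training loop is omitted for brevity.
```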

For the prompt engineering component:

  • Structure your prompts with clear sections for conversation goal, persona traits, and dialogue history
  • Experiment with different attention window sizes to find optimal context retention
  • Implement a simple evaluation loop to measure improvements against CPDC’s scoring criteria
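One simple way to realise the sectioned-prompt and attention-window tips above is a template like the following; the section names, window size, and example values are assumptions.

```python
# Sketch of a sectioned prompt with a simple attention window over history.
PROMPT_TEMPLATE = """## Conversation goal
{goal}

## Persona traits
{traits}

## Dialogue history (most recent last)
{history}

## Instruction
Reply as the persona in one or two sentences, staying consistent with the
traits above and moving the conversation toward the goal."""


def build_prompt(goal: str, traits: list[str], history: list[str],
                 window: int = 6) -> str:
    """Keep only the last `window` turns to control context length."""
    return PROMPT_TEMPLATE.format(
        goal=goal,
        traits="\n".join(f"- {t}" for t in traits),
        history="\n".join(history[-window:]),
    )


print(build_prompt("book a table for two",
                   ["polite", "slightly formal", "works at a bistro"],
                   ["User: Hi, are you open tonight?"]))
```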

:two: Task 1 Runner-Up
:2nd_place_medal: Second Place: Zhiyu Wang
:bulb: Key Insight: Principles-Driven Prompt Engineering for Persona Alignment
Username: @wangzhiyu918
Team: Zhiyu Wang, Puhong Duan, Zhuojun Xie, Wang Liu, Bin Sun, Xudong Kang, Shutao Li

Background: PhD candidate at Hunan University focusing on vision-language understanding, LLMs, and multi-modal LLMs.

Winning Strategy:

Core Focus: Prompt engineering inspired by recent LLM advancements (ChatGPT, LLaMA)

Key Methods:

  • Studied The Art of ChatGPT Prompting guide
  • Based strategy on three principles:
    • Clarity: Specific language for accurate comprehension
    • Conciseness: Avoided unnecessary verbosity
    • Relevance: Ensured alignment with dialogue context and persona
  • Refined prompts using GPT-4
  • Deployed carefully designed prompt (available in their repository)

Insight: The methodical and prompt-focused design contributed to generating highly coherent, persona-aligned responses

:computer: Implementation Tips
Want to apply Zhiyu’s approach to your solution? Here are some practical steps:

Study effective prompting techniques:

  • Review prompting guidelines and best practices from established sources
  • Analyze the structure of successful prompts for persona-based dialogue

Apply the three core principles:

  • Clarity: Replace vague instructions with specific directives
  • Conciseness: Remove redundant or tangential information from prompts
  • Relevance: Ensure every element of your prompt directly contributes to persona alignment

Iterative refinement:

  • Use GPT-4 or similar models to test prompt variations
  • Create a systematic testing framework to compare prompt performance

:three: Task 1 Third Place
:3rd_place_medal: Third Place: Kaidi Yan
:bulb: Key Insight: Strategic Minimalism in Prompt Design
Username: @kevin_yan
Team: Kaidi Yan, Jiayu Liu

Background: Software engineer at a large technology company, primarily working on server-side C++ development, with recent focus on LLMs.

Winning Strategy:

Core Focus: Targeted prompt engineering, carefully adapted to new scoring rules and aimed at simulating natural dialogue flow

Key Methods:

  • Defined clear objective at the start of the prompt
  • Designed special prompts for initial utterances to simulate realistic conversation openers
  • Merged all prior utterances into a single user prompt instead of user/system pairs
  • Post-processed model responses for completeness and fluency
  • Deliberately kept prompt length short to avoid overfitting

Insight: While brevity may have limited peak performance, his approach prioritised adaptability and relevance — a strategic trade-off for generalisation

:computer: Implementation Tips
Want to apply Kaidi’s approach to your solution? Here are some practical steps:

Simplify your prompt structure:

  • Start with a clear, concise objective statement
  • Remove unnecessary complexity and instructions
  • Focus on the essential elements needed for persona alignment

Improve conversation handling:

  • Create specialised handling for conversation starters
  • Experiment with merging dialogue history into unified context
  • Implement lightweight post-processing for response quality
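A merged-history prompt of the kind described above (and in Kaidi's key methods) might be built like this; the message format and wording are illustrative only.

```python
# Sketch of merging all prior turns into one user message instead of
# alternating user/assistant pairs; the wording and format are assumptions.
def merge_history(turns: list[tuple[str, str]], new_user_msg: str) -> list[dict]:
    """turns: list of (speaker, text) pairs, e.g. ("user", "Any news?")."""
    transcript = "\n".join(f"{speaker.capitalize()}: {text}"
                           for speaker, text in turns)
    merged = (f"Conversation so far:\n{transcript}\n\n"
              f"User now says: {new_user_msg}\n"
              "Continue the conversation as the character, in one short reply.")
    return [{"role": "user", "content": merged}]  # a single user message


messages = merge_history([("user", "Any news from the guild?"),
                          ("npc", "Only rumours of a dragon up north.")],
                         "Tell me more about that dragon.")
```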

Balance brevity with performance:

  • Test incrementally shorter prompts while monitoring performance
  • Identify which prompt elements contribute most to score improvement
  • Find the optimal balance between prompt length and effectiveness

Sounding Video Generation (SVG) Challenge 2024

💬 Feedback & Suggestions

3 days ago

Hello,

The organisers at Sony are currently conducting the final round of human evaluation. As per the challenge rules, the top entries on the final leaderboard will be assessed through human evaluation, and the winning teams will be selected based on the results of this subjective assessment.

We are awaiting the outcome from the organisers and will share an update as soon as we receive it.

Thank you for your patience.

