Loading
0 Follower
0 Following
Camaro
Daisuke Yamamoto

Location

JP

Badges

0
0
0

Activity

Jun
Jul
Aug
Sep
Oct
Nov
Dec
Jan
Feb
Mar
Apr
May
Jun
Mon
Wed
Fri

Challenge Categories

Loading...

Challenges Entered

Improve RAG with Real-World Benchmarks | KDD Cup 2025

Latest submissions

See All
graded 288862
graded 288861
graded 288860

Latest submissions

No submissions made in this challenge.

Latest submissions

No submissions made in this challenge.

Understand semantic segmentation and monocular depth estimation from downward-facing drone images

Latest submissions

No submissions made in this challenge.

Latest submissions

No submissions made in this challenge.

A benchmark for image-based food recognition

Latest submissions

See All
graded 177111
failed 177108
graded 176842

What data should you label to get the most value for your money?

Latest submissions

See All
graded 179189
graded 179151
graded 179149

Perform semantic segmentation on aerial images from monocular downward-facing drone

Latest submissions

No submissions made in this challenge.

Generating answers using image-linked data

Latest submissions

See All
graded 288862
graded 288859
graded 288858
Participant Rating
Participant Rating
Camaro has not joined any teams yet...

Meta CRAG - MM Challenge 2025

Clarification about the evaluation process

About 1 month ago

Thanks for the clarification!

Clarification about the evaluation process

About 1 month ago

I see, thank you for the clarification.
I appreciate you reaching out to the organizers. I will proceed on the assumption that we may not receive a response.

Thanks!

Clarification about the evaluation process

About 1 month ago

@yilun_jin8

To be honest, I don’t really understand why you replied, made changes, then deleted your response and are now staying silent about this post.
If a certain question can’t be answered, that’s totally fine. Please just let me know.

Clarification about the evaluation process

About 1 month ago

@yilun_jin8
Any updates? If there are some questions you can’t answer, please let me know so.
Thanks.

Clarification about the evaluation process

About 2 months ago

@yilun_jin8 @mohanty
can you check these questions?

Evaluation Method of Leaderboard

About 2 months ago

Thanks for the quick action!
In my humble opinion, modifying the evaluation prompt is not a solution.
You just need to declare that a prompt injection solution will be eliminated before selecting the top 10 teams.

Evaluation Method of Leaderboard

About 2 months ago

@yilun_jin8
Another question. What if all the top 10 teams in phase 2 will use submissions with prompt injection, like on the current leaderboard? I think the current auto-evaluation method is too weak against prompt injection and can be easily exploited.
I believe manual evaluators would consider such answer invalid(wrong), but since only 10 teams are selected for manual evaluation, there’s a risk that none of the top submissions are meaningful.

Clarification about the evaluation process

About 2 months ago

  1. Regarding participation eligibility, is my understanding correct?
  • Phase 1: All teams can participate
  • Phase 2: Only teams that successfully submit in Phase 1 can participate
  • Final Round: Only the top 10 teams in phase 2 based on automatic evaluation can participate
  1. How is the final submission selected? Can we change from the best leaderboard submission?

  2. Is there no length limit for the final evaluation? (The limit is 75 tokens for automatic evaluation)

    Full responses are manually checked for hallucinations.

  3. How is the generation of the first token detected?

    A 10-second timeout starts after the first token is generated.

  4. How is time per sample measured in the batch generation pipeline?

    Only answer texts generated within 30 seconds are considered.

  5. If we exceed the time limit, will we be immediately disqualified? Or just the sample will be considered as wrong (or missing)?

  6. Is a missing answer required to be an exact match to “I don’t know,” or are similar responses acceptable in manual evaluation? Which of the following statements is correct?

    Missing (e.g., “I don’t know”, “I’m sorry I can’t find …”) → Score: 0.0

    All missing answers should return a standard response: “I don’t know.”

Evaluation Method of Leaderboard

About 2 months ago

I’ve confirmed that now it’s updated. Thanks!

Evaluation Method of Leaderboard

About 2 months ago

Thanks, have you completed the process? It seems that some submissions have not been re-evaluated yet.

Evaluation Method of Leaderboard

About 2 months ago

Yes, it seems that a bug has recently appeared.
It looks like no one has been able to get a “correct” other than exact match.
Even when submitting the exact same commit as a submission that achieved “correct” on April 29, I can’t get the same score.

@yilun_jin8
Did you make any changes to the evaluation metric since then?

Evaluation Method of Leaderboard

About 2 months ago

Thanks!
May I ask why some submissions seems correctly submitted but not graded?

For example:
AIcrowd | Single-source Augmentation | Submissions #283034
AIcrowd | Single-source Augmentation | Submissions #283028” (not mine)

Evaluation Method of Leaderboard

About 2 months ago

Is the evaluation method for the leaderboard scores the same as in local_evaluation.py?
Are the prompts and LLM used for evaluation also the same?
I’m wondering because the scores I get locally don’t match the ones on the leaderboard.

Generative Interior Design Challenge 2024

Can't open baseline starter kit

Over 1 year ago

Now I can access it! Thanks for the quick fix.

Can't open baseline starter kit

Over 1 year ago

I got 404 when I tried to open baseline starter kit.
@snehananavati Could you please check it out?

Data Purchasing Challenge 2022

[Announcement] Leaderboard Winners

About 3 years ago

Thanks! It should be same as other platform like Kaggle, you can just create a discussion thread to share your approach! Of couse it would be the most helpful if you kindly share the code as well, but this competition was very structured so just sharing approach may be eough to understand what leads you to win:)

[Announcement] Leaderboard Winners

About 3 years ago

Big congrats for the winners, especially for @xiaozhou_wang, it seems you won the competition by a large margin! Really curious about your solution, it would be great if you can share with community:)

:rotating_light: Select submissions for final evaluation

About 3 years ago

Hi @shivam @dipam, do you have any timeline for the leaderboard update?

:rotating_light: Select submissions for final evaluation

About 3 years ago

Hi @shivam, is there any progress?

:rotating_light: Select submissions for final evaluation

About 3 years ago

Hi @dipam, thanks for hosting the interesting compeitition!
It seems the competition was finished, when will the leaderborad be finalized?

Camaro has not provided any information yet.