AIcrowd | camaro | Participants

Camaro

Daisuke Yamamoto

Challenge Categories

Challenges Entered

Completed

Latest submissions

graded	288862	Sun, 15 Jun 2025 22:10:30
graded	288861	Sun, 15 Jun 2025 22:09:36
graded	288860	Sun, 15 Jun 2025 22:09:19

Completed

Generative Interior Design Challenge 2024

AIcrowd

Machines Can See Summit

Revolutionising Interior Design with AI

Latest submissions

No submissions made in this challenge.

Completed

Commonsense Persona-Grounded Dialogue Challenge 2023

Sony Group Corporation

Evaluate Natural Conversations

Latest submissions

No submissions made in this challenge.

Completed

Scene Understanding for Autonomous Drone Delivery (SUADD'23)

AIcrowd

Amazon Prime Air

Understand semantic segmentation and monocular depth estimation from downward-facing drone images

Latest submissions

No submissions made in this challenge.

Completed

Sound Demixing Challenge 2023

AIcrowd

Sony Group Corporation

Moises.AI

Mitsubishi Electric Research Laboratories

Audio Source Separation using AI

Latest submissions

No submissions made in this challenge.

Completed

Food Recognition Benchmark 2022

Seerave Foundation

A benchmark for image-based food recognition

Latest submissions

graded	177111	Sat, 19 Mar 2022 14:09:18
failed	177108	Sat, 19 Mar 2022 13:34:18
graded	176842	Tue, 15 Mar 2022 14:29:06

Completed

Data Purchasing Challenge 2022

Leibniz Centre for European Economic Research

What data should you label to get the most value for your money?

Latest submissions

graded	179189	Thu, 7 Apr 2022 21:18:47
graded	179151	Thu, 7 Apr 2022 15:23:10
graded	179149	Thu, 7 Apr 2022 15:11:00

Completed

Semantic Segmentation

AIcrowd

Amazon Prime Air

Perform semantic segmentation on aerial images from monocular downward-facing drone

Latest submissions

No submissions made in this challenge.

Completed

Latest submissions

graded	288862	Sun, 15 Jun 2025 22:10:30
graded	288859	Sun, 15 Jun 2025 22:08:59
graded	288858	Sun, 15 Jun 2025 22:07:08

Participant	Rating

Camaro has not joined any teams yet...

Meta CRAG - MM Challenge 2025

Clarification about the evaluation process

9 months ago

Thanks for the clarification!

Clarification about the evaluation process

9 months ago

I see, thank you for the clarification.
I appreciate you reaching out to the organizers. I will proceed on the assumption that we may not receive a response.

Thanks!

Clarification about the evaluation process

9 months ago

@yilun_jin8

To be honest, I don’t really understand why you replied, made changes, then deleted your response and are now staying silent about this post.
If a certain question can’t be answered, that’s totally fine. Please just let me know.

Clarification about the evaluation process

9 months ago

@yilun_jin8
Any updates? If there are questions you can’t answer, please let me know.
Thanks.

Clarification about the evaluation process

9 months ago

@yilun_jin8 @mohanty
Could you check these questions?

Evaluation Method of Leaderboard

9 months ago

Thanks for the quick action!
In my humble opinion, modifying the evaluation prompt is not a solution.
You just need to declare that solutions using prompt injection will be eliminated before the top 10 teams are selected.

Evaluation Method of Leaderboard

9 months ago

@yilun_jin8
Another question: what if all of the top 10 teams in Phase 2 use submissions with prompt injection, as on the current leaderboard? I think the current auto-evaluation method is too weak against prompt injection and can be easily exploited.
I believe manual evaluators would consider such answers invalid (wrong), but since only 10 teams are selected for manual evaluation, there’s a risk that none of the top submissions are meaningful.

Clarification about the evaluation process

9 months ago

Regarding participation eligibility, is my understanding correct?

Phase 1: All teams can participate
Phase 2: Only teams that successfully submit in Phase 1 can participate
Final Round: Only the top 10 teams in Phase 2, based on automatic evaluation, can participate

How is the final submission selected? Can we choose a submission other than our best one on the leaderboard?
Is there no length limit for the final evaluation? (The limit is 75 tokens for automatic evaluation.)

Full responses are manually checked for hallucinations.
How is the generation of the first token detected?

A 10-second timeout starts after the first token is generated.
How is time per sample measured in the batch generation pipeline?

Only answer texts generated within 30 seconds are considered.
If we exceed the time limit, will we be immediately disqualified? Or will just that sample be considered wrong (or missing)?
Is a missing answer required to be an exact match to “I don’t know,” or are similar responses acceptable in manual evaluation? Which of the following statements is correct?

Missing (e.g., “I don’t know”, “I’m sorry I can’t find …”) → Score: 0.0

All missing answers should return a standard response: “I don’t know.”
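
To make the last question concrete, here is a minimal sketch of the two readings of a "missing" answer. It is only an illustration of the ambiguity, not the organizers' evaluation code. The function names and the loose patterns are hypothetical, chosen to mirror the two quoted statements above.

```python
import re

# Standard response named in the second quoted rule.
STANDARD_MISSING = "I don't know."

# Hypothetical loose patterns, modelled on the examples in the first quoted rule.
LOOSE_PATTERNS = (
    r"^i don't know",
    r"^i'm sorry,? i can't find",
)


def normalize(text: str) -> str:
    """Lowercase, unify curly apostrophes, and drop trailing punctuation."""
    text = text.replace("\u2019", "'").strip().lower()
    return text.rstrip(".!?").strip()


def is_missing_exact(answer: str) -> bool:
    """Strict reading: only the standard response counts as missing."""
    return normalize(answer) == normalize(STANDARD_MISSING)


def is_missing_loose(answer: str) -> bool:
    """Loose reading: any refusal-like opening counts as missing."""
    norm = normalize(answer)
    return any(re.search(pattern, norm) for pattern in LOOSE_PATTERNS)
```

Under the strict reading, "I'm sorry I can't find the answer" would not count as missing and might instead be judged as a wrong answer. Under the loose reading it would score 0.0 as missing. This difference is why the question matters.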

Evaluation Method of Leaderboard

9 months ago

I’ve confirmed that it’s now updated. Thanks!

Evaluation Method of Leaderboard

10 months ago

Thanks, have you completed the process? It seems that some submissions have not been re-evaluated yet.

Evaluation Method of Leaderboard

10 months ago

Yes, it seems that a bug has recently appeared.
It looks like no one has been able to get a “correct” other than by exact match.
Even when submitting the exact same commit as a submission that achieved “correct” on April 29, I can’t get the same score.

@yilun_jin8
Did you make any changes to the evaluation metric since then?

Evaluation Method of Leaderboard

10 months ago

Thanks!
May I ask why some submissions seem to have been submitted correctly but were not graded?

For example:
“AIcrowd | Single-source Augmentation | Submissions #283034”
“AIcrowd | Single-source Augmentation | Submissions #283028” (not mine)

Evaluation Method of Leaderboard

10 months ago

Is the evaluation method for the leaderboard scores the same as in local_evaluation.py?
Are the prompts and LLM used for evaluation also the same?
I’m wondering because the scores I get locally don’t match the ones on the leaderboard.

Generative Interior Design Challenge 2024

Can't open baseline starter kit

About 2 years ago

Now I can access it! Thanks for the quick fix.

Can't open baseline starter kit

About 2 years ago

I got a 404 when I tried to open the baseline starter kit.
@snehananavati Could you please check it out?

Data Purchasing Challenge 2022

[Announcement] Leaderboard Winners

Almost 4 years ago

Thanks! It should be the same as on other platforms like Kaggle: you can just create a discussion thread to share your approach! Of course, it would be most helpful if you kindly shared the code as well, but this competition was very structured, so just sharing the approach may be enough to understand what led you to win :)

Camaro has not provided any information yet.

Notebooks
