0 Follower
0 Following
ben_swain

Organization

Software Engineer

Location

US

Badges

0
0
0

Challenges Entered

Build an LLM agent for five real-world games

Latest submissions

graded 306511

Multi-Agent Dynamics & Mixed-Motive Cooperation

Latest submissions

graded 245514
graded 245513
graded 245480

Latest submissions

graded 231410
failed 231409
graded 231408
Participant Rating

Orak Game Agent Challenge 2025 Forum

Question on Scoring

2 months ago

I think we need clarification on this. I’ll pause work on this competition until then.

Question on Scoring

2 months ago

One issue I see with the tie-breaking criteria is that it is fairly simple to create rule-based scripts that achieve perfect scores in the games. There’s probably already code for this online, so what prevents teams from relying primarily on rule-based scripts to do most of the work, while including a minimal LLM just to ensure the team has the optimal tie-breaking solution?

Since there’s nothing in the rules disallowing this, it is likely that the winning solutions will work this way.

Question on Scoring

2 months ago

The top team on the Track 1 leaderboard currently has a perfect score of 1.0 for all 4 games. I imagine there will soon be more teams with this score. I understand that the final evaluation will include hidden test cases/scenarios, but how can we determine whether our agents are improving relative to other teams if we all have perfect scores of 1.0? Perhaps the current evaluation cases should be more difficult so we can better compare different teams’ progress.

Question about SC2 map setting in starter kit

2 months ago

It would be useful to provide the Ancient Cistern LE map file (along with any other map files required for running evaluation on Linux). I still haven’t been able to find or generate this file after searching through the zips on the AI Arena Wiki Maps page and following the manual patching process in the Maps directory of the histmeisah/Large-Language-Models-play-StarCraftII GitHub repository.

AIcrowd GitLab Offline?

3 months ago

Appears to be back online now.

ben_swain has not provided any information yet.