AIcrowd | maciej_sypetkowski

maciej_sypetkowski

Maciej Sypetkowski

Challenges Entered

Completed

Food Recognition Benchmark 2022

Seerave Foundation

A benchmark for image-based food recognition

Latest submissions

No submissions made in this challenge.

Completed

NeurIPS 2021 - The NetHack Challenge

AIcrowd

ASCII-rendered single-player dungeon crawl game

Latest submissions

graded	163439	Sat, 30 Oct 2021 21:38:38
graded	162868	Fri, 29 Oct 2021 14:55:03
graded	160769	Sat, 23 Oct 2021 15:09:58

AutoAscend NeurIPS 2021 - The NetHack Challenge

NeurIPS 2021 - The NetHack Challenge

🧞 Requesting Feedback and Suggestions

Over 4 years ago

Hey @dipam,

I just gave some feedback about tight time limits in this comment.

Are the specs for the machine that the evaluations are run on available anywhere?

Over 4 years ago

Hello @jyotish,

If the throughput is in the 1500-2000 range, then the maximum average number of steps is only (1500 to 2000) * 0.5 * 3600 / 128 = 21k to 28k per game.
Also note that the example you gave doesn’t really account for the environment step delay, because env.step(1) (1 is CompassDirection.E) becomes a no-op after a few steps, once the character hits the wall (the turn counter stops ticking after that).
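
The arithmetic above can be reproduced with a small sketch. It reads 0.5 * 3600 as a half-hour budget in seconds and 128 as the number of games sharing that budget; both readings, the assumption that throughput is in steps per second, and the function name are illustrative assumptions, not details confirmed in the thread.

```python
def max_avg_steps_per_game(throughput, hours=0.5, games=128):
    """Upper bound on the average number of environment steps per game.

    Assumes the agent itself takes no time, so the whole budget is spent
    stepping the environment at `throughput` steps per second.
    `hours` and `games` are assumed readings of the 0.5 and 128 in the post.
    """
    total_seconds = hours * 3600
    return throughput * total_seconds / games


for throughput in (1500, 2000):
    budget = max_avg_steps_per_game(throughput)
    print(f"{throughput} steps/s -> about {budget / 1000:.0f}k steps per game")
```

Any time the agent spends thinking lowers this bound further, which is why it should be read as a best case.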

28k steps per game is an extremely tight limit if one is going for ascension (we are!). Assuming that ascension needs 200k turns, roughly equivalent to 400k steps, a 28k budget covers only 7% of a full game, so we would have to drop at least 93% of all games early. We already average >40k steps per game and have to resort to hacks to maximize the median score, like quitting after exceeding the median (judging from the leaderboard, the Panic team does the same).
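
A minimal sketch of the 93% estimate, under the simplifying assumption that a dropped game costs no steps at all (which is why the result is a lower bound). The 2 steps per turn ratio comes from the 200k turns to 400k steps figure in the post; the function name is hypothetical.

```python
def min_fraction_of_games_to_drop(avg_steps_budget, steps_per_full_game):
    """Lower bound on the fraction of games that must be abandoned early.

    Assumes abandoned games cost zero steps, so every step of the budget
    goes to games that are played to the end.
    """
    fraction_playable = avg_steps_budget / steps_per_full_game
    return 1.0 - fraction_playable


turns_for_ascension = 200_000  # figure assumed in the post
steps_per_turn = 2             # implied by 200k turns being about 400k steps
steps_for_ascension = turns_for_ascension * steps_per_turn

drop = min_fraction_of_games_to_drop(28_000, steps_for_ascension)
print(f"At least {drop:.0%} of games must be dropped early")
```

In practice abandoned games still use some steps before quitting, so the real fraction would be higher.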

Of course, the assumption that the agent takes no time to execute is not realistic. In our case the environment accounts for ~15% of the total execution time (measured locally, so there is no communication delay).

The goal of the competition is to develop the best agent, but right now it is more of a performance optimization problem. Citing the challenge motivation: “The only restriction is on the compute and runtime during evaluation, though these will be set to very generous limits to support a wide range of possible implementations”. Currently the compute limit is far from “very generous”; for it to be so, we believe the time limit should be increased at least 10 times.

So we encourage you to increase the time limit as much as possible to give participants a better chance to beat the game. With the current limit, ascension is extremely unlikely, but it becomes quite possible if the limit is increased.

Evaluate rollouts timed out

Over 4 years ago

The admins told me that the video recording timed out and that they will release a fix for it soon.

Evaluate rollouts timed out

Over 4 years ago

I got a timeout on the “Evaluate Rollouts” step. I’m not sure what exactly this step does, but I guess it doesn’t involve running the agent. If that’s the case, can you increase the time limit for this step?

What does this step do and why does it take so long? Some of our episodes are quite long (>100k steps); maybe that’s the reason?