You have been heralded from birth as the instrument of the gods. You are destined to recover the Amulet of Yendor for your deity or die in the attempt. Your hour of destiny has come. For the sake of us all: Go bravely!
In this challenge you will complete your quest by designing an agent that can navigate the procedurally generated ASCII dungeons of NetHack, a terminal-based video game complete with dangerous monsters, magical items, and hopefully enough food to survive!
You can design and train your agent however you please — with or without machine learning, using any external information you’d like, and with any training method and computational budget. The only requirement is that you submit an agent that can be evaluated by us (see the Competition Structure section below).
You will be judged by how often your agent successfully ascends with the Amulet. However, since this is a long and arduous quest, it is possible that no agent will succeed, in which case you will be ranked by the median in-game score that your agent achieves during the testing rounds.
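The ranking rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the episode format, and the use of the median score as a tie-breaker between agents with equal ascension counts are hypothetical, not the organizers' evaluation code.

```python
from statistics import median


def ranking_key(episodes):
    """Return (ascension count, median score) for one agent.

    `episodes` is a list of (ascended, score) tuples; this format is an
    assumption made for the sketch.
    """
    ascensions = sum(1 for ascended, _ in episodes if ascended)
    median_score = median(score for _, score in episodes)
    return (ascensions, median_score)


def rank_agents(results):
    """Order agent names best first: more ascensions, then higher median score."""
    return sorted(results, key=lambda name: ranking_key(results[name]), reverse=True)


# Made-up results for three hypothetical agents, three episodes each.
results = {
    "agent_a": [(False, 700), (False, 900), (False, 1200)],
    "agent_b": [(False, 50), (False, 4000), (False, 800)],
    "agent_c": [(True, 30000), (False, 10), (False, 20)],
}
leaderboard = rank_agents(results)
print(leaderboard)
```

When no agent ascends, every ascension count is zero and the ordering falls back to the median score alone, which matches the fallback described above.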
Read on to learn more about the NetHack environment and the competition structure.
⚔️ The NetHack Learning Environment
The NetHack Learning Environment (NLE) is a Reinforcement Learning environment presented at NeurIPS 2020. NLE is based on NetHack 3.6.6 and designed to provide a standard RL interface to the game, and comes with tasks that function as a first step to evaluate agents on this new environment.
NetHack is one of the oldest and arguably most impactful video games in history, as well as being one of the hardest roguelikes currently being played by humans. It is procedurally generated, rich in entities and dynamics, and overall an extremely challenging environment for current state-of-the-art RL agents, while being much cheaper to run compared to other challenging testbeds. Through NLE, we wish to establish NetHack as one of the next challenges for research in decision making and machine learning.
🧪 Challenge Motivation
The NetHack Challenge provides an opportunity for AI researchers, machine learning enthusiasts and the broader community to compete and collaborate to benchmark their solutions on the NetHack Learning Environment (NLE). NLE contains complex, multi-faceted sequential decision-making tasks but is exceptionally cheap to simulate, almost 14x faster than Atari, and we therefore believe it presents one of the most interesting and accessible grand challenges in RL.
We encourage participants to use reinforcement learning (RL) agent architectures, training methods and other machine learning ideas. However, we do not restrict participants to machine learning: agents may be implemented and trained in any manner. The only restriction is on the compute and runtime during evaluation, though these will be set to very generous limits to support a wide range of possible implementations.
🏋️ Competition Structure
The Starter Pack
At the start of the competition, code will be shared in the form of a starter pack, complete with baselines, to allow participants to quickly get started developing their agents. Included will be the ability to evaluate agents against the testing protocol either locally or remotely, as well as to perform integration tests to determine whether their code will run on the evaluation server.
The competition will be split into a development phase and a test phase. During the development phase, participants will be able to submit their agents to the leaderboard once a day, and 512 evaluation runs will be performed to calculate a preliminary place on the dev-phase leaderboard. During this phase the overall number of submissions is not capped, only rate limited.
In mid-October the test phase will begin, and the top 15 participants for each track will be taken from the dev leaderboard and invited to join this test phase. Here participants will be able to submit their best 3 agents to the test-phase leaderboard, and 4096 evaluation runs will be performed to calculate the final ranking. The final results will be presented at NeurIPS 2021!
Evaluation will be done on the default NetHackChallenge-v0 environment, available on the latest version of nle. This environment comes as close as possible to playing the real game of NetHack with a random character to start, and a full keyboard of actions to take.
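As a rough sketch of what a rollout against such an environment might look like, the loop below drives a random policy through one episode. It assumes the classic gym interface, where reset() returns an observation and step() returns an (observation, reward, done, info) tuple. StubEnv, run_episode and random_policy are hypothetical names introduced here so the example runs without nle installed; with nle available, the stub would presumably be replaced by something like gym.make("NetHackChallenge-v0"), which is not exercised here.

```python
import random


class StubEnv:
    """Hypothetical stand-in mimicking the classic gym reset()/step() interface."""

    def __init__(self, num_actions=4, episode_length=10):
        self.num_actions = num_actions
        self.episode_length = episode_length
        self._t = 0

    def reset(self):
        self._t = 0
        return {"t": self._t}

    def step(self, action):
        # Every step yields a reward of 1.0; the episode ends after a fixed length.
        self._t += 1
        done = self._t >= self.episode_length
        return {"t": self._t}, 1.0, done, {}


def run_episode(env, policy, max_steps=1_000_000):
    """Play one episode and return (total reward, number of steps taken)."""
    obs = env.reset()
    total, steps, done = 0.0, 0, False
    while not done and steps < max_steps:
        obs, reward, done, info = env.step(policy(obs))
        total += reward
        steps += 1
    return total, steps


rng = random.Random(0)
env = StubEnv()


def random_policy(obs):
    # Ignores the observation and picks a uniformly random action index.
    return rng.randrange(env.num_actions)


episode_return, episode_steps = run_episode(env, random_policy)
print(episode_return, episode_steps)
```

Keeping the policy as a plain callable means the same loop can wrap a hand-written bot or a learned model.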
The environment will run for a maximum of 1,000,000 steps or 30 minutes (whichever comes first), with each step taking at most 300 seconds, before termination. Similarly, rollouts that fail to generate a score of more than 1000 after 50,000 steps will be terminated. Each dev-phase assessment must run in under 2 hours, and each test-phase assessment in under 24 hours. These restrictions are intended to be generous bounds to prevent abuse of the evaluation system resources, rather than to enforce particular efficiency constraints on agents, and terminated rollouts will receive their score at the point of termination.
It is advisable to test your submissions locally. You will be provided with rollout scripts that you can run to debug failures, benchmark rollouts, and improve your agent after each completed evaluation.
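For local testing, the per-rollout limits above could be mirrored in a small checker like the following. It is a sketch only: RolloutLimits, termination_reason and the reason strings are hypothetical names, and the reading of the score rule (terminate once 50,000 steps have passed with a score of 1000 or less) is an assumption about how the evaluator applies it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolloutLimits:
    max_steps: int = 1_000_000            # maximum steps per rollout
    max_episode_seconds: float = 30 * 60  # 30-minute cap per rollout
    max_step_seconds: float = 300         # cap on a single step
    min_score: int = 1000                 # score that must be exceeded...
    score_check_after_steps: int = 50_000  # ...once this many steps have passed


def termination_reason(steps, elapsed_seconds, last_step_seconds, score,
                       limits=RolloutLimits()):
    """Return a string naming the limit that was hit, or None to keep going."""
    if steps >= limits.max_steps:
        return "max_steps"
    if elapsed_seconds >= limits.max_episode_seconds:
        return "episode_timeout"
    if last_step_seconds > limits.max_step_seconds:
        return "step_timeout"
    if steps >= limits.score_check_after_steps and score <= limits.min_score:
        return "low_score"
    return None


print(termination_reason(50_000, 100.0, 0.1, 1000))
```

Calling the checker after every step of a local rollout gives an early warning if an agent is likely to be cut off on the evaluation server.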
There are three possible tracks that your agent can be evaluated for:
- Best overall agent
- Best agent not using a neural network
- Best agent from an academic/independent team
When you register as a participant we will request the relevant information, and your submission will automatically compete in each track for which it is eligible. Winners will be announced at the NeurIPS 2021 workshop, and will be invited to collaborate on the post-competition report.
The challenge features a Total Cash Prize Pool of $15,000 USD
⚖ This prize pool is equally divided among the 3 tracks.
Track 1: best overall agent
🥇 Winner: $3,000
🥈 Runner up: $2,000
Track 2: best agent not using a neural network
🥇 Winner: $3,000
🥈 Runner up: $2,000
Track 3: best agent from an academic/independent team
🥇 Winner: $3,000
🥈 Runner up: $2,000
❗ Please note: A participant/team is eligible to win multiple tracks in the challenge, provided they qualify for the respective tracks.
- June 9th - October 15th: Development phase
- October 15th - October 31st: Test phase
🚀 Getting started
Make your first submission using the starter kit. 🚀
- Eric Hambro (Facebook AI Research)
- Sharada Mohanty (AIcrowd)
- Edward Grefenstette (Facebook AI Research)
- Minqi Jiang (Facebook AI Research)
- Robert Kirk (University College London)
- Vitaly Kurin (University of Oxford)
- Heinrich Küttler (Facebook AI Research)
- Vegard Mella (Facebook AI Research)
- Nantas Nardelli (University of Oxford)
- Jack Parker-Holder (University of Oxford)
- Roberta Raileanu (New York University)
- Tim Rocktäschel (Facebook AI Research)
- Danielle Rothermel (Facebook AI Research)
- Mikayel Samvelyan (Facebook AI Research)
- Dipam Chakraborty (AIcrowd)
📙 Learn more about NLE
Interview about the environment with Weights&Biases
Papers using the NetHack Learning Environment
- Zhang et al. BeBold: Exploration Beyond the Boundary of Explored Regions (Berkeley, FAIR, Dec 2020)
- Küttler et al. The NetHack Learning Environment (FAIR, Oxford, NYU, UCL, NeurIPS 2020)
If you have any questions, please contact Sharada Mohanty (email@example.com), or consider posting on the Community Discussion board, or join the party on our Discord!