The challenge has now concluded!
🔖 Don't forget to read the post-challenge report here

🚀 Check out the Starter Kit

## 🔥 Introduction

You have been heralded from birth as the instrument of the gods. You are destined to recover the Amulet of Yendor for your deity or die in the attempt. Your hour of destiny has come. For the sake of us all: Go bravely!

In this challenge you will complete your quest by designing an agent that can navigate the procedurally generated ASCII dungeons of NetHack, a terminal-based video game complete with dangerous monsters, magical items, and hopefully enough food to survive!

You can design and train your agent however you please — with or without machine learning, using any external information you’d like, and with any training method and computational budget. The only requirement is that you submit an agent that can be evaluated by us (see the Competition Structure section below).

You will be judged by how often your agent successfully ascends with the Amulet. However, since this is a long and arduous quest, it is possible that no agent will succeed, in which case you will be ranked by the median in-game score that your agent achieves during the testing rounds.
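The ranking rule above can be sketched in a few lines of Python. This is only an illustration, not the organizers' scoring code: the `results` layout, the `rank_agents` name, and the choice to use median score as the tie-breaker among agents with equal ascension counts are all assumptions made for the example.

```python
from statistics import median


def rank_agents(results):
    """Rank agents by ascension count, then by median in-game score.

    `results` maps an agent name to a list of (ascended, score) pairs,
    one pair per evaluation episode. This data layout is hypothetical.
    """
    def sort_key(item):
        _name, episodes = item
        ascensions = sum(1 for ascended, _score in episodes if ascended)
        median_score = median(score for _ascended, score in episodes)
        # Tuples compare element by element, so ascensions dominate and
        # the median score only matters when ascension counts are equal.
        return (ascensions, median_score)

    ranked = sorted(results.items(), key=sort_key, reverse=True)
    return [name for name, _episodes in ranked]


# Made-up episode results, purely for illustration.
example = {
    "agent_a": [(False, 1200), (False, 800), (False, 950)],
    "agent_b": [(False, 400), (True, 30000), (False, 500)],
    "agent_c": [(False, 2000), (False, 1500), (False, 100)],
}
ranking = rank_agents(example)
```

Here `agent_b` ranks first because it is the only agent with an ascension, even though its median score is the lowest. The other two agents are ordered by median score.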

## ⚔️ The NetHack Learning Environment

The NetHack Learning Environment (NLE) is a Reinforcement Learning environment presented at NeurIPS 2020. NLE is based on NetHack 3.6.6 and is designed to provide a standard RL interface to the game. It comes with tasks that serve as a first step for evaluating agents on this new environment.

NetHack is one of the oldest and arguably most impactful video games in history, as well as being one of the hardest roguelikes currently being played by humans. It is procedurally generated, rich in entities and dynamics, and overall an extremely challenging environment for current state-of-the-art RL agents, while being much cheaper to run compared to other challenging testbeds. Through NLE, we wish to establish NetHack as one of the next challenges for research in decision making and machine learning.

You can read more about NLE in the NeurIPS 2020 paper, and about NetHack in its original README, at nethack.org, and on the NetHack wiki.

## 🧪 Challenge Motivation

The NetHack Challenge provides an opportunity for AI researchers, machine learning enthusiasts, and the broader community to compete and collaborate to benchmark their solutions on the NetHack Learning Environment (NLE). NLE contains complex, multi-faceted sequential decision-making tasks but is exceptionally cheap to simulate (almost 14x faster than Atari), and we therefore believe it presents one of the most interesting and accessible grand challenges in RL.

We encourage participants to use reinforcement learning (RL) agent architectures, training methods, and other machine learning ideas. However, we do not restrict participants to machine learning: agents may be implemented and trained in any manner. The only restriction is on compute and runtime during evaluation, and these limits will be generous enough to support a wide range of possible implementations.

## 🏋️ Competition Structure

### The Starter Pack

At the start of the competition, code will be shared in the form of a starter pack, complete with baselines, to allow participants to quickly get started developing their agents. Included will be the ability to evaluate agents against the testing protocol either locally or remotely, as well as to perform integration tests to determine whether their code will run on the evaluation server.

### Competition Phases

The competition will be split into a development and test phase. During the development phase, participants will be able to submit their agents to the leaderboard once a day and 512 evaluation runs will be performed to calculate a preliminary place on the dev-phase leaderboard. During this phase the number of overall submissions is not capped - only rate limited.

In mid-October the test phase will begin, and the top 15 participants for each track will be taken from the dev leaderboard and invited to join this test phase. Here participants will be able to submit their best 3 agents to the test-phase leaderboard, and 4096 evaluation runs will be performed to calculate the final ranking. The final results will be presented at NeurIPS 2021!

### Evaluation Details

Evaluation will be done on the default NetHackChallenge-v0 environment, available on the latest version of nle. This environment comes as close as possible to playing the real game of NetHack with a random character to start, and a full keyboard of actions to take.

The environment will run for a maximum of 1,000,000 steps or 30 minutes (whichever comes first), with each step taking at most 300 seconds, before termination. Similarly, rollouts that fail to generate more than 1000 score after 50,000 steps will be terminated. Each dev-phase assessment must run in under 2 hours, and each test-phase assessment in under 24 hours. These restrictions are intended as generous bounds that prevent abuse of the evaluation system's resources, rather than as efficiency constraints on agents. Terminated rollouts will receive their score at the point of termination.
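For local testing, the per-rollout limits above can be mirrored in a small helper. This is a minimal sketch, not the evaluator's implementation: the function name, the constant names, and the exact boundary conditions (for example, treating a score of exactly 1000 at step 50,000 as a failure) are assumptions read off the description.

```python
# Limits taken from the evaluation description; the names are hypothetical.
MAX_STEPS = 1_000_000
MAX_EPISODE_SECONDS = 30 * 60
MAX_STEP_SECONDS = 300
MIN_SCORE = 1000
MIN_SCORE_CHECK_STEP = 50_000


def termination_reason(steps, elapsed_seconds, last_step_seconds, score):
    """Return why a rollout would be cut off, or None if it may continue."""
    if steps >= MAX_STEPS:
        return "step limit"
    if elapsed_seconds >= MAX_EPISODE_SECONDS:
        return "time limit"
    if last_step_seconds > MAX_STEP_SECONDS:
        return "slow step"
    # Assumption: the score check applies from step 50,000 onwards and
    # requires strictly more than 1000 points to continue.
    if steps >= MIN_SCORE_CHECK_STEP and score <= MIN_SCORE:
        return "low score"
    return None
```

Calling this once per step in a local rollout loop gives an early warning that an agent is likely to be terminated on the evaluation server, with the caveat that the server's exact checks may differ.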

We advise testing your submissions locally. You will be provided with rollout scripts that you can run to debug failures, benchmark rollouts, and improve your agent after each completed evaluation.

### Tracks

There are four possible tracks that your agent can be evaluated for:

• Best overall agent
• Best agent substantially using a neural network
• Best agent not using a neural network
• Best agent from an academic/independent team

Please refer to the challenge rules for the full details of each track.

When you register as a participant, we will request the relevant information, and your submission will automatically compete in each track for which it is eligible. Winners will be announced at the NeurIPS 2021 workshop and will be invited to collaborate on the post-competition report.

## 💰 Prizes

🥈 Runner up: $2,000

### Track 4: best agent from an academic/independent team

🥇 Winner: $3,000

🥈 Runner up: $2,000

❗ Please note: A participant/team is eligible to win multiple tracks in the challenge. (This is applicable if you qualify for the respective tracks)

## ⏱️ Timeline

• June 9th - October 15th: Development phase
• October 15th - October 31st: Test phase

## 🚀 Getting started

Make your first submission using the starter kit. 🚀

## 📝 Post-Challenge Report

So, what happened in the challenge, and what were the interesting results? 😄

You can read the post-challenge report here to find out!

## 🤖 Team

• Eric Hambro (Facebook AI Research)
• Edward Grefenstette (Facebook AI Research)
• Minqi Jiang (Facebook AI Research)
• Robert Kirk (University College London)
• Vitaly Kurin (University of Oxford)
• Heinrich Küttler (Facebook AI Research)
• Vegard Mella (Facebook AI Research)
• Nantas Nardelli (University of Oxford)
• Jack Parker-Holder (University of Oxford)
• Roberta Raileanu (New York University)
• Tim Rocktäschel (Facebook AI Research)
• Danielle Rothermel (Facebook AI Research)
• Mikayel Samvelyan (Facebook AI Research)
• Dipam Chakraborty (AIcrowd)

### Interview about the environment with Weights & Biases

Facebook AI Research’s Tim & Heinrich on democratizing reinforcement learning research

### Papers using the NetHack Learning Environment

We’d like to thank our sponsors for their contributions to the NetHack Challenge: