Round 1: 9 days left #reinforcement_learning

🚀 Starter kit - Everything you need to submit

📃 Project Page - Documentation, API reference, and tutorials

👨‍💻 Google Colab - Starter notebook with free GPU acceleration

Chat on Discord - Join our Discord for announcements, support, and discussion

📅 Timeline

  • Round 1: Robustness (June 30th - September 30th)
  • Rounds 2-3: Teamwork and Scale (October 1st - December 14th)
  • Team formation deadline (December 9th)

🔦 Introduction

In this challenge, you will design and build agents that can survive and thrive in a massively multiagent environment full of potential adversaries. Explore Neural MMO's procedurally generated maps, scavenge for resources, and acquire equipment to protect yourself while preventing other participants from doing the same.

You may use scripted, learned, or hybrid approaches incorporating any information and leveraging any computational budget for development. The only requirement is that you submit an agent that we can evaluate (see the Competition Structure section below).

Your agents will score points by completing high-level foraging, combat, and exploration objectives. Your agents will compete in tournaments against scripted bots and agents designed by other participants. We will assign a skill rating to your policy based on task completion, taking into account the skill of your opponents. The policy with the highest skill rating wins.

🎮 Neural-MMO

Neural-MMO is a platform for massively multiagent research featuring hundreds of concurrent agents, multi-thousand-step time horizons, high-level task objectives, and large, procedurally generated maps. Unlike game genres typically considered in reinforcement learning and agent-based intelligence in general, Massively Multiplayer Online (MMO) games simulate persistent worlds that support rich player interactions and a wider variety of progression strategies. These properties seem important to intelligence in the real world, and the objective of this competition is to spur agent-based research on increasingly general environments.

You can read more on the Neural-MMO project page.

Neural-MMO demo video

✊ Challenge motivation

The Neural-MMO Challenge provides a unique opportunity for participants to produce and test new methods for

  • Learning over long horizons
  • Navigating multi-modal and loosely specified tasks
  • Robustness to competing agents from other participants not seen during training
  • Training cooperative many-agent policies (Rounds 2 & 3)

Full-scale MMOs are among the most complex games developed, are not typically open-source, and are not computationally accessible to most researchers. We have therefore chosen to build Neural MMO from the ground up to capture key elements of the genre while remaining efficient for training and evaluation. At the same time, we emphasize the importance of scripted baselines and support hand-coded submissions.

๐Ÿ‹๏ธ Competition structure

The Neural-MMO Challenge provides a unique opportunity for participants to explore robustness and teamwork in a massively multiagent setting with opponents not seen during training. There are three rounds in the competition. There are no qualifiers; you can submit to any or all of the rounds independently.

Starter kit

Neural MMO is fully open-source and includes scripted and learned baselines with all associated code. We provide a starter kit with example submissions, local evaluation tools, and additional debugging utilities. The documentation provided in the starter kit walks you through installing dependencies and setting up the environment. Our goal is to enable you to make your first test submission within a few minutes of getting started.

Competition rounds

Round 1: Robustness

Create a single agent. You will be evaluated in a free-for-all against other participants on 128x128 maps with 128 agents for 1024 game ticks (time steps).

Round 2: Teamwork

Same as Round 1, but now you will control a team of 8 agents. Your score will be computed using the union of achievements completed by any member of your team.

Round 3: Scale

Control a team of 32 agents on 1024x1024 maps with 1024 agents for 8192 game ticks.
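The three round descriptions above can be collected into one small table in code, which is handy when you want local test settings to mirror a round. This is only a sketch: `RoundSpec`, its field names, and `num_teams` are hypothetical and are not the Neural MMO config classes. The Round 2 values assume that "same as Round 1" refers to map size, population, and horizon, and `num_teams` assumes the population splits evenly into equal-sized teams.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoundSpec:
    """Headline parameters of one competition round (illustrative only)."""
    name: str
    map_size: int    # maps are map_size x map_size tiles
    num_agents: int  # agents present in one environment
    team_size: int   # agents controlled by a single submission
    horizon: int     # game ticks per evaluation

    @property
    def num_teams(self) -> int:
        # Assumption: the population splits evenly into equal-sized teams.
        return self.num_agents // self.team_size


ROUNDS = [
    RoundSpec("Round 1: Robustness", 128, 128, 1, 1024),
    RoundSpec("Round 2: Teamwork", 128, 128, 8, 1024),
    RoundSpec("Round 3: Scale", 1024, 1024, 32, 8192),
]

for spec in ROUNDS:
    print(f"{spec.name}: {spec.map_size}x{spec.map_size} map, "
          f"{spec.num_agents} agents, teams of {spec.team_size}, "
          f"{spec.horizon} ticks")
```

The authoritative settings are the environment configs shipped with the starter kit; treat the numbers here as a summary of the text, not a replacement for those configs.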

🎖 Evaluation

Evaluation configuration

You may script or train agents independently of the evaluation setting: environment modifications, domain knowledge, and custom reward signals are all fair game. We will evaluate your agents using the following environment configs:

Round      Environment config
Round 1    CompetitionRound1
Round 2    CompetitionRound2
Round 3    CompetitionRound3

These configs are available in the competition branch of Neural MMO and are part of the starter kit. Barring large errors, each config will be fixed once its round starts. Until then it may still change; for example, we may tweak the configs for rounds 2 & 3 before they launch.

Tournament evaluations

We will evaluate your agent in two stages.

Stage 1: Versus Scripted Bots

We will evaluate your agent against scripted baselines of a variety of skill levels. Your objective is to earn more achievement points (see Evaluation Metrics) than your opponents. We will estimate your agent's relative skill, or matchmaking rating (MMR), based on several evaluations on different maps. We generate these maps using the same algorithm and parameters as provided in the starter kit, but we will use a different random seed so that the evaluation maps are not ones your agent could have trained on directly.

Stage 2: Versus Other Participants

We will evaluate your agents against models submitted by other participants as well as our baselines. When there are only a few submissions at the start of the competition, we will take a uniform sample of agents for each tournament. Once we have enough submissions from the participants, we will run tournaments by sampling agents of similar estimated skill levels. Your objective is still to earn more achievement points than your opponents.
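The two sampling regimes described above (uniform while the pool is small, skill-matched once it grows) can be sketched as follows. This is an illustration, not the organizers' matchmaking code: the function name `sample_opponents`, the `min_pool` switch-over size, and the choice to sample from the `2 * k` nearest-rated agents are all assumptions made for the example.

```python
import random
from typing import Dict, List, Optional


def sample_opponents(ratings: Dict[str, float], focus: str, k: int,
                     min_pool: int = 32,
                     rng: Optional[random.Random] = None) -> List[str]:
    """Pick k opponents for `focus` from a pool of rated agents."""
    rng = rng or random.Random()
    others = [name for name in ratings if name != focus]
    k = min(k, len(others))
    if len(ratings) < min_pool:
        # Few submissions: uniform sample over the whole pool.
        return rng.sample(others, k)
    # Enough submissions: sample among the agents closest in rating.
    others.sort(key=lambda name: abs(ratings[name] - ratings[focus]))
    window = others[:2 * k]  # hypothetical "similar skill" window
    return rng.sample(window, k)


pool = {f"agent_{i}": float(i) for i in range(100)}
print(sample_opponents(pool, "agent_50", 5, rng=random.Random(0)))
```

Sampling from a window rather than always taking the nearest neighbours keeps the opponents varied between tournaments while still matching on skill.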

Compute budget for the agent

You may use any resources you like for training and development but are limited in CPU time and memory for each evaluation. These are set relatively high -- our objective is not to force aggressive model optimization or compression. The exact budget varies per round.

You have a limited number of submissions per day. Again, this budget is set per round and is intended to keep evaluation costs manageable rather than to place a significant constraint on development. Making alt accounts to bypass this limit will result in disqualification.

Once your agent has passed the scripted evaluation, we will include it in the tournament pool, which carries the same CPU and memory limits. If the tournament pool gets too large to evaluate on our machines, we will prune old models with similar performance scores from the same participants (for example, virtually identical submissions).

โš–๏ธ Evaluation metrics

Your agent will be awarded 0-100 points in each tournament based on completing tasks from the achievement diary below. Score 4 points for easy (green) tasks, 10 points for normal (orange) tasks, and 25 points for hard (red) tasks. You only earn points for the highest tier task you complete in each category. The thresholds for each tier for the first round are given in the figure below. Later rounds will feature the same tasks but with different thresholds for completion.

Since achievement score varies against different opponents, we will only report Matchmaking Rating (MMR) on the leaderboard, but you will still have access to achievement scores.
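The point values above translate into a small scoring function. This is a minimal sketch, assuming each agent's result is recorded as the highest tier it reached per category. The category names used here are placeholders, because the real categories and thresholds come from the achievement diary figure. `team_score` is one reading of the Round 2 "union of achievements" rule: take the best tier any team member reached in each category.

```python
from typing import Dict, Iterable, Optional

# Points per tier, as stated in the text above.
TIER_POINTS = {"easy": 4, "normal": 10, "hard": 25}

# Maps a category name to the highest tier completed (None if no tier).
Diary = Dict[str, Optional[str]]


def agent_score(diary: Diary) -> int:
    """Sum the points of the highest tier completed in each category."""
    return sum(TIER_POINTS[tier] for tier in diary.values() if tier is not None)


def team_score(members: Iterable[Diary]) -> int:
    """Score the union of achievements: best tier per category over the team."""
    best: Diary = {}
    for diary in members:
        for category, tier in diary.items():
            if tier is None:
                continue
            current = best.get(category)
            if current is None or TIER_POINTS[tier] > TIER_POINTS[current]:
                best[category] = tier
    return agent_score(best)


# Placeholder category names; only the tier labels come from the text.
solo = {"exploration": "hard", "foraging": "normal",
        "combat": "easy", "equipment": None}
print(agent_score(solo))  # 25 + 10 + 4 = 39

team = [{"exploration": "easy", "combat": "hard"},
        {"exploration": "normal"}]
print(team_score(team))  # 10 + 25 = 35
```

With 25 points as the top tier and a 0-100 range, four categories at the hard tier give the maximum score, which is why the example uses four placeholder categories.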

๐Ÿ† Winner contributions

Based on MMR, we will invite the top three teams from each round to contribute material detailing their approaches and to be included as authors on a summary manuscript at the end of the competition. If all of the top three submissions include heavily scripted elements, we will also include the best learned approach as a fourth. We may include additional honourable mentions at our discretion for academically interesting approaches, such as those using exceptionally little compute or minimal domain knowledge. Honourable mentions will be invited to contribute a shorter section to the paper and have their names included inline.

We strongly encourage but do not require winners to open-source their code.

📞 Contact

Discord is our main contact and support channel.

🔗 Discord Channel

If you have a longer discussion topic not well suited to a live text channel, you may post on the Discussion Forum.

🔗 Discussion Forum

We strongly encourage you to use the public channels mentioned above for communications to the organizers. But if you're looking for a direct communication channel, feel free to reach out to us at:

🤖 Team

  • Joseph Suarez (MIT)
  • Siddhartha Laghuvarapu (AIcrowd Research)
  • Sharada Mohanty (AIcrowd)
  • Dipam Chakraborty (AIcrowd Research)
  • Jyotish Poonganam (AIcrowd)
  • Phillip Isola (MIT)
