AIcrowd | IJCAI 2022 - The Neural MMO Challenge

IJCAI2022 NMMO: Completed #reinforcement_learning

Parametrix.ai & MIT & THU_SIGS & AIcrowd

🚀Starter kit - Everything you need to submit.

📃Project Page - Documentation, API reference, and tutorials.

📓Baseline - A training baseline for RL enthusiasts based on torchbeast. (It can reach a 0.5 Top1 Ratio after 2 days of training and 0.8 after 4 days.)

🔎Web viewer - A web replay viewer for our challenge (Usage details).

❓Support - The support channel can help with any questions, issues, or topics you would like to discuss.

Welcome🎉 Our 2nd Neural MMO challenge is in conjunction with IJCAI 2022 competitions, with more prizes, challenges, and fun!🎉🎉

Announcement: The IJCAI2022-Neural MMO PvE and PvP submission deadline is June 30th, 11:59 PM PST (July 1st, 2:59 PM GMT+8).

The PvP Final Evaluation has two stages. In the first stage, each team plays 1,000 tournaments and is ranked by SR. Teams in the top 16 after the first stage automatically advance to the second stage, where they play an additional 1,000 tournaments and are ranked by average achievement score. The second stage determines the final top 16 ranking.
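The two-stage ranking described above can be sketched as follows. This is only an illustration, not the organizers' evaluation code: the function name `final_pvp_ranking`, the dictionary inputs, and the team names are hypothetical, and tie-breaking is not specified in the rules, so ties here simply keep their input order.

```python
from typing import Dict, List


def final_pvp_ranking(stage1_sr: Dict[str, float],
                      stage2_avg_achievement: Dict[str, float],
                      cutoff: int = 16) -> List[str]:
    """Return the final ordering of the teams that reach the second stage."""
    # Stage 1: order teams by SR, highest first, and keep the top `cutoff`.
    finalists = sorted(stage1_sr, key=stage1_sr.get, reverse=True)[:cutoff]
    # Stage 2: re-rank only the finalists by average achievement score.
    return sorted(finalists,
                  key=lambda team: stage2_avg_achievement[team],
                  reverse=True)


if __name__ == "__main__":
    # Hypothetical numbers, with a cutoff of 2 instead of 16 for brevity.
    sr = {"team_a": 30.0, "team_b": 25.0, "team_c": 20.0}
    avg = {"team_a": 10.0, "team_b": 40.0}
    print(final_pvp_ranking(sr, avg, cutoff=2))
```

Note that a team with a high SR in the first stage can still drop in the final ranking if its average achievement score in the second stage is lower than that of the other finalists.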

How to attend the PvP final tournament?
1. The latest submission whose name ends with the "-pvp" suffix and that scores 25+ achievement points in Stage 1 will automatically enter the PvP matches (example: "my-submission-v4-pvp").
2. Otherwise, your latest submission that scores 25+ achievement points in PvE Stage 1 will be entered in the PvP matches.
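As a rough sketch of the two selection rules above, assuming a team's submissions are available as a chronologically ordered list of (name, Stage 1 achievement score) pairs. The function name `pick_pvp_submission` and the data layout are hypothetical and are not part of the AIcrowd submission system.

```python
from typing import List, Optional, Tuple


def pick_pvp_submission(submissions: List[Tuple[str, float]],
                        threshold: float = 25.0) -> Optional[str]:
    """Pick the submission that would enter PvP, or None if none qualifies.

    `submissions` is assumed to be ordered from oldest to newest.
    """
    # Only submissions that reach the Stage 1 score threshold are eligible.
    qualified = [name for name, score in submissions if score >= threshold]
    # Rule 1: prefer the latest qualified submission tagged with "-pvp".
    tagged = [name for name in qualified if name.endswith("-pvp")]
    if tagged:
        return tagged[-1]
    # Rule 2: otherwise fall back to the latest qualified submission.
    return qualified[-1] if qualified else None


if __name__ == "__main__":
    history = [
        ("my-submission-v3", 31.0),
        ("my-submission-v4-pvp", 27.5),
        ("my-submission-v5", 40.0),
    ]
    print(pick_pvp_submission(history))
```

In this example the tagged submission is chosen even though a newer, higher-scoring untagged submission exists, which is how the rules read as written.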

See more Bonus Stage rules and reward details. 🚴 🚴

IJCAI2022-Neural MMO Env Update

Because of some minor bugs in the attack calculation method, the IJCAI2022-Neural MMO environment will be updated. Install the update and check the version with the following commands:

pip install git+http://gitlab.aicrowd.com/henryz/ijcai2022nmmo.git
python -c "import nmmo; assert nmmo.version == '1.5.3.17.a7'; print('ok')"

📑Description

Introduction

Survive and thrive in large open-worlds full of adversaries and competitors. Explore, forage, and fight your way through Neural MMO’s procedurally generated environments. Outcompete other participants trying to do the same.

Your task is to implement a policy: a controller that defines how an agent team will behave within an environment. You may use scripted, learned, or hybrid approaches incorporating any information and leveraging any computational budget for development. Your agents will compete in tournaments against agents designed by other participants. We will assign a skill rating to your policy based on task completion, taking into account the skill of your opponents. The policy with the highest skill rating wins.

Neural MMO

Neural MMO is an open-source research platform that simulates populations of agents in procedurally generated virtual worlds. It is inspired by classic massively multiplayer online role-playing games (MMORPGs, or MMOs for short) as settings where lots of players using entirely different strategies interact in interesting ways. Unlike other game genres typically used in research, MMOs simulate persistent worlds that support rich player interactions and a wider variety of progression strategies. These properties seem important to intelligence in the real world, and the objective of this competition is to spur research towards increasingly general and cognitively realistic environments.

You can read more on the Neural MMO project page.

Challenge Motivation

The challenge focuses on robustness to new maps and new opponents. The team design introduces cooperation and specialization to different roles on top of this. These elements of intelligence are important in the real world and are not well explored in modern AI research.

What's more, we have added gamification ideas this time to make the competition memorable for every participant. The concepts of stages and arena are introduced in the PvE and PvP tracks. We hope you enjoy them! 😄

🏋️ Competition structure

Evaluation environment

Your policy will control a team of 8 agents and will be evaluated in a free-for-all against other participants on 128x128 maps with 15 other teams for 1024 game ticks (time steps). Score points by completing tasks in the environment centring around exploration, foraging, combat, and equipment. You will get the score points for a task if any of your 8 agents accomplish it. There are no bonus points for completion by multiple agents on the same team, so it may be beneficial to have different agents perform different roles.

Starter kit

Neural MMO is fully open-source and includes scripted and learned baselines with all associated code. We provide a starter kit with example submissions, local evaluation tools, and additional debugging utilities. The documentation in the starter kit provided will walk you through installing dependencies and setting up the environment. Our goal is to enable you to make your first test submission within a few minutes of getting started.

🤼 Evaluation

You may script or train agents independent of the evaluation setting: environment modifications, domain knowledge, and custom reward signals are all fair game. The starter kit includes the config files that will be used in the evaluation. Note that you will not have access to the random seed used to generate evaluation maps. Barring large errors, configs will not change during the competition.

Each team may make up to 5 successful submissions to the competition per day. Creating alternate accounts to bypass this limit will result in disqualification.

We will evaluate your agent in two settings.

PvE - Player's Policy vs Built-in AIs (of Different Tiers)

Three stages are designed in PvE, which differ in the tiers of the built-in AIs.

Stage 1: We will evaluate agents against scripted baselines immediately upon submission. The scripted baselines are open-source, so the evaluation environment is accessible during your training procedure. Your objective is to score more points than your opponents. This stage is intended to help new participants get familiar with the challenge quickly.

Stage 2/3: We will evaluate agents against two tiers of neural network baselines, which are carefully designed by the organizers and trained by Parametrix.AI. These neural network baselines will not be open-sourced during the competition. These built-in AIs are much more powerful than the scripted AIs, and the AIs in Stage 3 will be more powerful than those in Stage 2.

The Bonus Stage is an additional PvE stage in which the built-in AIs are designed by three participant teams. The Bonus Stage Designer Team consists of the teams "here", "doubleZ", and "master_kong_kong".

Bonus Stage: We will evaluate agents against the Designer Team's agents, which are carefully selected and designed by the Bonus Stage Designer Team members. Your objective is to score more points than the Designer Team's agents. This stage is intended to make participants' policies more robust by pitting them against built-in AIs with a wider variety of policies.

The reward details of each stage are given in the Winner contributions section.

PvP - Player's Policy vs Other Players' Policies

Once per week (adjusted according to the number of participants), we will run matchmaking to throw your newest qualified policy into games against other participants' submissions. Your objective is to earn more achievement points than your opponents. We will use match results to update a TrueSkill rating estimate. This TrueSkill rating will be posted to the leaderboards.

Qualification: Participants who achieve an achievement score of 25 or more in the Stage 1 built-in AI environment qualify for the PvP Leaderboard.

Compute budget for the agent

You may use any resources you like for training and development but are limited in CPU time (600 ms/game tick) and storage (500 MB) for each evaluation. These limits are set relatively high: our objective is not to force aggressive model optimization or compression.

Again, this budget is intended to keep evaluation costs manageable rather than to place a significant constraint on development.
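To check locally whether a policy stays within the per-tick limit, something like the following sketch may help. It assumes the policy can be called as a plain Python function on one observation; `timed_step` and `policy_step` are hypothetical names, and how the evaluator itself measures CPU time is not stated here, so treat the result as a rough estimate only.

```python
import time
from typing import Any, Callable, Tuple

TICK_BUDGET_MS = 600.0  # per-tick CPU time limit quoted above


def timed_step(policy_step: Callable[[Any], Any],
               observation: Any) -> Tuple[Any, float, bool]:
    """Run one policy step and report its CPU time and whether it fits the budget."""
    # process_time() counts CPU time of the current process, not wall-clock time.
    start = time.process_time()
    action = policy_step(observation)
    elapsed_ms = (time.process_time() - start) * 1000.0
    return action, elapsed_ms, elapsed_ms <= TICK_BUDGET_MS


if __name__ == "__main__":
    # A trivial stand-in policy; replace it with your own step function.
    action, elapsed_ms, within_budget = timed_step(lambda obs: obs * 2, 21)
    print(action, within_budget)
```

Because `time.process_time()` only covers the current process, work done in subprocesses or on other devices would need to be measured separately.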

Evaluation metrics

Your agent will be awarded 0-84 points in each tournament based on completing tasks from the achievement diary below. Easy (green) tasks are worth 4 points, normal (orange) tasks 10 points, and hard (red) tasks 21 points. Points within a category do not add up: once you complete tasks in a category, only the highest-scoring one counts. The thresholds for each category are given in the figure below.
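A minimal sketch of this scoring rule, under two assumptions: the four categories are the exploration, foraging, combat, and equipment task groups mentioned earlier (which matches the 84-point maximum, 4 x 21), and each agent's progress is already summarized as a tier per category (0 = none, 1 = easy, 2 = normal, 3 = hard). The names `team_score`, `TIER_POINTS`, and `CATEGORIES` are hypothetical, and the actual thresholds for each tier are not reproduced here.

```python
from typing import Dict, Iterable

# Points per tier: 0 = no task completed, 1 = easy, 2 = normal, 3 = hard.
TIER_POINTS = {0: 0, 1: 4, 2: 10, 3: 21}
# Assumed category names, taken from the evaluation environment description.
CATEGORIES = ("exploration", "foraging", "combat", "equipment")


def team_score(agent_tiers: Iterable[Dict[str, int]]) -> int:
    """Score a team given the highest tier each agent reached per category."""
    agents = list(agent_tiers)
    total = 0
    for category in CATEGORIES:
        # A task counts for the team if any agent completed it, and only the
        # highest tier reached in a category scores; tiers do not add up.
        best_tier = max((tiers.get(category, 0) for tiers in agents), default=0)
        total += TIER_POINTS[best_tier]
    return total


if __name__ == "__main__":
    team = [
        {"exploration": 3, "combat": 1},
        {"combat": 2, "foraging": 1},
    ]
    print(team_score(team))  # 21 + 4 + 10 + 0 = 35
```

Since duplicate completions earn nothing extra, the score only improves when some agent raises the best tier in a category, which is why role specialization within a team can pay off.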

In the PvE Leaderboard, the two main ranking metrics are Top1 Ratio and submission time. A Top1 is a tournament against the built-in AIs in which your team earns the highest achievement score. Teams are ranked first by Top1 Ratio; when Top1 Ratios are equal, the earlier submission ranks higher.

Since achievement scores vary against opponents, we will only use the TrueSkill rating to rank the PvP Leaderboard.
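The PvE ordering can be sketched as a two-key sort. This assumes that Top1 Ratio is the fraction of tournaments in which a team earned the highest score, and that each leaderboard entry is a (team, Top1 Ratio, submission time) tuple; the function names and sample entries are hypothetical.

```python
from datetime import datetime
from typing import List, Tuple


def top1_ratio(was_top1: List[bool]) -> float:
    """Fraction of tournaments in which the team had the highest score."""
    return sum(was_top1) / len(was_top1) if was_top1 else 0.0


def rank_pve(entries: List[Tuple[str, float, datetime]]) -> List[str]:
    """Order teams by Top1 Ratio (descending), then by earlier submission time."""
    ordered = sorted(entries, key=lambda entry: (-entry[1], entry[2]))
    return [team for team, _, _ in ordered]


if __name__ == "__main__":
    entries = [
        ("late_team", 0.8, datetime(2022, 5, 20, 12, 0)),
        ("early_team", 0.8, datetime(2022, 5, 10, 9, 30)),
        ("top_team", 1.0, datetime(2022, 6, 1, 8, 0)),
    ]
    print(rank_pve(entries))
```

Here the two teams tied at 0.8 are separated by submission time, so the team that submitted earlier ranks higher.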

Final evaluation

At the submission deadline on June 30th, submissions will be locked. From July 1st to July 15th, we will continue to run PvP tournaments with more rollouts. After this period, the leaderboard will be finalized and used to determine the ranking-based prizes.

📅 Timeline

April 14th, 2022 – Starter kit releases, PvE begins and submission system opens
May 5th, 2022 – PvP Tournament begins
June 23rd, 2022 – Entry and team formation deadline
June 30th, 2022 – Final submission deadline
July 1st, 2022 - July 15th, 2022 – Final PvP evaluation
July 16th, 2022 – Winners announced

Note: The competition organizers reserve the right to update the competition's timeline if they deem it necessary. All deadlines are at 11:59 PM PST on the corresponding day unless otherwise noted.

🏆 Winner contributions

In addition to the cash prizes listed below, we will invite the top three teams from each track to co-author a summary manuscript at the end of the competition. At our discretion, we may also include honourable mentions for academically interesting approaches, such as those using exceptionally little compute or minimal domain knowledge. Honourable mentions will be invited to contribute a shorter section to the paper and have their names included inline.

We strongly encourage but do not require winners to open-source their code.

PvE Leaderboard

Pioneer Award per Stage: Awarded to the first participant to achieve a 1.0 Top1 Ratio in the built-in AI environment of a stage.

Total Pioneer Award money is about $3,500;
The first participant to reach a 1.0 Top1 Ratio in Stage 1/2/3 will receive $300/$700/$1,000;
The first participant to reach a 1.0 Top1 Ratio in the Bonus Stage will receive $1,500 (to be fair, Bonus Stage Designer Team members are excluded from the Bonus Stage leaderboard ranking). If no participant reaches a 1.0 Top1 Ratio, the Designer Team members will split the $1,500;
Each team can only claim the prize once in Stage 1/ 2/ 3 (When a team wins two or more Pioneer Awards, the highest Pioneer Award will be awarded by default);

Sprint Award: Each fortnight, the top 3 on the highest stage PvE Leaderboard will receive certificates.

PvP Leaderboard

Main Prize Distribution: Monetary prizes are distributed to the top 16 teams on the PvP Leaderboard at the end of the competition; the top 64 teams will receive customized certificates.

PvP Aficionado: A customized certificate and rewards are awarded to the first-place team on a single metric: most kills.

Sore Feet: A customized certificate and rewards are awarded to the first-place team on a single metric: most exploration.

Note: All winners will be invited to co-author a summary of the whole competition for IJCAI2022!

📞 Support

🔗Discord Channel Discord is our main contact and support channel. Any questions and issues can also be posted on the discussion board.

💬AIcrowd Discussion Channel The AIcrowd discussion board is a quick way to ask questions, start discussions, and report issues.

📃FAQ Document The FAQ document collects frequently asked questions and their answers, and we will continue to update it during the competition.

🐧QQ Our second support channel is Tencent QQ. The QQ group number is 1011878149.

👨‍🔬Professional inquiries

Joseph Suarez (jsuarez@mit.edu)
Jiaxin Chen (jiaxinchen@chaocanshu.ai)
Henry Zhu (henryzhu@chaocanshu.ai)
Jyotish Poonganam (jyotish@aicrowd.com)
Sharada Mohanty (mohanty@aicrowd.com)

🤖 Team

Joseph Suarez (MIT)
Jiaxin Chen (Parametrix.AI)
Henry Zhu (Parametrix.AI)
Bo Wu (Parametrix.AI)
Hanmo Chen (Parametrix.AI, Tsinghua University)
Xiaolong Zhu (Parametrix.AI)
Chenghui Yu (Parametrix.AI, Tsinghua University)
Jyotish Poonganam (AIcrowd)
Vrushank Vyas (AIcrowd)
Shivam Khandelwal (AIcrowd)
Sharada Mohanty (AIcrowd)
Phillip Isola (MIT)
Julian Togelius (New York University)
Xiu Li (Tsinghua University)

🎥 Media Cooperation

机器之心 Synced