Warm-Up Round: Completed
Round 1: Completed
Round 2: 46 days left
#neurips #reinforcement_learning

🚉 Introduction

  • 🚧 Round 2 is starting! Submissions are already open, but some rough edges are still being ironed out. The detailed changes in Round 2 will be published soon.

This challenge tackles a key problem in the transportation world: 
How to efficiently manage dense traffic on complex railway networks?

This is a real-world problem faced by many transportation and logistics companies around the world, such as the Swiss Federal Railways and Deutsche Bahn. Your contribution may shape the way modern traffic management systems are implemented, not only for railways but also in other areas of transportation and logistics!

🚂 Background

The Flatland challenge aims to address the problem of train scheduling and rescheduling by providing a simple grid world environment and allowing for diverse experimental approaches.

This is the second edition of this challenge. In the first one, participants mainly used solutions from the operations research field. In this second edition we are encouraging participants to use solutions which leverage the recent progress in reinforcement learning.

🔗 The Flatland environment

🔗 Past winning solutions


Flatland: the core task of this challenge is to manage and maintain railway traffic in complex scenarios on complex networks

📜 Tasks

Your goal is to make all the trains arrive at their target destination with minimal travel time. In other words, we want to minimize the number of steps that it takes for each agent to reach its destination. In the simpler levels, the agents may achieve their goals using ad-hoc decisions. But as the difficulty increases, the agents have to be able to plan ahead!


Problem example: this is a teaser of what we expect you to do

A central question when designing an agent is which observations it uses to make decisions. As a participant, you can either work with one of the provided base observations or, better, design an improved observation yourself!

These are the three provided observations:

  • Global Observation: The whole scene is observed.
  • Local Grid Observation: A local grid around the agent is observed.
  • Tree Observation: The agent can observe its navigable path to some predefined depth.
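
To make the idea of a local grid observation concrete, here is a minimal sketch that crops a square window around an agent's position on a 2D grid. It is not Flatland's implementation: the function name `local_window`, the integer cell encoding, and the padding value are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Grid = List[List[int]]  # hypothetical encoding: one integer per cell


def local_window(grid: Grid, position: Tuple[int, int], radius: int,
                 pad_value: int = 0) -> Grid:
    """Return the (2*radius+1) x (2*radius+1) window centred on `position`.

    Cells that fall outside the grid are filled with `pad_value`.
    """
    n_rows = len(grid)
    n_cols = len(grid[0]) if n_rows > 0 else 0
    center_row, center_col = position

    window: Grid = []
    for r in range(center_row - radius, center_row + radius + 1):
        row = []
        for c in range(center_col - radius, center_col + radius + 1):
            inside = 0 <= r < n_rows and 0 <= c < n_cols
            row.append(grid[r][c] if inside else pad_value)
        window.append(row)
    return window


if __name__ == "__main__":
    toy_grid = [[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]]
    # Agent in the top-left corner: the window is padded on two sides.
    for line in local_window(toy_grid, (0, 0), radius=1):
        print(line)
```

A fixed-size window like this keeps the observation shape constant regardless of map size, which is convenient for neural network inputs, at the cost of hiding anything beyond the chosen radius.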

🔗 Observations in Flatland

🔗 Create custom observations

⚖ Evaluation metrics

The primary metric is the mean normalized return from your agents - the higher the better.

The minimum possible value is -1.0, which occurs if none of the agents reach their goal during the episode. The maximum possible value is 0.0, which would occur if all the agents reached their targets in one time step, which is generally not achievable.
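
One way to read these bounds is through a small reward model. The sketch below is an assumption, not the official scoring code: it supposes that each agent accrues -1 for every time step it finishes without having reached its target, and that the total is divided by `max_steps` times the number of agents. The function name and its parameters are hypothetical.

```python
from typing import Optional, Sequence


def normalized_return(arrival_steps: Sequence[Optional[int]],
                      max_steps: int) -> float:
    """Normalized return of one episode under the assumed reward model.

    arrival_steps[i] is the 1-based time step in which agent i reached its
    target, or None if it never arrived within the episode.
    """
    if max_steps <= 0 or len(arrival_steps) == 0:
        raise ValueError("need at least one agent and a positive max_steps")

    total = 0
    for step in arrival_steps:
        if step is None or step > max_steps:
            # Assumed: an agent that never arrives is penalized every step.
            total -= max_steps
        elif step < 1:
            raise ValueError("arrival steps are 1-based")
        else:
            # Assumed: no penalty for the step in which the agent arrives.
            total -= step - 1
    return total / (max_steps * len(arrival_steps))


if __name__ == "__main__":
    print(normalized_return([None, None], max_steps=100))  # no agent arrives
    print(normalized_return([1, 1], max_steps=100))        # all arrive at once
    print(normalized_return([51, None], max_steps=100))    # mixed outcome
```

Under this model, an episode where no agent arrives scores -1.0 and an episode where every agent arrives in the first step scores 0.0, which matches the bounds stated above.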

The agents have to act within strict time limits. You are allowed up to 5 minutes of initial planning time before any agent moves. Beyond that point, the agents have 5 seconds per time step to indicate their next actions. If the agents fail to act in time, the episode will fail and will receive a score of -1.0. Each evaluation can take up to 8 hours, after which the full evaluation will be cancelled.
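
Because a single late step fails the whole episode, it can help to guard action selection with a deadline. The following is a minimal sketch under stated assumptions: the safety margin, the fallback action id, and the names `choose_actions` and `policy` are hypothetical and not part of the challenge API. Only the 5-second limit comes from the rules above.

```python
import time
from typing import Callable, Dict, Iterable

STEP_LIMIT_S = 5.0      # per-step limit stated in the rules
SAFETY_MARGIN_S = 0.5   # assumed margin for communication overhead
FALLBACK_ACTION = 0     # hypothetical id of a cheap default action


def choose_actions(agent_ids: Iterable[int],
                   policy: Callable[[int], int],
                   budget_s: float = STEP_LIMIT_S - SAFETY_MARGIN_S,
                   clock: Callable[[], float] = time.monotonic,
                   fallback: int = FALLBACK_ACTION) -> Dict[int, int]:
    """Query `policy` per agent until the time budget is spent.

    Agents that are reached after the deadline receive `fallback`, so a
    complete action dictionary is always returned.
    """
    deadline = clock() + budget_s
    actions: Dict[int, int] = {}
    for agent_id in agent_ids:
        if clock() >= deadline:
            actions[agent_id] = fallback
        else:
            actions[agent_id] = policy(agent_id)
    return actions


if __name__ == "__main__":
    # Toy policy: every agent picks action 2 (an arbitrary example value).
    print(choose_actions(range(3), policy=lambda agent_id: 2))
```

Note that this check only runs between policy calls, so a single call that blocks for longer than the budget would still overrun. Ordering agents by priority before the loop is one way to make sure the most important decisions are computed first.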

🔗 Evaluation metrics

🔗 Time limits

🏆 Prizes

The prizes are four travel grants to the NeurIPS 2020 conference ✈️

  • The first place team in the final round will be awarded one travel grant, whichever approach they use.
  • The top three teams in the final round which use a reinforcement learning approach for their winning submission will be awarded one travel grant each.

The prizes will be updated soon given that the NeurIPS conference will take place fully online this year.

The approach used for each submission needs to be specified in the aicrowd.json file as described in the submission guide.

The winning submissions will be verified manually by the organizers to ensure the method used matches what has been declared in the aicrowd.json file. The organizers have the final word when judging the validity of each submission.

If the overall first place team uses a reinforcement learning approach, then this team will be awarded two travel grants.

📅 Timeline

Here's the tentative timeline:

  • June 1st - July 9th: Warm-Up Round
  • July 10th - August 14th: Round 1
  • September 1st - October 19th: Round 2
  • October 20th - October 25th: Post Challenge Analysis
  • October 25th: Final Results Announced
  • October 16th - November 10th: Post Challenge Wrap-Up

There are no qualifying rounds: participants can join the challenge at any point until the final deadline. Prizes will be awarded according to the Round 2 ranking.

🚉 Next stops

The Flatland documentation contains everything you need to get started with this challenge!

Want to dive straight in? 
🔗 Submit in 10 minutes

New to multi-agent reinforcement learning? 
🔗 Step by step guide

Want to explore advanced solutions such as distributed training and imitation learning?
🔗 Research baselines

📱 Contact

Join the Discord channel to exchange ideas with other participants!

🔗 Discord Channel

If you have a problem or question for the organizers, use either the Discussion Forum or open an issue:

🔗 Discussion Forum

🔗 Technical Issues

We strongly encourage you to use the public channels mentioned above for communications between the participants and the organizers. But if you're looking for a direct communication channel, feel free to reach out to us at:

  • mohanty [at] aicrowd.com
  • florian [at] aicrowd.com
  • erik.nygren [at] sbb.ch

For press inquiries, please contact SBB Media Relations at press@sbb.ch

🤝 Partners


