Round 1: 2 months

Flatland Challenge

Multi-Agent Reinforcement Learning on Trains. Starting Soon

Misc Prizes : To Be Announced



The key question we want to answer here is: How can trains learn to automatically coordinate among themselves, so that there are minimal delays in large train networks?


The Flatland Challenge is a competition to facilitate the progress of multi-agent reinforcement learning for any vehicle re-scheduling problem (VRSP). The challenge addresses a real-world problem faced by many transportation and logistics companies around the world, such as the Swiss Federal Railways (SBB). Using reinforcement learning (or operations research methods), you must solve different tasks related to VRSP in a simplified 2D multi-agent railway simulation environment. Your contribution might influence and shape the way modern traffic management systems (TMS) are implemented, not only in railway but also in other areas of transportation and logistics. This will be the first of a series of challenges related to Multi-Agent Reinforcement Learning.


The Swiss Federal Railways operates the densest mixed railway traffic in the world. SBB maintains and operates the biggest railway infrastructure in Switzerland. Today, there are more than 10,000 trains running each day, being rerouted at over 13,000 switches and managed by over 32,000 signals. Almost half of all goods within Switzerland and 1.2 million passengers are transported on this railway network each day. Due to the growing demand for mobility, SBB needs to increase the transportation capacity of the network by approximately 30%.

The increase in transport capacity can be achieved through different measures such as denser train schedules, large infrastructure investments, and/or investments in new rolling stock. However, SBB currently lacks suitable technologies and tools to quantitatively assess these different measures.

A promising solution to this dilemma is a complete railway simulation that efficiently evaluates the consequences of infrastructure changes or schedule adaptations for network stability and traffic flow. A complete railway simulation consists of a full dynamical physics simulation as well as an automated traffic management system.

Flatland: Managing and maintaining punctual railway traffic on complex network configurations is the main task of this challenge. This image illustrates an early draft of the environment visualization.

The research group at SBB has developed a high-performance simulator which simulates the dynamics of train traffic as well as the railway infrastructure. We are currently investigating the possibility of an automated TMS for use in the simulation. The role of the traffic management system is to select routes for all trains and decide on their priorities at switches in order to optimize traffic flow across the full network.

At the core of this challenge lies the general vehicle re-scheduling problem proposed by Li, Mirchandani and Borenstein in 2007:

The vehicle rescheduling problem (VRSP) arises when a previously assigned trip is disrupted. A traffic accident, a medical emergency, or a breakdown of a vehicle are examples of possible disruptions that demand the rescheduling of vehicle trips. The VRSP can be approached as a dynamic version of the classical vehicle scheduling problem (VSP) where assignments are generated dynamically.

The “Flatland” Competition aims to address the vehicle rescheduling problem by providing a simplistic grid world environment and allowing for diverse solution approaches from the fields of reinforcement learning and operations research.

The problems are formulated as a 2D grid environment with restricted transitions between neighbouring cells to represent railway networks. On the 2D grid, multiple agents with different objectives must collaborate to maximize global reward. Several different tasks need to be solved.
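
To make the idea of "restricted transitions between neighbouring cells" more concrete, here is a minimal Python sketch. It is not the Flatland API: the names (rail, straight, next_positions) and the data layout are hypothetical, chosen only for illustration. Each rail cell stores the set of (incoming heading, outgoing heading) pairs it allows, so a train can only leave a cell in a direction that the track geometry permits.

```python
from typing import Dict, List, Set, Tuple

# Headings as (row_delta, col_delta). All names here are illustrative
# assumptions and are not taken from the Flatland code base.
NORTH, EAST, SOUTH, WEST = (-1, 0), (0, 1), (1, 0), (0, -1)

Heading = Tuple[int, int]
Cell = Tuple[int, int]
# For each rail cell: the allowed (incoming heading, outgoing heading) pairs.
# Cells that are absent from the mapping carry no rail at all.
Rail = Dict[Cell, Set[Tuple[Heading, Heading]]]


def straight(a: Heading, b: Heading) -> Set[Tuple[Heading, Heading]]:
    """Plain track that can be traversed along heading a or heading b."""
    return {(a, a), (b, b)}


def next_positions(rail: Rail, cell: Cell, heading: Heading) -> List[Tuple[Cell, Heading]]:
    """Return the (cell, heading) states reachable in one step.

    A move is valid only if the current cell allows the transition from the
    agent's heading and the target cell is itself a rail cell.
    """
    moves = []
    for incoming, outgoing in sorted(rail.get(cell, set())):
        if incoming != heading:
            continue
        target = (cell[0] + outgoing[0], cell[1] + outgoing[1])
        if target in rail:
            moves.append((target, outgoing))
    return moves


# A short east-west line with a switch in the middle cell: an eastbound
# train may continue east or branch south, a westbound train may not branch.
rail: Rail = {
    (0, 0): straight(EAST, WEST),
    (0, 1): straight(EAST, WEST) | {(EAST, SOUTH)},
    (0, 2): straight(EAST, WEST),
    (1, 1): straight(NORTH, SOUTH),
}

eastbound_options = next_positions(rail, (0, 1), EAST)
westbound_options = next_positions(rail, (0, 1), WEST)
```

In this toy layout the eastbound train at the switch has two options while the westbound train has one, which is the kind of asymmetry that makes routing and priority decisions at switches non-trivial once several agents share the network.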


Through the different tasks, you will become familiar with the “Flatland” environment and the specific difficulties of the re-scheduling problem. More details about the individual tasks will be announced in mid-May, but in the meantime, here are a few teasers to keep you occupied:

Flatland 0.1: Here you see an early prototype of the Flatland environment. Restricted transitions and agent movements are not present yet. Agent behavior and collaboration are suboptimal. Final environment configurations will span much larger environments.

Flatland 0.1: The left animation illustrates local observations of each agent along its possible paths. On the right side you see the corresponding agent movement along the railway network.


If you have any questions, or are willing to help in the technical development of this challenge as it takes shape, please feel free to reach out to us at:

  • mohanty [at] aicrowd.com
  • erik.nygren [at] sbb.ch