The following rules attempt to capture the spirit of the competition; any submission found to violate them may be deemed ineligible by the organizers.

List of Changes:

As we make changes, they will be listed below.

August 1st: Limitations on participation have been relaxed to limitations on receiving awards.

General Rules

These rules apply to both tracks (Intro and Research).

  • Entries to the MineRL Diamond competition must be “open”. Teams will be expected to reveal most details of their method including source-code (special exceptions may be made for pending publications).
  • For a team to be eligible for receiving awards, each member must satisfy the following conditions:
    • be at least 18 years old and at least the age of majority in their place of residence;
    • not reside in any region or country subject to U.S. Export Regulations; and
    • not be an organizer of this competition nor a family member of a competition organizer.
  • To receive any awards from our sponsors, competition winners must attend the NeurIPS workshop.
  • A team may submit separate entries to both tracks; performance in each track is evaluated separately, and submissions to the two tracks are not linked in any way.
  • Interactions with the environment must be through the “step” function. Only the provided Gym interface may be used. Additional information may not be extracted from the simulator in any way.
  • Official rule clarifications will be made in the FAQ on the AIcrowd website.
    • Answers within the FAQ are official answers to questions. Any informal answers (e.g., via email) are superseded by answers added to the FAQ.
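
The "step-function only" rule above amounts to a plain Gym-style interaction loop. As a minimal, hypothetical sketch (the `StubEnv` class and `run_episode` helper below are illustrative stand-ins, not part of the competition package), the agent touches the simulator only through `reset()` and `step()` and extracts no other information from it:

```python
class StubEnv:
    """Illustrative stand-in with the standard Gym reset()/step() signature.
    In the competition this would be a MineRL environment created via the
    provided Gym interface."""
    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return {"pov": None, "vector": None}  # observation only

    def step(self, action):
        self.t += 1
        obs = {"pov": None, "vector": None}
        reward, done, info = 0.0, self.t >= self.horizon, {}
        return obs, reward, done, info


def run_episode(env, policy):
    """Interact strictly via step(); no other simulator access is permitted."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        action = policy(obs)
        obs, reward, done, info = env.step(action)
        total += reward
    return total


episode_return = run_episode(StubEnv(), policy=lambda obs: 0)
```

Anything outside this loop — reading simulator internals, world state, or files behind the environment — would violate the rule.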

Research Track

These additional rules apply only to the Research track.

  • The submission must train a machine learning model without relying on human domain knowledge.
    • The reward function may not be changed (shaped) based on manually engineered, hard-coded functions of the state. For example, additional rewards for approaching tree-like objects are not permitted, but rewards for encountering novel states (“curiosity rewards”) are permitted.
    • Actions/meta-actions/sub-actions/sub-policies may not be manually specified in any way. For example, though a learned hierarchical controller is permitted, meta-controllers may not choose between two policies based on a manually specified condition, such as whether the agent has a certain item in its inventory. This restriction includes the composition of actions (e.g., adding an additional action which is equivalent to performing “walk forward for 2 seconds” or “break a log and then place a crafting table”).
    • State processing/pre-processing cannot be hard-coded with the exception of frame-stacking. For example, the agent can act every even-numbered timestep based on the last two observations, but a manually specified edge detector may not be applied to the observation. As another example, the agent’s observations may be normalized to be “zero-mean, variance one” based on an observation history or the dataset.
    • To ensure that the semantic meaning attached to action and observation labels is not exploited, the labels assigned to actions and observations have been obfuscated (in both the dataset and the environment). Actions and observations (with the exception of POV observations) have been embedded into a different space. Furthermore, for Round 2 submissions, the actions will be re-embedded. Any attempt to bypass these obfuscations constitutes a violation of the rules.
    • Models may only be trained against the MineRL environments ending with “VectorObf”. All of the MineRL environments have specific competition versions which incorporate action and observation space obfuscation. They all share a similar observation and action space embedding, which will be changed for Round 2.
    • The training budget is limited: eight million (8,000,000) interactions with the environment may be used in addition to the provided dataset. When stacking observations or repeating actions, each skipped frame still counts against this budget.
  • Participants may only use the provided dataset; no additional datasets may be included in the source-file submissions nor downloaded during training or evaluation, but pre-trained models which were publicly available by June 5th are permitted.
    • During the evaluation of submitted code, the individual containers will not have access to any external network in order to avoid information leaks. Relevant exceptions are added to ensure participants can download and use the pre-trained models included in popular frameworks like PyTorch and TensorFlow. Participants can request network exceptions for other publicly available pre-trained models, which will be validated by AIcrowd on a case-by-case basis.
    • All submitted code repositories will be scrubbed to remove files larger than 30 MB to ensure participants are not checking in any model weights pre-trained on the released training dataset.
    • Pre-trained models may not have been trained on MineRL data or any other Minecraft data. The intent of this rule is to allow participants to use models which are, for example, trained on ImageNet or similar datasets. Don't abuse this.
  • The procedure for Round 1 is as follows:
    • At the end of Round 1, teams must submit source code to train their models. This code must terminate within four days on the specified platform.
    • For teams with the highest evaluation scores, this code will be inspected for rule compliance.
    • For those submissions where rule violations are found, the offending teams will be contacted for appeal. Unless a successful appeal is made, the organizers will remove those submissions from the competition and then evaluate additional submissions until Round 2 is at capacity.
    • The top 15 teams of the Research track will proceed to Round 2.
  • The procedure for Round 2 is as follows:
    • During Round 2, teams will submit their source code at most once every two weeks.
    • After each submission, the model will be trained for four days on a re-rendered, private dataset and domain, and the teams will receive the final performance of their model. The dataset and domain will contain matching changes to the action space and the observation space.
    • At the end of the round, final standings are based on the best-performing submission of each team during Round 2.
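
The Research-track budget rule charges every underlying environment frame, even when an action is repeated over skipped frames. A hedged sketch of how such accounting might look, assuming a simple wrapper pattern (`SampleBudgetWrapper` and `StubEnv` are hypothetical names, not provided by the competition):

```python
class StubEnv:
    """Illustrative stand-in with the standard Gym step() signature."""
    def reset(self):
        return {"pov": None}

    def step(self, action):
        return {"pov": None}, 0.0, False, {}


class SampleBudgetWrapper:
    """Counts every underlying env.step() call against the 8M budget.

    With action repeat (frame skip), each skipped frame still calls
    step() once, so it is charged to the budget — matching the rule
    that skipped frames count."""
    def __init__(self, env, budget=8_000_000):
        self.env = env
        self.remaining = budget

    def reset(self):
        return self.env.reset()

    def step(self, action, repeat=1):
        total_reward, done, info, obs = 0.0, False, {}, None
        for _ in range(repeat):
            if self.remaining <= 0:
                raise RuntimeError("sample budget exhausted")
            self.remaining -= 1  # every frame is charged
            obs, reward, done, info = self.env.step(action)
            total_reward += reward
            if done:
                break
        return obs, total_reward, done, info


env = SampleBudgetWrapper(StubEnv(), budget=10)
env.reset()
env.step(action=0, repeat=4)  # action repeated over 4 frames: 4 charged
remaining = env.remaining     # 10 - 4 = 6
```

Under this accounting, repeating one action for four frames consumes four interactions, not one; training code that frame-skips should budget accordingly.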