This is a simple notebook you can run on Colab to test and train your own baselines.
%%capture
# Install NLE [~3 mins]
!pip install -U cmake
!apt update -qq && apt install -qq -y flex bison libbz2-dev libglib2.0 libsm6 libxext6 git-lfs
!pip install -U pip
!pip install nle torch wandb
%%capture
# Clone Repos (TorchBeast & Starter Kit) [~3 mins]
!git lfs install
!git clone https://github.com/condnsdmatters/torchbeast.git --recursive
!git clone http://gitlab.aicrowd.com/nethack/neurips-2021-the-nethack-challenge.git
%%capture
# Install TorchBeast [~17 mins]
%env CMAKE_MAX_PARALLEL=20
!cd torchbeast \
  && pip install -r requirements.txt \
  && pip install ./nest \
  && python setup.py install
%%capture
# Install StarterKit [~2 mins]
!cd neurips-2021-the-nethack-challenge && pip install -r requirements.txt && git lfs pull
The starter kit comes with pretrained models stored in saved_models/torchbeast/*, and it is set up to submit one of these models by default.
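To see which checkpoints are bundled, you can simply list that directory. This is a quick sketch, not starter-kit code; the path assumes the repo was cloned into the working directory as in the install cells above:

```python
import glob
import os

# Path assumes the clone location from the earlier cells.
model_dir = "neurips-2021-the-nethack-challenge/saved_models/torchbeast"

# Collect whatever checkpoints are present (empty list if the path differs).
checkpoints = sorted(glob.glob(os.path.join(model_dir, "*")))
for path in checkpoints:
    print(path)
```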
By default the starter kit is configured (in …) to:
- submit a TorchBeastAgent (defined in …)...
- ...by loading the model from MODEL_DIR.

When training your own TorchBeast model, simply set MODEL_DIR to point to the output directory where TorchBeast has saved your model.
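As a hypothetical illustration of that switch: MODEL_DIR is the variable named above, but both paths below are made-up examples, not the shipped defaults.

```python
# Default: one of the checkpoints bundled with the starter kit
# (illustrative path, not the actual shipped value).
MODEL_DIR = "saved_models/torchbeast"

# After training your own agent, repoint it at TorchBeast's
# output directory (again, an example path).
MODEL_DIR = "outputs/my-run/torchbeast"
```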
In the meantime you can test submissions by running:
Training Your Own Agent
You can train your agent however you wish, but a standard IMPALA setup is provided in nethack_baselines/torchbeast/ to help you get started.
The best place to start is the README.md, which has info and suggestions on how to improve the model. A very brief overview of the key files:
- config.yaml - specifies the main flags used for the agent and training (accessible in the variable …)
- models/baseline.py - specifies the model in use
- polybeast_learner.py - specifies the learning step used for training
- polybeast_env.py - specifies the environments used for training
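Flags from config.yaml can be overridden as key=value arguments on the command line. Below is a small sketch of composing such a command in Python; it uses only the two flags that actually appear in this notebook (total_steps, num_actors), and everything else about the full flag set is an assumption:

```python
# Override-style flags, as used by the training command in this notebook.
overrides = {"total_steps": 10000, "num_actors": 128}

# Build the command string in key=value form.
cmd = "python nethack_baselines/torchbeast/polyhydra.py " + " ".join(
    f"{key}={value}" for key, value in overrides.items()
)
print(cmd)
```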
When training, it can be helpful to keep artefacts and logs of useful data. While TorchBeast logs to stdout (and some files) by default, there is also a supported integration with Weights and Biases! We suggest setting this up with the following step:
import wandb

run = wandb.init()
run.finish()
Now you can easily get to training! The command is simple.
WARNING: If you wish to end your training run, use ⌘/Ctrl + M I (Interrupt Execution). Colab struggles to complete cleanup on PolyBeast runs.
!python nethack_baselines/torchbeast/polyhydra.py total_steps=10000 num_actors=128
# Set other arguments here, e.g. wandb=True entity=<wandb-username>