Learning How to Walk
Reinforcement learning environments with musculoskeletal models
Learning how to walk
Our movement originates in the brain. Many neurological disorders, such as cerebral palsy, multiple sclerosis, or stroke, can lead to problems with walking. Treatments often address only the symptoms, and the outcomes of surgery are hard to predict. Understanding the underlying mechanisms is key to improving treatments, which motivates our efforts to model the motor control unit of the brain.
In this challenge, your task is to model the motor control unit in a virtual environment. You are given a musculoskeletal model with 18 muscles to control. Every 10 ms you send signals to these muscles to activate or deactivate them. The objective is to walk as far as possible in 5 seconds.
For modelling physics we use OpenSim, a biomechanical physics environment for musculoskeletal simulations. You can read more details here.
NOTE: There have been a few changes to the API of the grading server. Please update your osim-rl installation with:
pip install git+https://github.com/kidzik/osim-rl.git
and update your submission script by referring to: https://github.com/stanfordnmbl/osim-rl/blob/master/scripts/submit.py#L43
In the meantime, if you run into scary-looking error messages when using your previous submission scripts, please do not panic!! :D :D
Credits
This challenge wouldn’t be possible without:
- OpenSim
- Stanford Neuromuscular Biomechanics Lab
- Stanford Mobilize Center
- OpenAI gym
- OpenAI http client
- keras-rl
- and many other teams, individuals and projects
For more details and queries please contact
Evaluation criteria
Your task is to build a function f that takes the current state observation (a 31-dimensional vector) and returns the muscle activations action (an 18-dimensional vector) in a way that maximizes the reward.
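To make the shape of f concrete, here is a minimal sketch of such a function using NumPy. The linear form, the random weights, the name make_linear_policy, and the clipping of activations to [0, 1] are all assumptions made for illustration; they are not part of the challenge specification, and a competitive controller would be trained rather than random.

```python
import numpy as np

OBS_DIM = 31  # observation size stated in the task description
ACT_DIM = 18  # one activation value per muscle


def make_linear_policy(seed=0):
    """Return a function f(observation) -> action with fixed random weights.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    weights = rng.normal(scale=0.1, size=(ACT_DIM, OBS_DIM))
    bias = np.full(ACT_DIM, 0.5)

    def f(observation):
        obs = np.asarray(observation, dtype=float)
        if obs.shape != (OBS_DIM,):
            raise ValueError(f"expected {OBS_DIM} values, got shape {obs.shape}")
        # Assumption: muscle activations are valid in the range [0, 1].
        return np.clip(weights @ obs + bias, 0.0, 1.0)

    return f


f = make_linear_policy()
action = f(np.zeros(OBS_DIM))  # an 18-element vector of activations
```

Any learned model (for example a neural network) can replace the linear map, as long as it keeps the same input and output sizes.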
The trial ends either if the pelvis of the model drops below 0.7 meters or if you reach 500 iterations (corresponding to 5 seconds in the virtual environment). Let N be the length of the trial. Your total reward is simply the position of the pelvis on the x axis after N steps. The value is given in centimeters.
After each iteration you get a reward equal to the change in the x position of the pelvis during that iteration.
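The scoring rules above can be sketched on a recorded trajectory as follows. The function name score_trial and the input format are hypothetical, and the sketch assumes that the step on which the pelvis drops below the threshold still counts towards N. Because each step reward is a difference of consecutive x positions, the rewards sum to the final x position minus the starting one, which matches the total reward described above if the model starts at x = 0 (also an assumption).

```python
MIN_PELVIS_HEIGHT = 0.7  # meters, from the termination rule
MAX_STEPS = 500          # 5 seconds at 10 ms per step


def score_trial(pelvis_x, pelvis_height):
    """Return (N, step_rewards, total) for a recorded trajectory.

    pelvis_x[t] and pelvis_height[t] are the values after t steps,
    so index 0 is the starting pose.
    """
    step_rewards = []
    n = 0
    last = min(len(pelvis_x), len(pelvis_height)) - 1
    while n < MAX_STEPS and n < last:
        n += 1
        # Per-step reward: change in the pelvis x position during this step.
        step_rewards.append(pelvis_x[n] - pelvis_x[n - 1])
        if pelvis_height[n] < MIN_PELVIS_HEIGHT:
            break  # the model fell, so the trial ends here
    total = sum(step_rewards)
    return n, step_rewards, total


# Made-up trajectory: the pelvis drops below 0.7 after the third step.
xs = [0.0, 1.0, 2.5, 3.0, 3.2]
heights = [0.9, 0.9, 0.8, 0.65, 0.6]
n_steps, rewards, total_reward = score_trial(xs, heights)
```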
You can test your model on your local machine. For submission, you will need to interact with the remote environment: crowdAI sends you the current observation, and you send back the action you take in that state.
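The observation-in, action-out exchange can be pictured as a simple loop. The sketch below uses a made-up stand-in class, StubEnv, with a gym-style reset/step interface. It is not the osim-rl or crowdAI client API, and its observations and rewards are placeholders, so refer to the submission script linked above for the real calls.

```python
class StubEnv:
    """Hypothetical stand-in for the environment; not the real client API."""

    def __init__(self, max_steps=500):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0] * 31  # placeholder 31-dimensional observation

    def step(self, action):
        if len(action) != 18:
            raise ValueError("expected 18 muscle activations")
        self.t += 1
        reward = 0.01 * sum(action)         # made-up reward, not real physics
        done = self.t >= self.max_steps     # stop after the step limit
        observation = [float(self.t)] * 31  # placeholder observation
        return observation, reward, done, {}


def run_episode(env, policy):
    """Receive an observation, send back an action, repeat until done."""
    observation = env.reset()
    total, steps, done = 0.0, 0, False
    while not done:
        action = policy(observation)
        observation, reward, done, _ = env.step(action)
        total += reward
        steps += 1
    return steps, total


# A constant policy that half-activates every muscle.
steps, total = run_episode(StubEnv(), lambda obs: [0.5] * 18)
```

Keeping the policy as a plain function of the observation means the same code can be tested against a local environment before it is pointed at the remote one.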
Resources
Please refer to the Getting Started guide in the Dataset section of the challenge for more details on how to access the challenge environments and for a basic tutorial on how to make your first submission.
Prizes
The winner will be invited to the 2nd Applied Machine Learning Days at EPFL in Switzerland on January 29 & 30, 2018, with travel and accommodation covered.
