Learning to Run: NeurIPS 2017 & 2018
Łukasz Kidziński, a postdoc at Stanford University, likes a challenge. And when NeurIPS, the world’s leading deep learning conference, announced its first call for challenges in 2017, he didn’t wait around.
In fact, by this point he had already run a challenge, “Learning to Walk”, where participants were tasked with controlling a musculoskeletal model in an open-source simulation environment called OpenSim. Running this as a NeurIPS challenge would expose this interesting problem to the entire world.
We were excited by the prospect of such a demanding challenge, one that required complex code submissions to be evaluated extensively in a simulation environment. “Thanks to the agile structure of the platform, it was easy to design a machine learning challenge outside of the standard framework of the training/test data challenges,” said Łukasz, with whom we had worked to prepare the challenge. Everyone was extremely happy when the proposal was accepted, especially after it received reviews like “Very interesting proposal that will probably attract large participation, has significant impact and would entail usage of state-of-the-art models in a hot topic (Reinforcement Learning).”
The challenge became very popular, not only because it was a cutting-edge reinforcement learning problem, but also because quite a few sponsors came on board. Notably, Nvidia offered a $69,000 DGX Station to the winning team and Titan GPUs to the second- and third-ranked teams, which raised the stakes quite a bit. In addition, Amazon AWS agreed to sponsor the challenge and provide some of its computing resources.
The challenge ran for a few months, and it was incredibly exciting to see the progress made during that time. As with any cutting-edge reinforcement learning challenge, speed quickly became a major issue. We were delighted to see even folks like Trevor Blackwell (YC co-founder) chime in on GitHub, and while it was true that the simulator wasn’t the fastest it could be, the community found incredibly creative solutions to make the skeleton walk even with this serious constraint.
In the beginning, with random muscle activations, the skeletons behaved mostly like this:
Not much walking going on here! The goal of the challenge was to maximize the horizontal distance between the initial position of the pelvis and its final position after 10 seconds (the simulation was aborted if the pelvis dropped below 0.65 meters above the ground, indicating that the skeleton was collapsing). Thus, the first solutions simply exploited the fact that throwing the skeleton forward would already generate a positive score!
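To make the scoring rule concrete, here is a minimal sketch in Python. It is not the challenge's actual evaluation code: the function name `episode_score` and the input format (one `(horizontal position, height)` pair per simulation step) are assumptions made for illustration, as is the choice to score an aborted run by the last position recorded before the pelvis dropped too low. Only the 0.65 meter threshold comes from the rules described above.

```python
from typing import Sequence, Tuple

MIN_PELVIS_HEIGHT_M = 0.65  # abort threshold from the challenge description


def episode_score(pelvis_track: Sequence[Tuple[float, float]]) -> float:
    """Horizontal distance covered by the pelvis before the episode ends.

    pelvis_track holds one (horizontal position, height) pair per step.
    This layout is hypothetical, chosen only for this illustration.
    """
    if not pelvis_track:
        raise ValueError("pelvis_track must contain at least one step")

    start_x = pelvis_track[0][0]
    last_x = start_x
    for x, height in pelvis_track:
        if height < MIN_PELVIS_HEIGHT_M:
            break  # the skeleton is collapsing, so the run is aborted here
        last_x = x
    return last_x - start_x


# A skeleton that simply throws itself forward still scores above zero.
falling_forward = [(0.0, 0.9), (0.2, 0.8), (0.4, 0.7), (0.6, 0.6)]
print(episode_score(falling_forward))
```

Under these assumptions, the distance gained before the collapse still counts, which is exactly the loophole the earliest submissions found.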
Another local maximum that the networks discovered soon after was that if one jump worked fine, many jumps should work even better. In a short time, we had jumping skeletons emerging on the scene!
Adapting the reward function to include things like a penalty for muscle activation (muscle activation uses energy, after all) helped the agents steer away from unrealistic “hoppy” gaits. The learned policies nevertheless became quite entertaining to look at, as skeletons started to move in all kinds of funny ways that were reminiscent of Monty Python’s Ministry of Silly Walks:
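As a rough illustration of this kind of reward shaping, the sketch below subtracts an effort term from the forward progress made in a single step. The exact form of the penalty used in the challenge is not given here, so the sum of squared activations, the function name `shaped_step_reward`, and the default `penalty_weight` are all assumptions picked for the example.

```python
from typing import Sequence


def shaped_step_reward(
    prev_x: float,
    curr_x: float,
    activations: Sequence[float],
    penalty_weight: float = 0.01,  # hypothetical value, not from the challenge
) -> float:
    """Forward progress in one step minus a penalty for muscle effort.

    The effort term is the sum of squared muscle activations, which is
    one common choice and is assumed here rather than taken from the source.
    """
    progress = curr_x - prev_x
    effort = sum(a * a for a in activations)
    return progress - penalty_weight * effort


# Same forward progress, very different effort: the relaxed gait scores higher.
relaxed = shaped_step_reward(0.0, 0.1, [0.1, 0.2, 0.1])
strained = shaped_step_reward(0.0, 0.1, [1.0, 1.0, 1.0])
print(relaxed > strained)
```

With a term like this, two gaits that cover the same distance are no longer equally good, so an agent has a reason to prefer the one that fires its muscles less violently.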
Very clearly, the skeletons were indeed learning to walk and run! Eventually, the sophistication increased to a level that was unexpected and extremely impressive. Here is the winning solution submitted by Nnaisense:
After celebrating the winners both at NeurIPS and the Applied Machine Learning Days, the question arose whether to submit another challenge proposal in the coming year. Again under Łukasz’s leadership, the team proposed a new challenge called “AI for prosthetics”. The idea was to use the same setup as the year before, but with a few additional complexities, one of which was to replace one of the lower legs with a fixed prosthetic leg.
It was great to see that this proposal was once again accepted, with three out of four reviews rating it as “Strong Accept: I'd be upset if this wasn't chosen to be a NIPS Competition”. Participation in this new challenge was again intense, with Nnaisense eventually coming in second just after the team from Baidu Research.
We were excited to push the envelope of machine learning competitions with these two challenges, and we’re looking forward to solving many more cutting-edge problems with collaborators. As Łukasz said, the platform “enabled us to quickly build a large CS community around our software, and participants started using our software for teaching and research in reinforcement learning. For new challenges, I would definitely choose AIcrowd over other platforms again.”
The NeurIPS 2018 “AI for prosthetics” and NeurIPS 2017 “Learning to Run” challenges lay the foundations of the emerging field of biomedical reinforcement learning. Fusing biomechanics, computer science, neuroscience, and medical research to explore a grand challenge in human movement, motor control, and assistive devices, the two challenges together attracted 993 participants and 6729 submissions. The GitHub repository of the challenge has been forked 184 times and starred 573 times.
The competitions were built on free and open-source components developed by academics, and have already become a teaching and research resource at several universities. Continuing our series of challenges for the third year in a row, we aim to establish it as a benchmark for this emerging field.