Learn to Move Through a Combination of Policy Gradient Algorithms: DDPG, D4PG, and TD3

Nicolas Bach*, Andrew Melnik*, Malte Schilling, Timo Korthals, and Helge Ritter
07 Jan 2021



Deep Reinforcement Learning has recently seen progress on continuous control tasks, driven by annual challenges such as the NeurIPS Competition Track. This work combines complementary characteristics of two state-of-the-art methods, Twin Delayed Deep Deterministic Policy Gradient (TD3) and Distributed Distributional Deep Deterministic Policy Gradient (D4PG), and applies the combination to the Learn to Move: Walk Around locomotion control challenge, which was part of the NeurIPS 2019 Competition Track. The combined approach showed improved results and achieved 4th place in the competition. The article presents this combination and evaluates its performance.
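To make the TD3 side of the combination concrete, the sketch below shows its bootstrap target: the minimum over two critics (clipped double-Q) evaluated at a noise-smoothed target action. This is a minimal illustration with hypothetical linear critics and made-up dimensions, not the paper's implementation; in the combined method, D4PG contributes a distributional critic and distributed experience replay in place of the scalar critics shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear critics for illustration only; the actual method
# uses deep networks trained on the locomotion environment.
obs_dim, act_dim = 4, 2
W1 = rng.normal(size=obs_dim + act_dim)  # critic 1 weights
W2 = rng.normal(size=obs_dim + act_dim)  # critic 2 weights

def q1(s, a):
    return float(np.concatenate([s, a]) @ W1)

def q2(s, a):
    return float(np.concatenate([s, a]) @ W2)

def td3_target(s_next, a_next, reward, gamma=0.99,
               noise_std=0.2, noise_clip=0.5):
    # Target policy smoothing: perturb the target action with clipped noise.
    noise = np.clip(rng.normal(0.0, noise_std, size=a_next.shape),
                    -noise_clip, noise_clip)
    a_smoothed = np.clip(a_next + noise, -1.0, 1.0)
    # Clipped double-Q: bootstrap from the minimum of the twin critics
    # to counteract overestimation bias.
    q_min = min(q1(s_next, a_smoothed), q2(s_next, a_smoothed))
    return reward + gamma * q_min

s_next = rng.normal(size=obs_dim)
a_next = np.zeros(act_dim)       # placeholder target-policy action
target = td3_target(s_next, a_next, reward=1.0)
```

In D4PG, `q1` and `q2` would instead output categorical return distributions, and the target would be projected back onto the distribution's support before computing a cross-entropy loss.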
