Learn to Move Through a Combination of Policy Gradient Algorithms: DDPG, D4PG, and TD3
07 Jan 2021
Deep reinforcement learning has recently made progress on continuous control tasks, driven by yearly challenges such as those in the NeurIPS Competition Track. This work combines complementary characteristics of two current state-of-the-art methods, Twin Delayed Deep Deterministic Policy Gradient (TD3) and Distributed Distributional Deep Deterministic Policy Gradient (D4PG), and applies the combination to the Learn to Move - Walk Around locomotion control challenge, which was part of the NeurIPS 2019 Competition Track. The combined approach showed improved results and achieved 4th place in the competition. This article presents the combination and evaluates its performance.