**Learning to Locomote: Understanding How Environment Design Matters for Deep Reinforcement Learning**
**Daniele Reda, Tianxin Tao, Michiel van de Panne**
**University of British Columbia**
**[MIG 2020](https://computing.clemson.edu/vcl/mig2020/)**
ACM SIGGRAPH Conference on Motion, Interaction and Games
![](envdesign1.png height="190px" border="1")
![](envdesign2.png height="190px" border="1")
__Abstract__
Learning to locomote is one of the most common tasks in physics-based animation and deep
reinforcement learning (RL). A learned policy is the product of the problem to be solved, as
embodied by the RL environment, and the RL algorithm. While enormous attention has been devoted to
RL algorithms, much less is known about the impact of design choices for the RL environment. In
this paper, we show that environment design matters in significant ways and document how it can
contribute to the brittle nature of many RL results. Specifically, we examine choices related to
state representations, initial state distributions, reward structure, control frequency, episode
termination procedures, curriculum usage, the action space, and the torque limits. We aim to
stimulate discussion around such choices, which in practice strongly impact the success of RL when
applied to continuous-action control problems of interest to animation, such as learning to
locomote.
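
To make the enumerated design choices concrete, here is a minimal, illustrative sketch of a toy environment that exposes them as explicit configuration knobs (initial state distribution, reward shaping, control vs. simulation frequency, early termination, and torque limits). This is not the paper's benchmark environment; the class name, dynamics, and default values are assumptions chosen only to show where each choice enters the environment code.

```````````````````````` python
# Illustrative sketch only: a toy 1-DoF balance environment exposing the
# kinds of design knobs discussed in the paper. Not the paper's benchmark;
# names, dynamics, and defaults are assumptions.
import numpy as np

class ToyLocomotionEnv:
    def __init__(self,
                 torque_limit=2.0,        # action-space scaling / clipping
                 control_hz=30,           # rate at which the policy is queried
                 sim_hz=120,              # physics integration rate
                 init_noise=0.1,          # width of the initial state distribution
                 early_termination=True,  # terminate on "falling" vs. fixed horizon
                 alive_bonus=0.1):        # reward-shaping term
        self.torque_limit = torque_limit
        self.substeps = sim_hz // control_hz   # physics steps per control step
        self.dt = 1.0 / sim_hz
        self.init_noise = init_noise
        self.early_termination = early_termination
        self.alive_bonus = alive_bonus
        self.state = None

    def reset(self):
        # Initial state distribution: small random perturbation around upright.
        self.state = np.random.uniform(-self.init_noise, self.init_noise, size=2)
        return self.state.copy()

    def step(self, action):
        # Torque limits: clip the policy output before it reaches the simulator.
        torque = float(np.clip(action, -self.torque_limit, self.torque_limit))
        theta, theta_dot = self.state
        # Control frequency: hold the action fixed for several physics substeps.
        for _ in range(self.substeps):
            theta_dot += (torque + 9.8 * np.sin(theta) - 0.1 * theta_dot) * self.dt
            theta += theta_dot * self.dt
        self.state = np.array([theta, theta_dot])
        # Reward structure: task term plus an alive bonus and an effort penalty.
        reward = self.alive_bonus - theta ** 2 - 0.01 * torque ** 2
        # Episode termination: optional early termination on "falling over".
        done = self.early_termination and abs(theta) > np.pi / 2
        return self.state.copy(), reward, done, {}
````````````````````````

Each constructor argument corresponds to one of the design dimensions examined in the paper; sweeping them independently while holding the RL algorithm fixed is one way to study their impact.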
__Paper__
[PDF (5.1 MB)](2020-MIG-envdesign.pdf) MIG 2020, final version
[arXiv page](https://arxiv.org/abs/2010.04304)
__Videos__
Talk (12 min)
Paper video (4 min)
__bibtex__
`````````````````````````
@inproceedings{2020-envdesign,
  title={Learning to Locomote: Understanding How Environment Design Matters for Deep Reinforcement Learning},
  author={Daniele Reda and Tianxin Tao and Michiel van de Panne},
  booktitle={Proc. ACM SIGGRAPH Conference on Motion, Interaction and Games},
  year={2020}
}
`````````````````````````