We present a data-driven deep reinforcement learning (DRL) method for optimizing a hierarchically structured control policy that includes a central pattern generator. This method, which is ...