How Can Biomechanics Enhance Learning Quadrupedal Locomotion?
We seek a highly motivated student interested in deep reinforcement learning and legged locomotion to help us understand how the reward signal and the characteristics of the environment lead to different locomotion behaviors. Specifically, we want to use properties from biomechanics, classical mechanics, and animal kinesiology to craft better reward/cost functions and training curricula, improving both the results and the speed of training neural-network locomotion control policies.
Keywords: Robotics, Model-free Deep Reinforcement Learning, Curriculum learning
Curriculum learning, as applied to deep reinforcement learning, has been shown to improve the performance and efficiency of learning motor skills. This intuitive approach enables better and faster learning by making the learning problem easier at the beginning of training and progressively increasing the difficulty as the agent becomes more capable.
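As a rough illustration of the idea, the sketch below raises a scalar task difficulty whenever the agent's recent success rate crosses a threshold. The class name `SuccessRateCurriculum`, the scalar notion of "difficulty", and all default values are assumptions made for this example; they are not part of the project or of any ANYmal training code.

```python
from collections import deque


class SuccessRateCurriculum:
    """Hypothetical curriculum: raise difficulty once recent success is high enough."""

    def __init__(self, start=0.1, step=0.1, max_difficulty=1.0,
                 threshold=0.8, window=20):
        self.difficulty = start               # current task difficulty (assumed scalar)
        self.step = step                      # increment applied on promotion
        self.max_difficulty = max_difficulty  # upper bound on difficulty
        self.threshold = threshold            # success rate required to promote
        self.results = deque(maxlen=window)   # outcomes of the most recent episodes

    def update(self, success):
        """Record one episode outcome and return the difficulty for the next episode."""
        self.results.append(bool(success))
        window_full = len(self.results) == self.results.maxlen
        if window_full and sum(self.results) / len(self.results) >= self.threshold:
            self.difficulty = min(self.max_difficulty, self.difficulty + self.step)
            self.results.clear()  # start a fresh evaluation at the new difficulty
        return self.difficulty
```

In a training loop, `update` would be called once per episode and the returned value used to configure the next episode, for example by scaling terrain roughness or the commanded velocity. The difficulty never decreases in this sketch; a fuller scheme might also demote the agent after repeated failures.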
The goal of this project is to design better curricula for training locomotion policies for ANYmal in simulation. First, we want to explore in depth the effects of different environments, i.e. reward functions and termination conditions, using knowledge from related fields, so that we can generate robust locomotion policies from first principles and rely less on hand-crafted and overly engineered reward terms. Second, we seek to explore automatic goal generation or other optimization methods that would let us automate the training curricula for locomotion policies.
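To make "reward functions and termination conditions" concrete, here is a minimal sketch of one biomechanics-inspired reward term: the dimensionless cost of transport, P / (m g v), combined with a velocity-tracking term and a simple fall check. The function names, weights, and thresholds are illustrative assumptions only. They are not ANYmal specifications and not the reward used in this project.

```python
import math


def cost_of_transport(joint_torques, joint_velocities, mass, forward_speed,
                      g=9.81, eps=1e-6):
    """Dimensionless cost of transport: mechanical power / (m * g * v)."""
    # Sum of absolute joint powers |torque * angular velocity| (assumed power model).
    power = sum(abs(t * w) for t, w in zip(joint_torques, joint_velocities))
    # eps keeps the ratio finite when the robot is nearly standing still.
    return power / (mass * g * max(abs(forward_speed), eps))


def reward(forward_speed, target_speed, cot, cot_weight=0.1):
    """Velocity tracking minus an energy-efficiency penalty (weights are assumptions)."""
    tracking = math.exp(-(forward_speed - target_speed) ** 2)
    return tracking - cot_weight * cot


def should_terminate(base_height, roll, pitch, min_height=0.25, max_tilt=0.8):
    """End the episode if the base drops too low or tilts too far (illustrative limits)."""
    return base_height < min_height or abs(roll) > max_tilt or abs(pitch) > max_tilt
```

Comparing policies trained with and without such a term, or with different termination thresholds, is one way to study how each design choice shapes the resulting gait.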
- Review of relevant literature
- Design, implementation, and comparison of reward terms and termination conditions
- Design, implementation, and comparison of automated curriculum learning strategies
- Ablation analysis and significance testing of the proposed approaches against existing baseline solutions
- Hands-on experience in deep learning and robotics
- Excellent working knowledge of C++
- Experience with TensorFlow is a plus