Humanoid robots require both robust lower-body locomotion and precise upper-body manipulation to perform diverse tasks. While recent reinforcement learning (RL) approaches robustly train whole-body loco-manipulation policies, they often lack precise manipulation with high-DoF arms. In this paper, we propose decoupling upper-body control from locomotion: inverse kinematics (IK) and motion retargeting handle precise manipulation, while RL focuses on robust lower-body locomotion. However, this decoupling can reduce system robustness. To address this, we introduce PMP (Predictive Motion Priors), trained with a Conditional Variational Autoencoder (CVAE) to represent upper-body motion. We then train the locomotion policy conditioned on this upper-body motion representation so that it remains robust. Our experiments show that the CVAE features are crucial for maintaining robustness and that our approach significantly outperforms RL-based whole-body control in precise manipulation.
@article{lu2024pmp,
  title={{Mobile-TeleVision}: {Predictive Motion Priors} for Humanoid Whole-Body Control},
  author={Lu, Chenhao and Cheng, Xuxin and Li, Jialong and Yang, Shiqi and Ji, Mazeyu and Yuan, Chengjing and Yang, Ge and Yi, Sha and Wang, Xiaolong},
  journal={arXiv preprint arXiv:2412.07773},
  year={2024}
}