Mobile-TeleVision: Predictive Motion Priors for Humanoid Whole-Body Control

Chenhao Lu*, Xuxin Cheng*, Jialong Li*, Shiqi Yang, Mazeyu Ji, Chengjing Yuan, Ge Yang, Sha Yi, Xiaolong Wang



Real-World Loco-Manipulation

Robots walking, standing in place, and manipulating objects with whole-body control.







Cross-Country Whole-Body Control

Robustness


Motions from MoCap Data

Unitree H1
Fourier GR1

Abstract

Humanoid robots require both robust lower-body locomotion and precise upper-body manipulation to perform diverse tasks. While recent reinforcement learning (RL) approaches train robust whole-body loco-manipulation policies, they often lack precise manipulation with high-DoF arms. In this paper, we propose decoupling upper-body control from locomotion: inverse kinematics (IK) and motion retargeting handle precise manipulation, while RL focuses on robust lower-body locomotion. However, this decoupling can reduce system robustness. To address this, we introduce PMP (Predictive Motion Priors), trained with a Conditional Variational Autoencoder (CVAE) to represent upper-body motion. We train the policy conditioned on this upper-body motion representation for robust locomotion. Our experiments show that CVAE features are crucial for maintaining robustness and that our approach significantly outperforms RL-based whole-body control in precise manipulation.
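To make the conditioning idea concrete, here is a minimal sketch of how a latent upper-body motion feature might be computed and appended to a locomotion policy's observation. It is an illustration under assumptions, not the implementation from the paper: the function names (`encode`, `build_policy_observation`), the dimensions, the single linear layers, and the random weights standing in for trained parameters are all hypothetical. Only the encoder side of a CVAE is shown; the decoder and the training loss are omitted.

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer on plain lists: returns W @ x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(n_out, n_in, rng):
    """Small random weights; a stand-in for trained parameters (assumption)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def encode(motion_window, condition, params, rng, sample=True):
    """Encoder half of a CVAE: map (motion, condition) to a latent z."""
    x = list(motion_window) + list(condition)
    mu = linear(*params["mu"], x)
    log_var = linear(*params["log_var"], x)
    if not sample:
        return mu  # use the posterior mean as a deterministic feature
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def build_policy_observation(proprioception, velocity_command, latent):
    """Concatenate lower-body policy inputs with the upper-body latent."""
    return list(proprioception) + list(velocity_command) + list(latent)


# Hypothetical sizes, chosen only for illustration.
NUM_ARM_JOINTS, WINDOW, LATENT_DIM, PROPRIO_DIM = 8, 3, 4, 10

rng = random.Random(0)
in_dim = NUM_ARM_JOINTS * WINDOW + NUM_ARM_JOINTS
params = {
    "mu": init_layer(LATENT_DIM, in_dim, rng),
    "log_var": init_layer(LATENT_DIM, in_dim, rng),
}

motion_window = [0.1] * (NUM_ARM_JOINTS * WINDOW)  # upper-body joint targets
current_pose = [0.0] * NUM_ARM_JOINTS              # conditioning input

z = encode(motion_window, current_pose, params, rng, sample=False)
obs = build_policy_observation([0.0] * PROPRIO_DIM, [0.5, 0.0, 0.0], z)
```

In this sketch the locomotion policy never sees raw arm targets directly. It sees only the compact latent `z`, which is one plausible way to let a lower-body controller anticipate upper-body motion produced by IK and retargeting.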

Team

Chenhao Lu*

Xuxin Cheng*

Jialong Li*

Shiqi Yang

Mazeyu Ji

Chengjing Yuan

Ge Yang

Sha Yi

Xiaolong Wang

BibTeX


@article{lu2024pmp,
title={Mobile-TeleVision: Predictive Motion Priors for Humanoid Whole-Body Control},
author={Lu, Chenhao and Cheng, Xuxin and Li, Jialong and Yang, Shiqi and Ji, Mazeyu and Yuan, Chengjing and Yang, Ge and Yi, Sha and Wang, Xiaolong},
journal={arXiv preprint arXiv:2412.07773},
year={2024}
}