Yongyuan Cheryl Liang

My research focuses on developing large foundation models and agentic intelligence. I actively explore both theoretical frameworks and empirical findings, with specific research interests in:

  • Large Multimodal Models and World Models for general agentic tasks, including virtual and physical intelligence.
  • Post Training: Multi-objective or cross-modality alignment for large models.
Over the past few years, I have worked on Reinforcement Learning, Representations, and Robustness.

I'm currently on the full-time job market.

Email  /  Google Scholar  /  Github /  Twitter

I am a PhD student at UMD CS. I was fortunate to work with Jianwei Yang and Huazhe Xu.

I conducted research at NVIDIA, Adobe, and Microsoft Research. I received my B.S. degree in Mathematics from Sun Yat-sen University.

If you’re interested in my research, potential collaborations, or simply want to catch up, feel free to drop me an email.

May' 26  

SAW-Bench won the Best Paper Award Runner-Up at CVPR 2026 WMAS.

Apr' 26  

SAW-Bench to appear in ICML 2026 as Spotlight (2%).

Apr' 26  

Dropped a new blog post about vibe agents stepping into the real world.

Mar' 26  

We released a new blog post about Spatial Memory in Frontier Models.

Feb' 26  

Three papers to appear in CVPR 2026 (2 main track and 1 findings).

Jan' 26  

MomaGraph selected as Oral presentation (1%) in ICLR 2026.

Jan' 26  

One paper to appear in ICRA 2026.

Jan' 26  

Two papers to appear in ICLR 2026 (ROVER and MomaGraph).

Sept' 25  

One paper to appear in NeurIPS 2025.

Feb' 25  

Magma to appear in CVPR 2025.

Jan' 25  

Two papers to appear in ICLR 2025.

Jan' 25  

Started updating Awesome-Generalist-Agents.

Sept' 24  

Make-An-Agent to appear in NeurIPS 2024.

June' 24  

ACE selected as Oral presentation (1%) in ICML 2024.

May' 24  

Two papers to appear in ICML 2024.

Jan' 24  

Three papers to appear in ICLR 2024, including two spotlights and one poster.


Selected Publications and Preprints

* denotes Equal Contributions and Project Lead; † indicates Equal Advising.

Embodied Large Multimodal Model

Learning Situated Awareness in the Real World
Chuhan Li, Joy Hsu*, Yongyuan Liang*, Ruilin Han*, Rajiv Dhawan, Jiajun Wu, Ming-Hsuan Yang, Xin Eric Wang

ICML, 2026 Spotlight, Top 2%
Best Paper Award Runner-Up at CVPR Workshop WMAS, 2026
Project Page  /  Paper  /  Benchmark /  Twitter

Embodied Large Multimodal Model

MomaGraph: State-Aware Unified Scene Graphs with Vision-Language Models for Embodied Task Planning
Yuanchen Ju*, Yongyuan Liang*, Yen-Jen Wang*, Gireesh Nandiraju, Yuanliang Ju, Seungjae Lee, Qiao Gu, Elvis Hsieh, Furong Huang†, Koushil Sreenath†

ICLR, 2026 Oral, Top 1%
Project Page  /  Paper  /  Code /  Benchmark /  Twitter

Agentic Large Multimodal Model

Anticipatory Planning for Multimodal Agents
Yongyuan Liang, Shijie Zhou, Yu Gu, Hao Tan, Gang Wu, Franck Dernoncourt, Jihyung Kil, Ryan A. Rossi, Ruiyi Zhang

CVPR Findings, 2026
Paper  /  Twitter

Unified Multimodal Model

ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation
Yongyuan Liang*, Wei Chow*, Feng Li, Ziqiao Ma, Xiyao Wang, Jiageng Mao, Jiuhai Chen, Jiatao Gu, Yue Wang†, Furong Huang†

ICLR, 2026
Project Page  /  Paper  /  Code /  Benchmark /  Twitter

3D Large Multimodal Model

Lemon: A Unified and Scalable 3D Multimodal Model for Universal Spatial Understanding
Yongyuan Liang, Xiyao Wang, Yuanchen Ju, Jianwei Yang, Furong Huang

arXiv, 2025
Spotlight Talks at CVPR Workshop CVinW, 2025
Project Page  /  Paper  /  Code /  Models & Datasets /  Twitter

Agentic Large Multimodal Model

Magma: A Foundation Model for Multimodal AI Agents
Magma Team

CVPR, 2025
Project Page  /  Paper  /  Code /  Models & Datasets /  Twitter

Embodied Large Multimodal Model

TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies
Ruijie Zheng*, Yongyuan Liang*, Shuaiyi Huang, Jianfeng Gao, Hal Daumé III, Andrey Kolobov, Furong Huang, Jianwei Yang

ICLR, 2025
Oral Talks at ICLR Workshop GenBot, 2025
Project Page  /  Paper  /  Code /  Models /  Twitter

Generative Model

Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion
Yongyuan Liang, Tingqiang Xu, Kaizhe Hu, Guangqi Jiang, Furong Huang, Huazhe Xu

NeurIPS, 2024
Oral Talks at NeurIPS Workshop AFM, 2024
Project Page  /  Paper  /  Code /  Models & Dataset /  Twitter

Reinforcement Learning

ACE: Off-Policy Actor-Critic with Causality-Aware Entropy Regularization
Tianying Ji*, Yongyuan Liang*, Yan Zeng, Yu Luo, Guowei Xu, Jiawei Guo, Ruijie Zheng, Furong Huang, Fuchun Sun, Huazhe Xu

ICML, 2024 Oral, Top 1%
Project Page  /  Paper  /  Code /  Twitter

Blogs
Mar 2026

What We Talk About When We Talk About Spatial Memory in Frontier Models

Spatial Awareness, World Modeling

Apr 2026

Vibes Meet Gravity: AI Agents Step Into the Real World

Vibe Agents, Agentic Multimodal Models

Coming soon

Avocado: Multi-Objective Alignment of Language Models

Alignment Steering, Interpretability


Professional Service

Conference Program Committee: ICML (2022-2025), NeurIPS (2021-2025), ICLR (2021-2026)

Workshop Program Committee: FMDM at NeurIPS 2023, Bi-Align at ICLR 2025, CVinW at CVPR 2025


Misc

If my name is a bit tricky to pronounce for you, I’d love to go by Cheryl [ˈʃerəl].

Classic INTJ-A.

I've been playing the violin🎻 for over 15 years and served as Principal First Violin in the university orchestra. I have also played the piano for more than 10 years.

I enjoy reading Japanese and Western literature. Here are some of my reading notes.

Been a fan of Novak Djokovic since 2012.

My Erdős number = 4.

life in photos 🍊



© Yongyuan Liang. All rights reserved for content and custom design.
Base template by Jon Barron.