Chenming Zhu

The University of Hong Kong

  • Ph.D. Student @ HKU IDS | HKU-MMLab
  • Hong Kong SAR

I am a third-year Ph.D. student at the HKU Musketeers Foundation Institute of Data Science (HKU-IDS), as well as HKU-MMLab, The University of Hong Kong, supervised by Prof. Xihui Liu.

I received my B.Eng. degree from the University of Electronic Science and Technology of China (UESTC) and my MPhil degree from The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), supervised by Prof. Xiaoguang Han. Before joining HKU-IDS, I spent a wonderful time with great minds and interesting friends at Shanghai AI Laboratory.

My current research interests lie in Visual Agentic Spatial Intelligence, Unified Multimodal Models (UMMs), and World Models. I’m open to potential collaborations; feel free to drop me an email if you are interested.

Publications

Visual Spatial Intelligence

G2TAM: Geometry Grounded Track Anything Model

arXiv 2026

Chenming Zhu, Peizhou Cao, Jingli Lin, Wenbo Hu, Yunlong Ran, Tai Wang, Jiangmiao Pang, Xihui Liu

MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence

arXiv 2026

Jingli Lin*, Runsen Xu*, Shaohao Zhu, Sihan Yang, Peizhou Cao, Yunlong Ran, Miao Hu, Chenming Zhu, Yiman Xie, Yilin Long, Wenbo Hu, Dahua Lin, Tai Wang, Jiangmiao Pang

G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning

CVPR 2026

Wenbo Hu, Jingli Lin, Yilin Long, Yunlong Ran, Lihan Jiang, Yifan Wang, Chenming Zhu, Runsen Xu, Tai Wang, Jiangmiao Pang

MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence

ICLR 2026

Sihan Yang*, Runsen Xu*, Yiman Xie, Sizhe Yang, Mo Li, Jingli Lin, Chenming Zhu, Xiaochen Chen, Haodong Duan, Xiangyu Yue, Dahua Lin, Tai Wang, Jiangmiao Pang

OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding

NeurIPS 2025

Jingli Lin*, Chenming Zhu*, Runsen Xu, Xiaohan Mao, Xihui Liu, Tai Wang, Jiangmiao Pang

Project Lead

Embodied 3D Perception

LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D Capabilities

ICCV 2025

Chenming Zhu, Tai Wang, Wenwei Zhang, Jiangmiao Pang, Xihui Liu

ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities

ECCV 2024

Chenming Zhu, Tai Wang, Wenwei Zhang, Kai Chen, Xihui Liu

MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations

NeurIPS 2024

Ruiyuan Lyu, Tai Wang, Jingli Lin, Shuai Yang, Xiaohan Mao, Yilun Chen, Runsen Xu, Haifeng Huang, Chenming Zhu, Dahua Lin, Jiangmiao Pang

EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI

CVPR 2024

Tai Wang*, Xiaohan Mao*, Chenming Zhu*, Runsen Xu, Ruiyuan Lyu, Peisen Li, Xiao Chen, Wenwei Zhang, Kai Chen, Tianfan Xue, Xihui Liu, Cewu Lu, Dahua Lin, Jiangmiao Pang

Vision-Language Navigation (VLN)

StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling

ICRA 2026

Meng Wei, Chenyang Wan, Xiqian Yu, Tai Wang, Yuqiang Yang, Xiaohan Mao, Chenming Zhu, Wenzhe Cai, Hanqing Wang, Yilun Chen, Xihui Liu, Jiangmiao Pang

InternVLA-N1: An Open Dual-System Vision-Language Navigation Foundation Model with Learned Latent Plans

Technical report 2026

Core Contributor

Projects

MMDetection3D

OpenMMLab's next-generation platform for general 3D perception. (GitHub > 5k stars)

Core Maintainer & Developer

  • MMDetection3D unifies the pipeline and modular design of mono3D, LiDAR-based, and multi-modality 3D object detection.
  • It supports state-of-the-art 3D object detectors of different modalities on multiple indoor and outdoor datasets.
  • It builds strong foundations, in a universal framework, for general 3D object detection.

Honors and Awards

  • 2022 2nd place in the Waymo 3D Camera-only Detection Challenge 2022
  • 2017-2018 / 2018-2019 Excellent Undergraduate Scholarship of UESTC
  • 2018 Outstanding Student Award of School of Computer Science and Engineering, UESTC

Academic Services

I have served as a reviewer for CVPR, ICCV, ECCV, NeurIPS, AAAI, ICLR, and ICML.