Zhipeng Bao

Hey, my name is Zhipeng Bao. I am currently a researcher at Google, training Google's on-device visual foundation models.

I obtained my PhD at the Robotics Institute, Carnegie Mellon University, supervised by Prof. Martial Hebert. I also earned my master's degree (MSR) at CMU, supervised by Prof. Martial Hebert. During my time at CMU, I worked closely with Prof. Yu-Xiong Wang and Dr. Pavel Tokmakov. I received my bachelor's degree in Electronic Engineering from Tsinghua University.

I enjoy exploring food, traveling, and capturing photos. A few attempts are in my old Gallery.

Email / CV / Scholar / GitHub

Research

My research interest lies in free-form visual generation (e.g., T2I, perception, and editing), spanning both image and video domains.

	Walk through Paintings: Egocentric World Models from Internet Priors Anurag Bagchi, Zhipeng Bao, Yu-Xiong Wang, Pavel Tokmakov, Martial Hebert Under review, 2026 project page / paper / code
	UniGen-AR: Unifying Visual Generation with Auto-Regressive Modeling Zhipeng Bao, Zhen Zhu, Nupur Kumari, Anurag Bagchi, Yu-Xiong Wang, Pavel Tokmakov, Martial Hebert Under review, 2026 paper
	ReferEverything: Towards segmenting everything we can speak of in videos Anurag Bagchi, Zhipeng Bao, Yu-Xiong Wang, Pavel Tokmakov, Martial Hebert ICCV, 2025 project page / paper / code
	Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models Shuhong Zheng, Zhipeng Bao, Ruoyu Zhao, Martial Hebert, Yu-Xiong Wang ICLR, 2025 project page / paper
	Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding Yunze Man, Shuhong Zheng, Zhipeng Bao, Martial Hebert, Liangyan Gui, Yu-Xiong Wang NeurIPS, 2024 project page / code / paper
	Separate-and-Enhance: Compositional Finetuning with Text-to-Image Diffusion Models Zhipeng Bao, Yijun Li, Krishna Kumar Singh, Yu-Xiong Wang, Martial Hebert SIGGRAPH, 2024 project page / code / paper
	Multi-task View Synthesis with Neural Radiance Fields Shuhong Zheng, Zhipeng Bao, Martial Hebert, Yu-Xiong Wang ICCV, 2023 project page / code / paper
	Object Discovery from Motion-Guided Tokens Zhipeng Bao, Pavel Tokmakov, Yu-Xiong Wang, Adrien Gaidon, Martial Hebert CVPR, 2023 project page / code / paper
	Beyond RGB: Scene-property Synthesis with Neural Radiance Fields Mingtong Zhang, Shuhong Zheng, Zhipeng Bao, Martial Hebert, Yu-Xiong Wang WACV, 2023 code / paper
	Generative Modeling for Multi-task Visual Learning Zhipeng Bao, Martial Hebert, Yu-Xiong Wang ICML, 2022 code / paper
	Discovering Objects that Can Move Zhipeng Bao, Pavel Tokmakov, Allan Jabri, Yu-Xiong Wang, Adrien Gaidon, Martial Hebert CVPR, 2022 project page / code / paper
	Bowtie Networks: Generative Modeling for Joint Few-Shot Recognition and Novel-View Synthesis Zhipeng Bao, Yu-Xiong Wang, Martial Hebert ICLR, 2021 code / paper
	Single-Image Facial Expression Recognition Using Deep 3D Re-Centralization Zhipeng Bao, Shaodi You, Gu Lin, Zhenglu Yang ICCV Workshops, 2019 paper
	A Joint Method for Marker-free Alignment of Tilt Series in Electron Tomography Renmin Han, Zhipeng Bao, Xiangrui Zeng, Tongxin Niu, Min Xu, Xin Gao ISMB, 2019 paper

Misc

Internships	[05/2024-08/2024]: GenAI, Meta, Research Intern, hosted by Xiaofang Wang
	[05/2023-08/2023]: Adobe Research, Research Intern, hosted by Yijun Li
	[05/2022-08/2022]: Toyota Research Institute, Research Intern, hosted by Pavel Tokmakov
	[05/2021-08/2021]: Toyota Research Institute, Research Intern, hosted by Pavel Tokmakov
Reviewing	Since 2020: CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML
Teaching	TA, CMU 16-824: Visual Learning and Recognition, Fall 2023
	TA, CMU 16-385: Computer Vision, Spring 2023

Many thanks to Jon Barron for sharing the initial template.

Last update: March 2026.
