Foundation Models Meet Embodied Agents
@ CVPR 2026 Workshop
Wed June 3rd, 2026, Room TBD
at Denver, Colorado, US
Recent advances in foundation models, including Large Language Models (LLMs), Vision–Language Models (VLMs), and Vision–Language–Action Models (VLAs), have enabled embodied agents to perform a wide range of tasks in real-world and simulated environments. However, fine-grained visual perception and long-horizon reasoning remain significant barriers to reliable embodied decision-making.
In this workshop, we aim to bring together researchers from computer vision, robotics, and natural language processing to advance grounded perception, planning, and action for embodied intelligence. We focus on a unified decision-making pipeline, spanning goal understanding, subgoal decomposition, action sequencing, and transition modeling, to enable scalable and generalizable embodied agents.
Topics of Interest
We welcome contributions including, but not limited to, the following directions:
Submission Guidelines
Paper Types
We welcome submissions covering:
Formats & Rules
We host multiple evaluation tracks to benchmark embodied intelligence:
All deadlines are 11:59 pm UTC-12 (“Anywhere on Earth”, AoE).
| Milestone | Date |
|---|---|
| Submission Deadline | May 1st 2026 (23:59 AoE) |
| Call for Program Committee Members | May 1st 2026 (23:59 AoE) |
| Decision Notifications | May 18th 2026 (23:59 AoE) |
| Camera-Ready Deadline (Non-Archival) | May 25th 2026 (23:59 AoE) |
| Workshop Date | June 3rd 2026 |
| Time | Program |
|---|---|
| 09:00–09:10 | Opening Remarks |
| 09:10–09:40 | Keynote 1 - Kristen Grauman (UT Austin) |
| 09:40–10:10 | Keynote 2 - Yunzhu Li (Columbia University) |
| 10:10–10:40 | Keynote 3 - Wei-Chiu Ma (Cornell University) |
| 10:40–11:30 | Spotlight Session (6 min talks) |
| 11:30–12:30 | Poster Session |
| 12:30–13:30 | Student Mentoring Lunch Session |
| 13:30–14:00 | Keynote 4 - Lingjie Liu (University of Pennsylvania) |
| 14:00–14:30 | Keynote 5 - Yuke Zhu (UT Austin) |
| 14:30–15:00 | Keynote 6 - Xiaolong Wang (UC San Diego) |
| 15:00–15:50 | Panel Discussion |
| 15:50–17:10 | Oral Presentations (12 min talk + 3 min Q&A) |
| 17:10–17:30 | Best Paper Presentation (15 min talk + 5 min Q&A) |
| 17:30–17:40 | Closing Remarks |
Accepted papers will be announced after the review process.