Foundation Models Meet Embodied Agents
@ CVPR 2026 Workshop
Thu June 4th, 2026, Room 703
in Denver, Colorado, US
Recent advances in foundation models, including Large Language Models (LLMs), Vision–Language Models (VLMs), and Vision–Language–Action Models (VLAs), have enabled embodied agents to perform a wide range of tasks in real-world and simulated environments. However, fine-grained visual perception and long-horizon reasoning remain significant barriers to reliable embodied decision-making.
In this workshop, we aim to bring together researchers from computer vision, robotics, and natural language processing to advance grounded perception, planning, and action for embodied intelligence. We focus on a unified decision-making pipeline, spanning goal understanding, subgoal decomposition, action sequencing, and transition modeling, to enable scalable and generalizable embodied agents.
Topics of Interest
We welcome contributions including, but not limited to, the following directions:
Submission Guidelines
Paper Types
We welcome submissions covering:
Formats & Rules
We host multiple evaluation tracks to benchmark embodied intelligence:
All deadlines are 11:59 PM UTC-12 (“Anywhere on Earth”).
| Milestone | Date |
|---|---|
| Submission Deadline | |
| Call for Program Committee Members | |
| Decision Notifications | |
| Camera-Ready Deadline (Non-Archival) | |
| Workshop Date | June 4th, 2026 |
Tentative schedule: the workshop is a half-day event this year, so the timeline below will be adjusted accordingly.
| Time | Program |
|---|---|
| 09:00–09:10 | Opening Remarks |
| 09:10–09:40 | Keynote 1 - Kristen Grauman (UT Austin) |
| 09:40–10:10 | Keynote 2 - Yunzhu Li (Columbia University) |
| 10:10–10:40 | Keynote 3 - Wei-Chiu Ma (Cornell University) |
| 10:40–11:30 | Spotlight Session (6 min talks) |
| 11:30–12:30 | Poster Session |
| 12:30–13:30 | Student Mentoring Lunch Session |
| 13:30–14:00 | Keynote 4 - Lingjie Liu (University of Pennsylvania) |
| 14:00–14:30 | Keynote 5 - Yuke Zhu (UT Austin) |
| 14:30–15:00 | Keynote 6 - Xiaolong Wang (UC San Diego) |
| 15:00–15:50 | Panel Discussion |
| 15:50–17:10 | Oral Presentations (12 min talk + 3 min Q&A) |
| 17:10–17:30 | Best Paper Presentation (15 min talk + 5 min Q&A) |
| 17:30–17:40 | Closing Remarks |
Accepted papers will be announced after the review process.