Foundation Models Meet Embodied Agents

@ CVPR 2026 Workshop

Wed June 3rd, 2026, Room TBD

at Denver, Colorado, US


Call for Papers

Recent advances in foundation models, including Large Language Models (LLMs), Vision–Language Models (VLMs), and Vision–Language–Action Models (VLAs), have supported embodied agents in performing a wide range of tasks in real-world and simulated environments. However, challenges such as fine-grained visual perception and long-horizon reasoning remain significant barriers to reliable embodied decision-making.

In this workshop, we aim to bring together researchers from computer vision, robotics, and natural language processing to advance grounded perception, planning, and action for embodied intelligence. We focus on a unified decision-making pipeline, spanning goal understanding, subgoal decomposition, action sequencing, and transition modeling, to enable scalable and generalizable embodied agents.

Topics of Interest

We welcome contributions including, but not limited to, the following directions:

  • Long-horizon reasoning & planning
  • Spatial intelligence & physical understanding
  • World models, memory, and interaction
  • Vision-language-action learning and evaluation
  • Benchmarks, datasets, and evaluation protocols for embodied agents

Submission Guidelines

Paper Types

We welcome submissions covering:

  • Research papers: Long papers (8 pages) showcasing novel findings, methods, or theoretical advancements.
  • Short/Abstract papers: Exploratory work (4 pages or 2 pages, excluding references) that may be preliminary but presents innovative concepts, early results, or thought-provoking viewpoints that stimulate discussion and future work.
  • Position papers: Critical perspectives on trends and challenges within the field (at least 8 pages).
  • Survey papers: Thorough reviews of specific topics that map the current research landscape and suggest directions for future exploration (at least 8 pages).

Formats & Rules

  • All types allow unlimited references and appendices.
  • Submissions should follow CVPR two-column style and be anonymous; see the CVPR-26 author kit for details.
  • Please submit through the OpenReview submission portal (TBD).
  • Contributions will be non-archival but hosted on our workshop website, so dual submission is allowed where permitted by third parties. We welcome papers that are under review at, or accepted by, other conferences; if this applies to your paper, please state it in the last sentence of the abstract. Paper awards will give preference to original submissions.

Challenge / Benchmark Track

We host multiple evaluation tracks to benchmark embodied intelligence:

  • ENAcT – a new challenge that evaluates the embodied cognition of VLMs through world modeling of egocentric interaction
  • EmbodiedBench – comprehensive benchmarking of VLM-based embodied agents across perception, reasoning, and action
  • Embodied Agent Interface (EAI) – evaluating LLM-based agents on goal interpretation, subgoal decomposition, action sequencing, and transition modeling

Important Dates

All deadlines are 11:59 pm UTC-12 (“Anywhere on Earth”, AoE).

Submission Deadline: May 1st, 2026 (23:59 AoE)
Call for Program Committee Members: May 1st, 2026 (23:59 AoE)
Decision Notifications: May 18th, 2026 (23:59 AoE)
Camera-Ready Deadline (Non-Archival): May 25th, 2026 (23:59 AoE)
Workshop Date: June 3rd, 2026
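
For authors in other time zones, the following minimal Python sketch shows how a 23:59 AoE deadline maps to UTC. It assumes only that AoE is a fixed UTC-12 offset, as stated above; the `DEADLINES` table and the `deadline_in_utc` helper are illustrative names, not part of any official tooling.

```python
from datetime import datetime, timedelta, timezone

# "Anywhere on Earth" (AoE) is a fixed UTC-12 offset with no daylight saving.
AOE = timezone(timedelta(hours=-12), name="AoE")

# Dates copied from the list above; each deadline falls at 23:59 AoE.
DEADLINES = {
    "Submission Deadline": (2026, 5, 1),
    "Decision Notifications": (2026, 5, 18),
    "Camera-Ready Deadline": (2026, 5, 25),
}


def deadline_in_utc(year, month, day):
    """Return 23:59 AoE on the given date as a timezone-aware UTC datetime."""
    aoe_deadline = datetime(year, month, day, 23, 59, tzinfo=AOE)
    return aoe_deadline.astimezone(timezone.utc)


for name, ymd in DEADLINES.items():
    print(f"{name}: {deadline_in_utc(*ymd):%Y-%m-%d %H:%M} UTC")
```

Because AoE is twelve hours behind UTC, each deadline lands at 11:59 UTC on the following calendar day; calling `.astimezone()` with no argument on the result converts it to the local time zone of the machine running the script.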

Schedule

Time Program
09:00–09:10 Opening Remarks
09:10–09:40 Keynote 1 - Kristen Grauman (UT Austin)
09:40–10:10 Keynote 2 - Yunzhu Li (Columbia University)
10:10–10:40 Keynote 3 - Wei-Chiu Ma (Cornell University)
10:40–11:30 Spotlight Session (6 min talks)
11:30–12:30 Poster Session
12:30–13:30 Student Mentoring Lunch Session
13:30–14:00 Keynote 4 - Lingjie Liu (University of Pennsylvania)
14:00–14:30 Keynote 5 - Yuke Zhu (UT Austin)
14:30–15:00 Keynote 6 - Xiaolong Wang (UC San Diego)
15:00–15:50 Panel Discussion
15:50–17:10 Oral Presentations (12 min talk + 3 min Q&A)
17:10–17:30 Best Paper Presentation (15 min talk + 5 min Q&A)
17:30–17:40 Closing Remarks

Accepted Papers

Accepted papers will be announced after the review process.

Organizers

Organizing Committee @ CVPR26

Qineng Wang

Northwestern University

Manling Li

Northwestern University

Kangrui Wang

Northwestern University

Canyu Chen

Northwestern University

Ruohan Zhang

Stanford & Northwestern University

Wenlong Huang

Stanford University

Jiayuan Mao

UPenn

Weiyu Liu

University of Utah

Yining Hong

Stanford University

Jiatao Gu

UPenn

Zhiwen Fan

Texas A&M University

Challenge Committee

Kangrui Wang

Northwestern University

Rui Yang

UIUC

Tianwei Bao

Northwestern University

Qineng Wang

Northwestern University

Steering Committee @ CVPR26

Yejin Choi

Stanford University

Jiajun Wu

Stanford University

Li Fei-Fei

Stanford University

Sponsors

Sponsors to be announced

Contact

Please email cvpr2026-fmea-workshop@googlegroups.com if you have any questions.