Foundation Models Meet Embodied Agents

@ CVPR 2026 Workshop

Thu June 4th, 2026, Room 703

at Denver, Colorado, US


Call for Papers

Recent advances in foundation models, including Large Language Models (LLMs), Vision–Language Models (VLMs), and Vision–Language–Action Models (VLAs), have enabled embodied agents to perform a wide range of tasks in real-world and simulated environments. However, fine-grained visual perception and long-horizon reasoning remain significant barriers to reliable embodied decision-making.

In this workshop, we aim to bring together researchers from computer vision, robotics, and natural language processing to advance grounded perception, planning, and action for embodied intelligence. We focus on a unified decision-making pipeline, spanning goal understanding, subgoal decomposition, action sequencing, and transition modeling, to enable scalable and generalizable embodied agents.

Topics of Interest

We welcome contributions including, but not limited to, the following directions:

  • Long-horizon reasoning & planning
  • Spatial intelligence & physical understanding
  • World models, memory, and interaction
  • Vision-language-action learning and evaluation
  • Benchmarks, datasets, and evaluation protocols for embodied agents

Submission Guidelines

Paper Types

We welcome submissions covering:

  • Research papers: Long papers (8 pages) showcasing novel findings, methods, or theoretical advancements.
  • Short/Abstract papers: Feature exploratory work (4 pages or 2 pages, excluding references) that may be preliminary but presents innovative concepts, early results, or thought-provoking viewpoints that stimulate discussion and future work.
  • Position papers: Offer critical perspectives on trends and challenges within the field (no fewer than 8 pages).
  • Survey papers: Provide thorough reviews of specific topics, mapping the current research landscape and suggesting directions for future exploration (no fewer than 8 pages).

Formats & Rules

  • All types allow unlimited references and appendices.
  • Submissions should follow CVPR two-column style and be anonymous; see the CVPR-26 author kit for details.
  • Please submit through the OpenReview submission portal.
  • Contributions are non-archival but will be hosted on our workshop website, so dual submission is allowed where permitted by third parties. We welcome papers that are under review at, or have been accepted by, other conferences; if this applies to your paper, please say so in the last sentence of the abstract. Paper awards will give preference to original submissions.

Challenge / Benchmark Track

We host multiple evaluation tracks to benchmark embodied intelligence:

  • ENACT – a new challenge on evaluating embodied cognition of VLMs with world modeling of egocentric interaction
  • EmbodiedBench – comprehensive benchmarking of VLM-based embodied agents across perception, reasoning, and action
  • Embodied Agent Interface (EAI) – evaluating LLM-based agents on goal interpretation, subgoal decomposition, action sequencing, and transition modeling
  • RoboMME – a new challenge for robotic generalist policies in a diverse set of memory-critical manipulation tasks

Important Dates

All deadlines are 11:59 pm UTC-12h (“Anywhere on Earth”).

Submission Deadline May 10th 2026 (11:59 pm AoE), extended from May 1st 2026
Call for Program Committee Members May 10th 2026 (11:59 pm AoE), extended from May 1st 2026
Decision Notifications May 25th 2026 (11:59 pm AoE), moved from May 18th 2026
Camera-Ready Deadline (Non-Archival) May 30th 2026 (11:59 pm AoE), moved from May 25th 2026
Workshop Date June 4th 2026
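
For authors in other time zones, the "Anywhere on Earth" convention above can be checked with a few lines of Python. This is only a minimal sketch: the helper names `aoe_deadline_to_utc` and `is_before_deadline` are made up for illustration, and the date used is one of the dates listed above. The OpenReview portal remains the authority on whether a submission is on time.

```python
from datetime import datetime, timedelta, timezone

# "Anywhere on Earth" is a fixed UTC-12 offset with no daylight saving time.
AOE = timezone(timedelta(hours=-12), name="AoE")


def aoe_deadline_to_utc(year: int, month: int, day: int) -> datetime:
    """Return the UTC instant corresponding to 11:59 pm AoE on the given date."""
    local = datetime(year, month, day, 23, 59, tzinfo=AOE)
    return local.astimezone(timezone.utc)


def is_before_deadline(now_utc: datetime, year: int, month: int, day: int) -> bool:
    """Return True if `now_utc` (timezone-aware) is at or before the AoE deadline."""
    return now_utc <= aoe_deadline_to_utc(year, month, day)


if __name__ == "__main__":
    # Illustrative date taken from the list above.
    print(aoe_deadline_to_utc(2026, 5, 10).isoformat())
```

Because AoE is twelve hours behind UTC, a deadline of 11:59 pm AoE on a given date falls at 11:59 UTC on the following calendar day.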

Speakers

  • Kristen Grauman (UT Austin)
  • Wei-Chiu Ma (Cornell University)
  • Xudong Wang (Physical Intelligence)
  • Kaichun Mo (NVIDIA)
  • An-Chieh Cheng (UC San Diego)

Schedule

Tentative. All times are in the afternoon, Denver local time (Mountain Time).

Time Program
1:00–1:05 PM Opening
1:05–1:45 PM Invited Talk 1 - Kristen Grauman (UT Austin)
1:45–2:25 PM Invited Talk 2 - Wei-Chiu Ma (Cornell University)
2:25–3:05 PM Invited Talk 3 - Xudong Wang (Physical Intelligence)
3:05–3:55 PM Contributed Talks
3:55–4:30 PM Poster Session / Coffee Break
4:30–5:10 PM Invited Talk 4 - Kaichun Mo (NVIDIA)
5:10–5:50 PM Invited Talk 5 - An-Chieh Cheng (UC San Diego)
5:50–6:00 PM Award Ceremony + Closing Remarks

Accepted Papers

Congratulations to the authors of the 34 papers accepted to FMEA @ CVPR 2026. All accepted papers are non-archival and will be presented in the poster session.

Organizers

Organizing Committee @ CVPR26

  • Qineng Wang (Northwestern University)
  • Kangrui Wang (Northwestern University)
  • Canyu Chen (Northwestern University)
  • Pingyue Zhang (Northwestern University)
  • Ruohan Zhang (Stanford & Northwestern University)
  • Wenlong Huang (Stanford University)
  • Jiayuan Mao (UPenn)
  • Weiyu Liu (University of Utah)
  • Yining Hong (Stanford University)
  • Jiatao Gu (UPenn)
  • Zhiwen Fan (Texas A&M University)
  • Manling Li (Northwestern University)

Challenge Committee

  • Kangrui Wang (Northwestern University)
  • Rui Yang (UIUC)
  • Qineng Wang (Northwestern University)
  • Yinpei Dai (University of Michigan)

Steering Committee @ CVPR26

  • Yejin Choi (Stanford University)
  • Jiajun Wu (Stanford University)
  • Li Fei-Fei (Stanford University)

Contact

Please email cvpr2026-fmea-workshop@googlegroups.com if you have any questions.