Submission Topics
An embodied agent is a generalist agent that can take natural language instructions from humans and perform a wide range of tasks in diverse environments. Recent years have witnessed the emergence of Large Language Models as powerful tools for building Large Agent Models, which have shown remarkable success in supporting embodied agents across abilities such as goal interpretation, subgoal decomposition, action sequencing, and transition modeling (causal transitions from preconditions to post-effects).
However, moving from Foundation Models to Embodied Agents poses significant challenges in understanding lower-level visual details and in long-horizon reasoning for reliable embodied decision-making. We will cover the advances of foundation models across Large Language Models, Vision-Language Models, and Vision-Language-Action Models. In this tutorial, we will comprehensively review existing paradigms of foundation models for embodied agents, focus on their different formulations based on the fundamental mathematical framework of robot learning, the Markov Decision Process (MDP), and present a structured view for investigating the robot’s decision-making process.
We welcome submissions on all topics related to Foundation Models and their interactions with Embodied Agents. We will also announce a Best Paper Award at our workshop.
Submission Instructions
We solicit long papers (8 pages), short papers (4 pages), and abstract papers (2 pages), with unlimited references/appendices. The contributions will be non-archival but will be hosted on our workshop website. Submissions should be formatted in CVPR two-column style and should be anonymous; see the CVPR-25 author kit for details. Please submit through the OpenReview submission portal.
Please email cvpr2025-foundationmodel-embodied@googlegroups.com if you have any questions.