Abstract
Despite advances in embodied AI, agent reasoning systems still struggle to capture the fundamental conceptual structures that humans naturally use to understand and interact with their environment. To address this, we propose a novel framework that bridges embodied cognition theory and agent systems by leveraging a formal characterization of image schemas, which are defined as recurring patterns of sensorimotor experience that structure human cognition. By customizing LLMs to translate natural language descriptions into formal representations based on these sensorimotor patterns, we will be able to create a neurosymbolic system that grounds the agent's understanding in fundamental conceptual structures. We argue that such an approach enhances both efficiency and interpretability while enabling more intuitive human-agent interactions through shared embodied understanding.