
Outline and Objectives
Human motion analysis through multimodal inputs has emerged as an active research topic in recent years. This workshop brings the community together to discuss how to unify advances in visual, depth, inertial, and physiological sensing for robust motion understanding across domains and embodiments.
The workshop has two focus points. The first centers on multimodal perception, covering unified representations, temporal reasoning, and data-efficient learning for action analysis. The second explores embodied, human-centered intelligence, including foundation models, edge-efficient deployment, and responsible evaluation. By bridging multimodal perception and human-centered intelligence, the workshop promotes interdisciplinary dialogue toward robust, generalizable, and ethically aligned frameworks for real-world human motion understanding.
Topics of Interest
The topics covered in the workshop include, but are not limited to:
- Multimodal human action and behavior analysis from visual, depth, inertial, and physiological data.
- Cross-sensor fusion and alignment for motion understanding.
- Multimodality and robustness.
- Temporal reasoning and long-term modeling of human activities and interactions.
- Advances in human motion representations for multimodal motion understanding.
- Human-centric foundation and generative models (e.g., diffusion, transformers, LLMs).
- Self-supervised, weakly supervised, and unsupervised learning methods for data-efficient and cross-domain generalization.
- Advances in edge-deployable and energy-efficient AI models for real-time human sensing.
- Robustness to occlusions, crowded scenes, and domain or subject variability.
- Responsible and human-centered evaluation: fairness, bias mitigation, privacy, and transparency.
- Applications in healthcare, rehabilitation, sports performance, workplace safety, Extended Reality (XR), and robotics.
Invited Speakers
Jianfei Yang
Talk Title: Multimodal Foundation Model for Language-Grounded Human Sensing and Reasoning
Bio: Jianfei Yang is an Assistant Professor at Nanyang Technological University (NTU), where he leads the Multimodal AI and Robotic Systems (MARS) Lab. His research focuses on Human-Centric Physical AI and Embodied AI, integrating multimodal sensing, foundation models, and robotics for real-world applications such as human sensing, activity understanding, and intelligent interaction.
Ronald Poppe
Talk Title: Temporal Coordination in Fine-Grained Analysis of Parent-Child Interactions
Bio: Ronald Poppe is an associate professor in the Information and Computing Sciences Department of Utrecht University. His research interests center around the analysis of human (interactive) behavior from videos and other sensors, with applications in media analysis and generation and in the clinical domain. He received his Ph.D. from the University of Twente, The Netherlands (2009) and was a visiting researcher at Delft University of Technology, Stanford University, and Lancaster University. He is a Senior Member of the IEEE.
Thomas Ploetz
Talk Title: Sensor-Based Human Activity Recognition as the Basis for Effective Health and Wellbeing Assessments
Bio: Thomas Ploetz is a Computer Scientist with decades of experience in Pattern Recognition and Machine Learning research (PhD from Bielefeld University, Germany). His core research lies in wearable and ubiquitous computing, with a specific focus on computational behavior analysis: the automated analysis of what people are doing and how this changes over time, based on multimodal time series data captured by sensors that are either body-worn or integrated into the built environment. The main driving function for his work is "in the wild" deployments and, as such, the development of systems and methods that have a real impact on people's lives. He is a Professor of Computing in the School of Interactive Computing at the Georgia Institute of Technology in Atlanta, USA, where he leads the Computational Behavior Analysis research lab (cba.gatech.edu) and serves as Associate Chair for Graduate Studies. Thomas is a passionate educator who regularly teaches (very) large classes on Artificial Intelligence and on Mobile and Ubiquitous Computing at Georgia Tech, and worldwide through guest lectures and keynotes. He has been very active in the mobile, ubiquitous, and wearable computing community. He is co-editor-in-chief of the Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), the flagship ACM journal in the field, has twice been co-chair of the technical program committee of the International Symposium on Wearable Computing (ISWC), and was general co-chair of the 2022 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp). Thomas is a Distinguished Member of the ACM.
Organizers
Olivia Nocentini
Postdoctoral Researcher at the Italian Institute of Technology
e-mail: olivia.nocentini@iit.it

Rishabh Dabral
Research Group Leader at the Max Planck Institute for Informatics
e-mail: rdabral@mpi-inf.mpg.de

Niaz Ahmad
Postdoctoral Research Fellow in the CVIS Lab at Toronto Metropolitan University

Marta Lorenzini
Senior Technician at the Italian Institute of Technology
e-mail: marta.lorenzini@iit.it

Arash Ajoudani
Director of the Human-Robot Interfaces and Interaction Laboratory at the Italian Institute of Technology
e-mail: arash.ajoudani@iit.it