
Call is OPEN
Deadline: 1st of May 2026
Notification of acceptance: 12th of May 2026
Camera-ready submissions: 22nd of May 2026
For this upcoming workshop, we invite original submissions presenting innovative ideas, creative approaches, and rigorous methodologies that advance the state of the art in motion analysis, in the form of:
- Extended abstract papers of 1-2 pages (references do not count toward the page limit), formatted in IEEE conference style. Authors may optionally include a supplementary video to accompany their submission.
- Standalone videos with a maximum length of 3 minutes.
Selected extended abstracts and videos will have the opportunity to be archived on this website and will be promoted to a broader audience through various media channels.
Authors of accepted workshop papers will be required to give a spotlight teaser presentation, accompanied by a poster presentation during the interactive session at the coffee break.
Please submit your paper/video via the following link
Outline and Objectives
The perception, reconstruction, and synthesis of human motion have long been central topics in computer vision. Over the past decade, remarkable progress in vision-based human motion understanding has been enabled by the availability of large-scale datasets and the rise of powerful foundation models trained on them. These developments have substantially advanced our ability to model human pose, dynamics, and interaction from visual input alone.
Yet, human motion is inherently multimodal. It is not only seen but also felt and heard, and it can be measured with a variety of devices. Recent research has increasingly explored the integration of diverse sensing modalities, from wearable devices such as IMUs and insoles to non-visual signals like WiFi and sound. This multimodal shift opens new possibilities for building richer, more holistic, context-aware representations of human behavior, while also posing open challenges in cross-sensor alignment, temporal reasoning, and data-efficient learning. Moreover, each sensing modality comes with its own limitations. Thus, there is a growing need to connect multimodal sensing and motion understanding within a unified framework.
The Workshop on Multimodal Human Motion Analysis (MOMA) aims to catalyze this integration. Bringing together researchers from robotics, multimodal learning, and perception, it provides a forum to discuss new methodologies, benchmarks, and frameworks for robust, generalizable, and ethically aligned motion understanding. The workshop focuses on two complementary areas: multimodal perception, covering unified representations, temporal reasoning, and data-efficient learning for action analysis, and embodied, human-centered intelligence, addressing foundation models, edge-efficient deployment, and responsible evaluation. Through invited talks and panel discussions, MOMA highlights emerging directions and fosters interdisciplinary dialogue toward real-world, human-centered motion understanding.
Topics of interest
The topics covered in the workshop include, but are not limited to:
- Multimodal human action and behavior analysis from visual, depth, inertial, and physiological data.
- Cross-sensor fusion and alignment for motion understanding.
- Multimodality and robustness in human motion analysis.
- Temporal reasoning and long-term modeling of human activities and interactions.
- Advances in human motion representations for multimodal human motion understanding.
- Human-centric foundation and generative models (e.g., diffusion models, transformers, LLMs).
- Self-supervised, weakly supervised, and unsupervised learning methods for data-efficient and cross-domain generalization.
- Advances in edge-deployable and energy-efficient AI models for real-time human sensing.
- Robustness to occlusions, crowded scenes, and domain or subject variability.
- Responsible and human-centered evaluation: fairness, bias mitigation, privacy, and transparency.
- Applications in healthcare, rehabilitation, sports performance, workplace safety, Extended Reality (XR), and robotics.
Invited Speakers
Jianfei Yang
Talk Title: Multimodal Foundation Model for Language-Grounded Human Sensing and Reasoning
Bio: Jianfei Yang is an Assistant Professor at Nanyang Technological University (NTU), where he leads the Multimodal AI and Robotic Systems (MARS) Lab. His research focuses on Human-Centric Physical AI and Embodied AI, integrating multimodal sensing, foundation models, and robotics for real-world applications such as human sensing, activity understanding, and intelligent interaction.
Ronald Poppe
Talk Title: Temporal Coordination in Fine-Grained Analysis of Parent-Child Interactions
Bio: Ronald Poppe is an associate professor in the Information and Computing Sciences Department of Utrecht University. His research interests center around the analysis of human (interactive) behavior from videos and other sensors, with applications in media analysis and generation, and in the clinical domain. He received a Ph.D. from the University of Twente, The Netherlands (2009) and was a visiting researcher at the Delft University of Technology, Stanford University, and University of Lancaster. He is a senior member of the IEEE.
Thomas Ploetz
Talk Title: Sensor-Based Human Activity Recognition as the Basis for Effective Health and Wellbeing Assessments
Bio: Thomas Ploetz is a Computer Scientist with expertise and decades of experience in Pattern Recognition and Machine Learning research (PhD from Bielefeld University, Germany). His core research lies in the field of wearable and ubiquitous computing, with a specific focus on computational behavior analysis: the automated analysis of what people are doing and how this changes over time, based on multimodal time series data captured using a range of sensors that are either body-worn or integrated into the built environment. He works as a Professor of Computing at the School of Interactive Computing at the Georgia Institute of Technology in Atlanta, USA, where he leads the Computational Behavior Analysis research lab (cba.gatech.edu).
Suining Henry He
Talk Title: Human-Mobility Interaction: A Multimodal Tale of Micromobility
Bio: Suining Henry He is an Associate Professor (with Tenure) at the School of Computing, University of Connecticut (UConn), where he has been on the faculty since 09/2019, previously as a Tenure-Track Assistant Professor. He leads UConn's Ubiquitous and Urban Computing Lab. Before joining UConn, he worked as a postdoctoral research fellow at the Real-Time Computing Lab (RTCL), University of Michigan. His research interests include Human-centered AI, GeoAI, and AI of Things.
Organizers
Olivia Nocentini
Postdoctoral Researcher at the Italian Institute of Technology
e-mail: olivia.nocentini@iit.it

Rishabh Dabral
Research Group Leader at the Max Planck Institute for Informatics
e-mail: rdabral@mpi-inf.mpg.de

Niaz Ahmad
Postdoctoral Research Fellow in the CVIS Lab at Toronto Metropolitan University

Marta Lorenzini
Senior Technician at the Italian Institute of Technology
e-mail: marta.lorenzini@iit.it

Arash Ajoudani
Director of the Human-Robot Interfaces and Interaction Laboratory at the Italian Institute of Technology
e-mail: arash.ajoudani@iit.it
Acknowledgement
This work was supported by the Italian Workers’ Compensation Authority INAIL within the VIVA project.