The aim of Event-Driven Perception for Robotics (EDPR) is to make robots fully autonomous, both energetically and computationally. This requires simultaneously reducing power consumption, minimising the total number of bits processed (and hence transmitted, stored, etc.) and developing efficient computation.
Autonomy brings a number of technical requirements, depending on whether we consider power and computational autonomy, which leads to untethered machines, or behavioural autonomy, whereby robots take decisions autonomously based on set goals and real-time interaction with the environment. In practice, the key to any form of autonomy is information encoding. Efficient encoding of sensory signals yields an optimal representation of information: it reduces the cost of acquiring, transmitting and storing unnecessary data, while allowing better extraction of relevant information, which in turn enables more robust decision-making.
Biological systems are autonomous in the sense described above: evolution has developed computational strategies for making sense of noisy, ambiguous external signals and producing appropriate behaviour in real time, at the lowest possible energetic cost, using an inhomogeneous computational substrate made of slow and stochastic elements. These properties make biological systems a reference point for the realisation of robots that face similar computational and energetic constraints and must replicate basic human skills for reliable, robust interaction with the environment and cooperation with humans.
Event-Driven Perception for Robotics will induce a paradigm shift in robotics. Built on the biologically inspired, emerging concept of event-driven sensing and processing, it leads to robots that acquire, transmit and process information only when needed, optimising the use of resources and enabling real-time, low-cost operation.
We investigate two main sensing modalities, touch and vision, with the long-term goal of progressively substituting most of the iCub's sensors with their event-driven (ED) counterparts.

In the visual domain, we work on improving pixel functionality, noise resilience and size, and on developing data serialisation, a crucial step towards the integration of higher-resolution sensors on the robot. We proposed a novel, more robust circuit for change detection in the visual signal, designed to tackle one of the major drawbacks of change detection: it filters high-frequency noise without low-pass limiting the response to large and fast transients.

The sparseness of tactile input over space and time calls for ED encoding, in which the sensors are not continuously sampled but rather wake up upon stimulation. The iCub is currently equipped with capacitive sensors; at the same time, different groups within IIT are developing new materials and technologies for tactile transducers. This line of research aims at complementing such developments with neuromorphic ED readout circuits for tactile sensing, based on POSFET devices.
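To make the change-detection principle concrete, the following is a minimal, idealised software sketch of an ED pixel: it emits ON/OFF events whenever the log intensity drifts by more than a contrast threshold from the level memorised at the last event. The threshold value and the moving-average pre-filter standing in for the analogue noise filter are illustrative assumptions, not the proposed circuit itself.

```python
import numpy as np

def change_detection_events(log_intensity, timestamps, theta=0.15):
    """Emit (time, polarity) events when the log intensity drifts by more
    than the contrast threshold `theta` from the reference level memorised
    at the last event, as in an ED change-detection pixel."""
    events = []
    reference = log_intensity[0]
    for t, x in zip(timestamps[1:], log_intensity[1:]):
        while x - reference >= theta:   # brightness increase -> ON event
            reference += theta
            events.append((t, +1))
        while reference - x >= theta:   # brightness decrease -> OFF event
            reference -= theta
            events.append((t, -1))
    return events

# A fast step buried in high-frequency noise: a short moving average (a
# stand-in for the analogue pre-filter) suppresses noise-triggered events,
# while the large, fast transient still fires a burst of ON events.
t = np.linspace(0.0, 1.0, 1000)
signal = np.log1p(np.where(t > 0.5, 2.0, 0.0) + 0.02 * np.random.rand(t.size))
smoothed = np.convolve(signal, np.ones(3) / 3.0, mode="same")
print(change_detection_events(smoothed, t)[:5])
```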
The iCub is progressively updated to integrate ED technology. A modular infrastructure, based on FPGA technology, serialisation and the YARP middleware, supports the seamless integration on the robot of different ED sensors, neuromorphic computational platforms (SpiNNaker and DYNAP) and software modules for ED sensory processing. Amongst the latest developments, we implemented a new vision system integrating upgraded ED and frame-based sensors. The ED sensors, with low spatial resolution, a large field of view and sensitivity to motion, coupled with frame-based sensors with low temporal but high spatial resolution and a small field of view, parallel the organisation of primate foveated vision: the coarse, wide-field periphery detects salient regions in the scene, guiding sequential saccades that bring the region of interest onto the high-acuity fovea for detailed stimulus processing. To explore ED tactile sensing, we are working on emulating ED encoding with the capacitive sensors currently integrated on the iCub. Besides the improvement in communication bandwidth obtained through sensor compression and the serial AER protocol, the final goal of this activity is to acquire asynchronous data from different types of sensors (vision and skin at first) and study the use of temporal correlations for multi-sensory integration.
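As an illustration of how address-event representation (AER) keeps communication bandwidth proportional to activity, here is a minimal sketch of serialising single events into fixed-size words. The 32-bit layout (time delta, x, y, polarity fields) and the field widths are hypothetical choices for this example and do not reflect the actual AER format used on the iCub.

```python
import struct

def pack_event(dt_us, x, y, polarity):
    """Pack one event into a 32-bit word: 14-bit time delta (us),
    9-bit x, 8-bit y, 1-bit polarity. Hypothetical layout."""
    assert dt_us < (1 << 14) and x < (1 << 9) and y < (1 << 8)
    word = (dt_us << 18) | (x << 9) | (y << 1) | (polarity & 1)
    return struct.pack("<I", word)

def unpack_event(buf):
    """Recover (dt_us, x, y, polarity) from one packed word."""
    (word,) = struct.unpack("<I", buf)
    return (word >> 18) & 0x3FFF, (word >> 9) & 0x1FF, (word >> 1) & 0xFF, word & 1

# Only pixels that change produce traffic: a static scene costs ~0 bandwidth.
payload = pack_event(dt_us=120, x=304, y=17, polarity=1)
print(unpack_event(payload))  # -> (120, 304, 17, 1)
```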
The development of ED sensing and the related infrastructure for its integration on the iCub is instrumental to the development of an autonomous robot that exploits efficient sensory compression, enabling fast, low-cost acquisition, storage and computation. Our results show that the temporal signature of events from vision sensors adds information about the visual input, and that information about visual stimuli is maximised when it is encoded with a temporal resolution of a few milliseconds; this temporal resolution is preserved in higher hierarchical computational layers, improving the separability between objects. The core idea of research in this domain is to exploit this additional temporal information, together with the high temporal resolution and low data rate, to develop methods that process moving stimuli in real time. Coupled with precise static spatial information from traditional frame-based cameras, this will greatly enhance computer vision for robots that must interact with objects and people in real time, adapting to sudden changes, failures and uncertainties.
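A minimal sketch of the kind of temporal encoding this result points to: binning an event stream at millisecond resolution preserves the temporal signature that separates stimuli with identical event counts but different dynamics. The 2 ms bin width and 50 ms window are illustrative parameters, not those of our experiments.

```python
import numpy as np

def temporal_signature(event_times_us, window_us=50_000, bin_us=2_000):
    """Histogram an event stream into fixed-width temporal bins; a bin width
    of a few milliseconds keeps the fine timing structure of the stream."""
    n_bins = window_us // bin_us
    idx = np.clip(np.asarray(event_times_us, dtype=np.int64) // bin_us,
                  0, n_bins - 1)
    return np.bincount(idx, minlength=n_bins)

# Two stimuli with identical event counts but different dynamics: fine
# binning separates them, while a single coarse bin would not.
burst = np.random.randint(10_000, 14_000, size=200)   # tight ~4 ms burst
spread = np.random.randint(0, 50_000, size=200)       # events spread evenly
print(temporal_signature(burst).max(), temporal_signature(spread).max())
```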
Motion segmentation and perception
Robust speech perception
The fine temporal dynamics of ED vision can be exploited to implement a speech recognition system based on speech-production-related information (such as lip movement, opening, closure, shape, etc.), improving models of temporal dynamics in speech and compensating for poor acoustic information in noisy acoustic environments. The temporal features extracted from the ED visual signal will be used for the as-yet-unexplored cross-modal ED speech segmentation that will drive the processing of speech. To increase robustness to acoustic noise and atypical speech, acoustic and visual features will be combined to recover phonetic gestures of the inner vocal tract (articulatory features). Visual, acoustic and (recovered) articulatory features will form the observation domain of a novel speech recognition system for the robust recognition of key phrases.
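As a rough illustration of how ED visual dynamics could drive speech segmentation, the sketch below marks time frames as speech when the rate of visual events exceeds a threshold, on the assumption that articulation produces bursts of visual change around the mouth while silence produces almost none. The frame length, the threshold rule and the assumption that events are already restricted to the mouth region are all hypothetical.

```python
import numpy as np

def segment_speech_from_lip_events(event_times_us, frame_us=10_000, k=3.0):
    """Mark 10 ms frames as speech when the rate of visual events (assumed
    to come from the mouth region) exceeds k times the median frame rate."""
    times = np.asarray(event_times_us, dtype=np.int64)
    n_frames = int(times.max()) // frame_us + 1
    rate = np.bincount(times // frame_us, minlength=n_frames)
    threshold = k * max(np.median(rate), 1.0)
    return rate > threshold   # boolean speech mask, one entry per frame

# A burst of lip-motion events in frame 1, one stray event much later.
mask = segment_speech_from_lip_events([12_000, 13_000, 13_500, 14_000, 90_000])
print(mask.nonzero()[0])  # -> [1]
```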
The lab offers the unique iCub platform, equipped with event-driven sensors for vision (coupled with high-resolution frame-based cameras) and touch, and with neuromorphic platforms for the implementation of spiking neural networks (such as SpiNNaker and ROLLS).
NATIONAL AND INTERNATIONAL COLLABORATIONS
- Vision and Natural Computation - Vision Institute - UPMC
- Neuromorphic Cognitive Systems - Institute of Neuroinformatics - UZH|ETHZ
- Neuromorphic Behaving Systems - CITEC - Univ. Bielefeld
- Robotics and Perception Group - Institute of Neuroinformatics - UZH|ETHZ
- Advanced Processor Technologies - Univ. Manchester
- Cosmic Lab - DITEN - Univ. of Genova
- Instituto de Microelectrónica de Sevilla (IMSE-CNM)
- Neural Computation - CNCS@IIT
- MiNES group - Dipartimento di Elettronica e Telecomunicazioni - PoliTo
- Electronics Design Lab
- Humanoid Sensing and Perception