The aim of Event-Driven Perception (EDP) is to induce a paradigm shift in robotics. Our research builds on the emerging concept of Event-Driven (ED) sensing and processing, in which robots acquire, transmit and process information only when needed. This optimises the use of resources and leads to real-time, low-cost operation.
Development and incremental improvement of diverse types of ED sensor
We investigate two main sensing modes, touch and vision, with the long-term goal of progressively substituting most of the sensors of the iCub with their ED counterparts. In the visual domain, we work on the improvement of pixel functionality, noise resilience and size, and on the development of data serialisation, a crucial step towards the integration of higher-resolution sensors on the robot. We proposed a novel and more robust circuit for change detection in the visual signal, designed to tackle one of the major drawbacks of change detection: it filters high-frequency noise without low-pass limiting the response to large and fast transients. The sparseness of tactile input over space and time calls for ED encoding, where the sensors are not continuously sampled but instead wake up on stimulation. The iCub is currently equipped with capacitive sensors; at the same time, different groups within IIT are developing new materials and technologies for tactile transducers. This line of research aims to complement such developments with neuromorphic ED readout circuits for tactile sensing, based on POSFET devices.
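The principle of change detection described above can be illustrated with a minimal software sketch. It is not the circuit proposed here: it is a plain threshold-crossing model, and the function name `change_events`, the threshold value and the sample data are all assumptions made for illustration. An event is emitted only when the logarithm of the input moves by a fixed threshold away from the level stored at the last event, so small fluctuations produce no output at all.

```python
import math


def change_events(samples, threshold=0.15):
    """Return (sample_index, polarity) events for a list of positive samples.

    Hypothetical model: an event fires each time the log of the input has
    moved by `threshold` from the reference level stored at the last event.
    """
    events = []
    reference = math.log(samples[0])
    for i, value in enumerate(samples[1:], start=1):
        level = math.log(value)
        # A large, fast transient can cross several thresholds at once:
        # emit one event per crossing so its amplitude is not lost.
        while abs(level - reference) >= threshold:
            polarity = 1 if level > reference else -1
            reference += polarity * threshold
            events.append((i, polarity))
    return events


# Illustrative data: 1% fluctuations, then a large step up and a step down.
signal = [100, 101, 99, 100, 150, 150, 150, 90]
for index, polarity in change_events(signal):
    print(index, "ON" if polarity > 0 else "OFF")
```

In this sketch the small fluctuations at the start of the signal generate no events, while each large step generates a short burst. The same scheme applies to a tactile transducer that wakes up only on stimulation.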
Integration of ED sensors and computational platforms on the neuromorphic iCub
The iCub is being progressively updated to integrate ED technology. A modular infrastructure based on FPGA technology, serialisation and YARP middleware supports the seamless integration on the robot of different ED sensors, neuromorphic computational platforms (SpiNNaker and DYNAP) and software modules for ED sensory processing. Amongst the latest developments, we implemented a new vision system integrating upgraded ED and frame-based sensors. The ED sensors, with low spatial resolution, a large field of view and motion sensitivity, are coupled with frame-based sensors that have low temporal but high spatial resolution and a small field of view; this arrangement parallels the organisation of primate foveated vision. The coarse, large-field-of-view periphery can be used to detect salient regions in the scene, which guide sequential saccades that bring the region of interest into the high-acuity fovea for detailed stimulus processing. To explore ED tactile sensing, we are working on the emulation of ED encoding using the capacitive sensors currently integrated on the iCub. Besides improving communication bandwidth through sensor compression and the use of the serial AER protocol, the final goal of this activity is to acquire asynchronous data from different types of sensors (vision and skin at first) and to study the use of temporal correlations for multi-sensory integration.
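As a rough illustration of how a coarse periphery could guide a saccade, the sketch below counts events in a coarse grid and returns the centre of the most active cell as a candidate gaze target. Using the raw event count as a stand-in for saliency is an assumption made for brevity, and the function name `most_active_cell`, the cell size and the `(x, y)` event format are hypothetical, not part of the iCub software.

```python
from collections import Counter


def most_active_cell(events, cell_size=16):
    """Return the pixel centre (x, y) of the grid cell with the most events.

    `events` is an iterable of (x, y) pixel coordinates (assumed format).
    Returns None when there are no events.
    """
    counts = Counter((x // cell_size, y // cell_size) for x, y in events)
    if not counts:
        return None
    (cell_x, cell_y), _ = counts.most_common(1)[0]
    half = cell_size // 2
    return (cell_x * cell_size + half, cell_y * cell_size + half)


# Illustrative data: three events cluster around (100, 50), two are isolated.
periphery_events = [(3, 4), (100, 50), (101, 52), (98, 49), (200, 10)]
print(most_active_cell(periphery_events))
```

The returned coordinate would then be handed to gaze control so that the frame-based, high-acuity sensor can inspect that region in detail.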
Development of ED sensory processing - the use of time for computing
The development of ED sensing and of the related infrastructure for its integration on the iCub is instrumental to the development of an autonomous robot that exploits efficient sensory compression, enabling fast and low-cost acquisition, storage and computation. Our results show that the temporal signature of events from vision sensors adds information about the visual input, and that information about the visual stimuli is maximised when it is encoded with a temporal resolution of a few milliseconds; this temporal resolution is preserved in higher hierarchical computational layers, improving the separability between objects. The core idea of research in this domain is to exploit this additional temporal information, together with the high temporal resolution and low data rate, to develop methods for processing moving stimuli in real time. Coupled with the precise static spatial information from traditional frame-based cameras, this will greatly enhance computer vision for robots that have to interact with objects and people in real time, adapting to sudden changes, failures and uncertainties.
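The effect of temporal resolution can be made concrete with a small sketch. The two event streams below are invented for illustration and the helper `bin_events` is hypothetical; timestamps are assumed to be in microseconds. Both streams contain the same number of events, so a single coarse bin cannot tell them apart, whereas bins of a few milliseconds expose their different temporal signatures.

```python
def bin_events(timestamps_us, bin_us, duration_us):
    """Histogram event timestamps (microseconds) into bins of width bin_us.

    All timestamps are assumed to lie in the range [0, duration_us).
    """
    n_bins = -(-duration_us // bin_us)  # ceiling division
    counts = [0] * n_bins
    for t in timestamps_us:
        counts[t // bin_us] += 1
    return counts


# Two illustrative streams with the same event count over 8 ms.
stream_a = [1000, 3000, 5000, 7000]  # evenly spaced events
stream_b = [1000, 1500, 6500, 7000]  # two short bursts

for bin_us in (8000, 2000):
    print(bin_us, bin_events(stream_a, bin_us, 8000), bin_events(stream_b, bin_us, 8000))
```

With one 8 ms bin the two histograms are identical, while with 2 ms bins they differ. This is the sense in which the timing of events carries information beyond their rate.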
Motion segmentation and perception
ED sensing allows the observation of the full trajectory of a moving object. This capability can be exploited to improve the behaviour of robots during manipulation and grasping, as well as during interaction with people and objects in collaborative tasks. ED sensing inherently segments moving objects and provides spatio-temporal information in the sensory stream. In a robotic scenario, however, where the robot itself moves in a cluttered environment, a large number of events arise from the edges in the scene. We are developing methods to robustly select a salient target (using stimulus-driven models of selective attention) and track it with probabilistic filtering in the event-space, as well as methods to compute the motion of objects and discount events due to ego-motion.
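The idea of following a target event by event while ignoring clutter can be sketched as follows. This is a deliberately simple stand-in, a gated exponential average, and not the probabilistic filter mentioned above; the function name `track`, the gate radius, the update gain and the event data are all assumptions made for illustration.

```python
import math


def track(events, start, gate=10.0, alpha=0.5):
    """Update a 2-D position estimate one event at a time.

    Events farther than `gate` pixels from the current estimate are treated
    as clutter (for example, background edges) and ignored; accepted events
    pull the estimate towards them by a fraction `alpha`.
    """
    x, y = start
    for event_x, event_y in events:
        if math.hypot(event_x - x, event_y - y) <= gate:
            x += alpha * (event_x - x)
            y += alpha * (event_y - y)
    return x, y


# Illustrative stream: a target drifting to the right, mixed with clutter.
stream = [(52, 50), (200, 200), (54, 50), (5, 5), (56, 50)]
print(track(stream, start=(50.0, 50.0)))
```

Because the estimate is updated on every accepted event rather than once per frame, it follows the target along its whole trajectory. A probabilistic filter would replace the hard gate with a likelihood and maintain an explicit uncertainty.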
Robust speech perception
The fine temporal dynamics of ED vision can be exploited to implement a speech recognition system based on information related to speech production (such as the movement, opening, closure and shape of the lips), in order to improve models of temporal dynamics in speech and compensate for poor acoustic information in noisy environments. The temporal features extracted from the ED visual signal will be used for the as yet unexplored cross-modal ED speech segmentation that will drive the processing of speech. To increase the robustness to acoustic noise and atypical speech, acoustic and visual features will be combined to recover phonetic gestures of the inner vocal tract (articulatory features). Visual, acoustic and (recovered) articulatory features will be the observation domain of a novel speech recognition system for the robust recognition of key phrases.