As early as the sixties, Alvin Liberman said:
A result in all cases is that there is not, first, a cognitive representation of the proximal pattern that is modality-general, followed by a translation to a particular distal property; rather, perception of the distal property is immediate, which is to say that the module has done all the hard work.
This elegant idea was, however, strongly debated at the time, mostly because it was difficult to test: validation through implementation on a computer system was impossible, and only recently has the theory gained support from experimental evidence. We are now reviving this idea in machine learning by analyzing data from human speech and its motor counterpart (tongue movement, lip position, activation of the vocal folds). We expect to show that the use of motor information during training can improve the recognition of speech in difficult and noisy environments (e.g. in the presence of coarticulation).
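One way to picture "motor information during training" is sketched below. This is only a toy illustration under stated assumptions, not the method used in the project: the data are synthetic, the linear audio-to-motor map, the nearest-centroid classifier, and every function name (`make_toy_data`, `fit_audio_to_motor`, and so on) are hypothetical choices made for the example. Motor data are used only to fit the map during training; at test time the classifier sees audio alone and reconstructs the motor features from it.

```python
import numpy as np

def make_toy_data(n, rng):
    """Synthetic stand-in for paired audio/motor recordings (NOT real speech data)."""
    labels = rng.integers(0, 2, size=n)  # two hypothetical phoneme classes
    # Two hypothetical articulator coordinates, well separated by class.
    motor = 2.0 * labels[:, None] + rng.normal(0.0, 0.3, size=(n, 2))
    # Audio features: a noisy linear mixture of the motor state (an assumption).
    mixing = np.array([[1.0, 0.5, -0.5],
                       [0.2, -1.0, 0.7]])
    audio = motor @ mixing + rng.normal(0.0, 1.0, size=(n, 3))
    return audio, motor, labels

def add_bias(x):
    """Append a constant column so the linear map can learn an offset."""
    return np.hstack([x, np.ones((x.shape[0], 1))])

def fit_audio_to_motor(audio, motor, ridge=1e-3):
    """Ridge-regularised least-squares map from audio features to motor features."""
    x = add_bias(audio)
    gram = x.T @ x + ridge * np.eye(x.shape[1])
    return np.linalg.solve(gram, x.T @ motor)

def predict_motor(audio, weights):
    """Reconstruct motor features from audio alone."""
    return add_bias(audio) @ weights

def fit_centroids(features, labels):
    """Nearest-centroid classifier: one mean vector per class."""
    classes = np.unique(labels)
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, centroids

def classify(features, classes, centroids):
    """Assign each sample to the class with the closest centroid."""
    dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]

def accuracy(predicted, labels):
    return float(np.mean(predicted == labels))

rng = np.random.default_rng(0)
train_audio, train_motor, train_labels = make_toy_data(400, rng)
test_audio, _, test_labels = make_toy_data(200, rng)  # motor data unused at test time

# Training: motor recordings are used only to fit the audio-to-motor map.
W = fit_audio_to_motor(train_audio, train_motor)

# Baseline: classify from audio features only.
classes, cents = fit_centroids(train_audio, train_labels)
acc_audio = accuracy(classify(test_audio, classes, cents), test_labels)

# Audio plus motor features reconstructed from audio.
train_combined = np.hstack([train_audio, predict_motor(train_audio, W)])
test_combined = np.hstack([test_audio, predict_motor(test_audio, W)])
classes_c, cents_c = fit_centroids(train_combined, train_labels)
acc_combined = accuracy(classify(test_combined, classes_c, cents_c), test_labels)

print("audio only:", acc_audio)
print("audio + reconstructed motor:", acc_combined)
```

The numbers printed by this sketch say nothing about real speech; they only show how the two pipelines would be compared. Whether reconstructed motor features actually help depends on the data and on the models chosen for the map and the classifier.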
Related project: Contact
A recent paper by D’Ausilio et al. [see reference] shows a double dissociation in a TMS experiment involving the primary motor cortex. This is the first result showing a direct causal connection between the activation of the primary motor cortex and the perception of speech.
Experimental setup for the simultaneous acquisition of speech and motor data. In this case an articulograph, microphones, ultrasound, a camera and a laryngograph were employed.
Improved recognition rates in a computational experiment using audio and motor data for the recognition of various phonemes. Note that performance improves in the coarticulation cases, where pure audio-based recognition is poor.
G. Metta, G. Sandini, L. Natale, L. Craighero, L. Fadiga. Understanding mirror neurons: a bio-robotic approach. In Interaction Studies, Volume 7, Issue 2, pp. 197-232, 2006.
L. Craighero, G. Metta, G. Sandini, L. Fadiga. The Mirror-Neurons System: data and models. In Progress in Brain Research, 164, "From Action to Cognition", C. von Hofsten & K. Rosander, editors. ISBN: 978-0-444-53016-5. Elsevier, 2007.
M. Grimaldi, B. Gili Fivela, F. Sigona, M. Tavella, P. Fitzpatrick, L. Craighero, L. Fadiga, G. Sandini, G. Metta. New Technologies for Simultaneous Acquisition of Speech Articulatory Data: 3D Articulograph, Ultrasound and Electroglottograph. In Proceedings of LangTech, 28-29 February 2008, Rome, Italy.
D’Ausilio et al. The Motor Somatotopy of Speech Perception. Current Biology, 2009.