The Centre for Translational Neurophysiology, led by Prof. Luciano Fadiga, is looking for one post-doc to join the Biosignal-based Automatic Speech Recognition project.
The biosignals considered will range from the electrical activity of the brain motor cortex (ECoG - Electrocorticography) to kinematics of the vocal tract articulators (EMA - Electromagnetic articulography) and visual cues (facial movements related to speech production).
Job Description: In the last few years, automatic speech recognition (ASR), where words are recognized from the audio signal, has achieved impressive results. A far more challenging task is the recognition of speech from other (non-acoustic) kinds of signals (e.g., signals describing facial movements related to speech production). Such “biosignals” can be used when the audio is not available (e.g., brain signals from locked-in patients) or can be combined with a weak acoustic signal (as, e.g., in audio-visual speech recognition when the signal-to-noise ratio is very low). Biosignal-based ASR must address many more problems than audio-based ASR, ranging from noisier and less discriminative phonetic content to a vastly smaller amount of available training data.
The goal of the proposed project is the automatic recognition of continuous speech from (i) ECoG data recorded from brain areas devoted to speech production and from (ii) other biosignals that carry speech production information, such as, e.g., facial movements recorded with a camera or kinematics of the outer and inner vocal tract recorded with EMA.
In this context, we are looking for a highly motivated candidate who will work on data analysis and machine learning techniques for biosignal-based speech recognition. The post-doc will mainly develop machine learning-based strategies to recognize phonemes (or broader categories of phonemes) in continuous speech. A key aspect of this project will be the exploration of novel techniques for extracting descriptions of fine, phonetically relevant movements of the vocal tract from one or more biosignals. Such descriptions will then serve as input features for more accurate speech recognition. This follows the articulatory ASR approach developed by our group (Badino et al., 2016), which builds on the group's solid neuroscientific background (D’Ausilio et al., 2009).
Skills: We are looking for highly motivated people with inquisitive minds, curious to work with a new and challenging technology that requires rethinking silent speech interfaces, with a potentially high payoff in accuracy and robustness to speaker variability.
We expect candidates to also have the following additional skills:
Teamwork, PhD tutoring, and general lab-related activities will also be part of the role.
The successful candidate will work in a fully equipped laboratory at the Center for Translational Neurophysiology in Ferrara, with the opportunity to collaborate with an interdisciplinary team of engineers, biologists, and materials scientists.
Salary will be highly competitive and commensurate with qualifications and experience. Interested applicants should submit a CV, a list of publications, two reference letters, and a statement of research interests to both email@example.com and firstname.lastname@example.org, quoting “Postdoctoral position in Brain- and biosignal-based speech recognition BC: 73546” in the subject line.
Please apply by March 15, 2018.
Istituto Italiano di Tecnologia (http://www.iit.it) is a private Foundation which promotes Italy's technological development and higher education in science and technology. Research at IIT is carried out in highly innovative scientific fields with state-of-the-art technology.
In order to comply with Italian law (art. 23 of the Privacy Law, Italian Legislative Decree n. 196/03), the candidate is kindly asked to give his/her consent to allow IIT to process his/her personal data. We inform you that the information you provide will be used solely for the purpose of assessing your professional profile to meet the requirements of Istituto Italiano di Tecnologia. Your data will be processed by Istituto Italiano di Tecnologia, with its headquarters in Genoa, Via Morego 30, acting as the Data Holder, using computer and paper-based means, observing the rules on the protection of personal data, including those relating to the security of data. Please also note that, pursuant to art. 7 of Legislative Decree 196/2003, you may exercise your rights at any time as a party concerned by contacting the Data Manager.
Istituto Italiano di Tecnologia is an Equal Opportunity Employer that actively seeks diversity in the workforce.
Denby, B., Schultz, T., Honda, K., Hueber, T., Gilbert, M., Brumberg, J.S. (2010), “Silent speech interfaces”, Speech Communication, vol. 52, pp. 270–287.
Brumberg, J.S., Nieto-Castanon, A., Kennedy, P.R., Guenther, F.H. (2010), “Brain–computer interfaces for speech communication”, Speech Communication, vol. 52, pp. 367–379.
Mesgarani, N., Cheung, C., Johnson, K., Chang, E.F. (2014), “Phonetic Feature Encoding in Human Superior Temporal Gyrus”, Science, vol. 343(6174), pp. 1006–1010.
Badino, L., Canevari, C., Fadiga, L., Metta, G. (2016), “Integrating articulatory data in deep neural network-based acoustic modeling”, Computer Speech and Language, vol. 36, pp. 173–195.
D’Ausilio, A., Pulvermüller, F., Salmas, P., Bufalari, I., Begliomini, C., Fadiga, L. (2009), “The motor somatotopy of speech perception”, Current Biology, vol. 19, pp. 381–385.