The Centre for Translational Neurophysiology of Speech and Communication (CTNSC), led by Prof. Luciano Fadiga, is looking for 1 post-doc to be involved in the Multi-modal Speech Recognition project.
Job Description: In the last few years, automatic speech recognition (ASR) has achieved outstanding results and almost matches human performance in some recognition tasks (e.g., close-microphone speech recognition) if large training datasets are provided. However, ASR systems still have strong limitations. One of the strongest limitations is the inability to solve the “cocktail party” problem, i.e., ASR systems cannot recognize the speech of a target speaker in a multi-talker environment, unless very strong assumptions (not applicable in real usage scenarios) are made.
The goal of the proposed project is to build a single-channel ASR system that combines audio-visual information and prior knowledge of speech production processes (Badino et al., 2016; Badino, 2016) for distant speech recognition in a multi-talker environment.
In this context, we are looking for a highly motivated candidate who will work on deep learning techniques for articulatory audio-visual speech recognition. The post-doc will mainly develop recurrent neural network-based strategies (see, e.g., Chan et al., 2016) that exploit visual information and prior knowledge of speech production processes to implement attentional mechanisms. These mechanisms will make it possible to track the target speaker’s speech in the presence of concurrent speakers.
The approach will be strongly bio-inspired. The CTNSC has been recording brain signals, through electrocorticography (ECoG), from human patients performing speech recognition tasks during awake neurosurgery. This offers a unique opportunity to study the strategies the human brain adopts when solving the cocktail party problem (see, e.g., Mesgarani & Chang, 2012) and to translate them into an automatic system.
The candidate should have the following skills:
The successful candidate will work in a fully equipped laboratory in the Center for Translational Neurophysiology in Ferrara, with the possibility to collaborate with an interdisciplinary team of engineers, biologists and materials scientists.
Salary will be highly competitive and commensurate with qualification and experience.
Interested applicants should submit a CV, a list of publications, the names of 2 referees and a statement of research interests to both email@example.com and firstname.lastname@example.org, quoting “Postdoctoral position in Multi-modal Speech Recognition - BC: 73026” in the subject line.
Please apply by April 28th, 2017.
Istituto Italiano di Tecnologia (http://www.iit.it) is a private Foundation which promotes Italy's technological development and higher education in science and technology. Research at IIT is carried out in highly innovative scientific fields with state-of-the-art technology.
In order to comply with Italian law (art. 23 of Privacy Law of the Italian Legislative Decree n. 196/03), the candidate is kindly asked to give his/her consent to allow IIT to process his/her personal data. We inform you that the information you provide will be solely used for the purpose of assessing your professional profile to meet the requirements of Istituto Italiano di Tecnologia. Your data will be processed by Istituto Italiano di Tecnologia, with its headquarters in Genoa, Via Morego 30, acting as the Data Holder, using computer and paper-based means, observing the rules on the protection of personal data, including those relating to the security of data. Please also note that, pursuant to art.7 of Legislative Decree 196/2003, you may exercise your rights at any time as a party concerned by contacting the Data Manager.
Istituto Italiano di Tecnologia is an Equal Opportunity Employer that actively seeks diversity in the workforce.