Lecturers

Prof. Hermann Ney is a full professor of computer science at RWTH Aachen University, Germany. His main research interests lie in the area of statistical methods for pattern recognition and human language technology and their specific applications to speech recognition, machine translation and handwriting recognition. In particular, he has worked on dynamic programming and discriminative training for speech recognition, on language modelling and on phrase-based approaches to machine translation. His work has resulted in more than 700 conference and journal papers (h-index 80, estimated using Google Scholar). He is a fellow of both IEEE and ISCA. In 2005, he was the recipient of the Technical Achievement Award of the IEEE Signal Processing Society. In 2010, he was awarded a senior DIGITEO chair at LIMSI/CNRS in Paris, France. In 2013, he received the award of honour of the International Association for Machine Translation.

Dr. Xavier Anguera received his Ing. [MS] degree in 2001 from UPC University (Barcelona, Spain), an [MS] in 2001 from the European Masters in Language and Speech, and his Ph.D. in 2006 from UPC University, with a thesis on speaker diarization for multi-microphone meeting recordings. From 2001 to 2003 he worked for Panasonic Speech Technology Lab in Santa Barbara, CA on multi-language text-to-speech. From 2004 to 2006 he was a visiting researcher at the International Computer Science Institute (ICSI) in Berkeley, CA. From 2007 to 2015 he was a research scientist at Telefonica Research in Barcelona. Since 2015 he has been the founder and CEO of Sinkronigo.com, a startup that aims to help people improve their reading and language skills. His research interests cover speech processing (both speaker and content-based) and multimodal multimedia processing. He has published over 80 peer-reviewed papers and has multiple accepted or pending patents. He is an active member of the IEEE and ACM, for which he has served on the organizing and program committees of several multimedia and speech conferences.

Omid Ghahabi received the M.Sc. degree in electrical engineering from Shahid Beheshti University, Tehran, Iran, in 2009. From 2009 to 2011, he was with the speech processing group of the Research Center of Intelligent Signal Processing (RCISP), Tehran, Iran. He is now a Ph.D. candidate at Universitat Politècnica de Catalunya (UPC) - BarcelonaTech, Spain, where he works as a researcher in the speech processing group of the Signal Theory and Communications Department. He is also a member of the Research Center for Language and Speech Technologies and Applications (TALP), Barcelona, Spain. His research interests include speech processing, speaker recognition, and deep learning. He is the author of several journal and conference papers on these topics.

Prof. Juan Ignacio Godino Llorente was born in Madrid, Spain. He received the B.Sc. and M.Sc. degrees in Telecommunications Engineering, and the Ph.D. degree in Computer Science in 1992, 1996 and 2002, respectively, all from Universidad Politécnica de Madrid (UPM), Spain. From 1996 to 2003 he was with the UPM as Associate Professor at the Circuits and Systems Engineering Dept. From 2003 to 2005 he was with the Signal Theory and Communications Dept. at the University of Alcala. In 2005 he rejoined UPM, where he was Head of the Circuits and Systems Engineering Dept. from 2006 to 2010. Since 2011 he has been Full Professor in the Signal Theory and Communications Dept. of the UPM. During the academic term 2003-2004, he was a Visiting Professor at Salford University, Manchester, UK. He has been the Spanish coordinator of COST Action 2103, and the General Chairman of the 3rd Advanced Voice Function Assessment Workshop. He is a member of ISCA and a senior member of the IEEE. During his career, he has led more than 20 research projects funded by national or international public bodies and by industry. He has published more than 45 research papers in journals indexed in the JCR, and more than 50 papers in international peer-reviewed conferences. His research has accumulated more than 1500 citations, with a Hirsch index of 20. His main research interests are in the field of biomedical signal and image processing, with a special focus on voice pathologies.

Dr. Jordi Luque received his Engineering of Telecommunication degree [MS] from the Technical University of Catalonia (UPC) in 2005, with a thesis titled "Complex Networks: The Visibility Graph", and his PhD in 2012 from the same university, with a thesis titled "Speaker Diarization and Tracking in Multiple-sensor Environments". After completing his master's degree he worked until 2010 in the Department of Signal Theory and Communications (TSC), funded by a Spanish FPI grant, where he collaborated in several national and international projects. He mainly worked on the research and development of person recognition techniques based on audio and video modalities and applied in the framework of smart rooms. From November 2009 until March 2010 he visited the L2F INESC Lab in Lisbon, where he worked on speaker and channel variability in the speaker detection task. During those years, he participated in several national and international technology evaluations, such as the RT and SRE evaluations organized by NIST and the Spanish Albayzin evaluations organized by RTTH. From October 2010 until August 2011 he worked at a start-up company from the University of Barcelona (UB), applying machine learning techniques to image biomarkers for disease diagnosis in neonates. At the end of 2011 he worked as an assistant researcher in the Department of Applied Mathematics and Statistics of E.T.S.I.A. at the Technical University of Madrid (UPM), pursuing research on the analysis of speech properties by applying techniques from statistical physics. He currently holds a position as Research Scientist at Telefonica I+D, involving research and technology development mainly related to speech processing, leveraging and applying previous knowledge to different business areas inside a telco company. His interests are related to the field of signal processing and time series analysis applied to speech, statistical machine learning techniques and the physics of complex networks.

Dr. Enric Monte is an Associate Professor in the Signal Theory and Communications department of the Universitat Politècnica de Catalunya (UPC). He received his degree in telecommunication engineering in 1987, his PhD in Digital Signal Processing in 1992, a degree in Philosophy in 2000 and a degree in Mathematics in 2010. His research areas include Digital Signal Processing, Automatic Speech Recognition, Statistical Machine Translation, Neural Networks, and Medical Digital Signal Processing Applications. He has participated in 8 R&D&I projects funded in competitive tenders by public or private bodies and has 20 publications in indexed journals and 65 publications in conferences.

Prof. Jose A. R. Fonollosa received his M.S. and Ph.D. degrees in electrical engineering from the Universitat Politècnica de Catalunya (UPC) in 1986 and 1989, respectively. He has been a member of the Department of Signal Theory and Communications since 1986, currently as Full Professor. From August 1991 to July 1992 he was a visitor at the Signal and Image Processing Institute, University of Southern California. He received the 1992 Marconi Young Scientist Award. In 1999 he co-founded the company VERBIO, specialised in spoken language technology. From 2006 to 2010, he was director of the Center for Language and Speech Technologies and Applications (TALP). He has published more than 150 scientific papers on statistical signal analysis and its applications in communication systems, speech processing and statistical machine translation. During the last 5 years, his research has focused on spoken language technologies, machine learning and optimization, participating in several Spanish (AVIVAVOZ, BUCEADOR, SpeechTech4All) and international projects (FAME, LC-STAR, TC-STAR, FAUST, GE Flight Quest 2). He won first prize in the Industrial Internet GE Flight Quest 2 challenge.

Dr. Heiga Zen received his PhD from the Nagoya Institute of Technology, Nagoya, Japan, in 2006. Before joining Google in 2011, he was an Intern/Co-Op researcher at the IBM T.J. Watson Research Center, Yorktown Heights, NY (2004-2005), and a Research Engineer at Toshiba Research Europe Ltd. Cambridge Research Laboratory, Cambridge, UK (2008-2011). His research interests include statistical speech synthesis and recognition. He was one of the original authors and the first maintainer of the HMM-based speech synthesis system, HTS (http://hts.sp.nitech.ac.jp).

Latest news

Poster Award

Fernando de la Calle, from Universidad Carlos III de Madrid, has been distinguished with the Best Student Poster Award for his work "Robustness of biologically motivated features for DNN-based ASR".
Congratulations!