Develop a conversational interface that will allow users to obtain information, stored in a database, over the telephone using speech.
Sponsored by: FCT (POSI/PLP/14319/2001)
Start: January 2004
Duration: 3 years
Team (when the project started)
Project Leader: Nuno Mamede
- Luísa Coheur (completed the Ph.D. in December 2004)
- David Matos (completed the Ph.D. in July 2005)
- Porfírio Filipe (Ph.D)
- Sérgio Paulo (Ph.D)
- Márcio Mourão (completed the M.Sc. in November 2005)
- João Graça (M.Sc)
The primary goal of this project is integration. It is intended as a vehicle for bringing together, for the first time, research contributions from all the members of the recently created Spoken Language Systems Lab (L2F - Laboratório de sistemas de Língua Falada) of INESC ID. Hence, it will integrate teams with very different expertise - speech processing, neural networks and natural language processing - who will join efforts to develop a conversational interface for accessing and retrieving online information.
From an international point of view, the project objectives may not seem very ambitious, given the much more advanced state of the art in language engineering for other languages. For European Portuguese, however, they represent a very significant research effort. Building a prototype of a spoken dialog interface using state-of-the-art core language technologies is therefore the first step towards addressing, in the future, innovative research areas such as multilingual information access and animated multimodal conversational agents.
The main tasks in developing research testbeds for communicative interaction include developing medium and large vocabulary continuous speech recognition modules, speech synthesis modules, natural language understanding and generation modules, and dialog managers.
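As a rough illustration of how the modules listed above could be chained for a single user turn, here is a minimal Python sketch. It is not the project's architecture: the `Turn` record, the stage functions, and their keyword-based logic are all hypothetical stubs, and the "audio" is simply UTF-8 text standing in for a real speech signal.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Turn:
    """State carried through the pipeline for one user turn (hypothetical)."""
    audio_in: bytes
    words: str = ""
    frame: Dict[str, str] = field(default_factory=dict)
    act: str = ""
    reply: str = ""
    audio_out: bytes = b""


Stage = Callable[[Turn], Turn]


def recognize(turn: Turn) -> Turn:
    # Stub recognizer: treats the "audio" bytes as UTF-8 text.
    turn.words = turn.audio_in.decode("utf-8").lower()
    return turn


def understand(turn: Turn) -> Turn:
    # Stub understanding: keyword spotting that fills a single slot.
    if "weather" in turn.words:
        turn.frame["topic"] = "weather"
    return turn


def manage(turn: Turn) -> Turn:
    # Stub dialog manager: inform if the topic is known, else ask again.
    turn.act = "inform" if turn.frame.get("topic") else "clarify"
    return turn


def generate(turn: Turn) -> Turn:
    # Stub generation: one fixed template per dialog act.
    templates = {
        "inform": "Here is the weather forecast.",
        "clarify": "Could you rephrase that?",
    }
    turn.reply = templates[turn.act]
    return turn


def synthesize(turn: Turn) -> Turn:
    # Stub synthesizer: encodes the reply text as bytes.
    turn.audio_out = turn.reply.encode("utf-8")
    return turn


# Recognition -> understanding -> dialog management -> generation -> synthesis.
PIPELINE: List[Stage] = [recognize, understand, manage, generate, synthesize]


def run_turn(audio: bytes) -> Turn:
    turn = Turn(audio_in=audio)
    for stage in PIPELINE:
        turn = stage(turn)
    return turn
```

Keeping every stage behind the same `Turn -> Turn` signature is one possible way to let each module be replaced independently; a real system would likely need richer interfaces, for example word lattices or confidence scores between recognition and understanding.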
In spite of aiming at a general architecture, minimising domain dependency, the system will have to be trained and tested in a particular domain. We shall start by specifying the functionality of a relatively simple demonstrator, based on a mixed-initiative dialog approach in a very limited domain such as, for instance, meteorology. At a later stage, the portability of the approach will be tested in a more complex domain such as travelling and scheduling. The final task of the project will be the evaluation of the two demonstrators.
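To make the mixed-initiative idea concrete for a weather-style domain, the following Python sketch shows a slot-filling loop in which the user may volunteer any information in any order and the system only asks for what is still missing. The slot names, city list, prompts, and function names are illustrative assumptions, not part of the project specification.

```python
from typing import Dict, Optional

# Hypothetical slots and vocabulary for a toy meteorology domain.
REQUIRED_SLOTS = ("city", "day")
PROMPTS = {"city": "For which city?", "day": "For which day?"}
KNOWN_CITIES = ("lisboa", "porto", "faro")
KNOWN_DAYS = ("today", "tomorrow")


def extract_slots(utterance: str) -> Dict[str, str]:
    """Pick out any known city or day mentioned in the utterance."""
    cleaned = utterance.lower().replace("?", " ").replace(",", " ")
    found: Dict[str, str] = {}
    for word in cleaned.split():
        if word in KNOWN_CITIES:
            found["city"] = word
        elif word in KNOWN_DAYS:
            found["day"] = word
    return found


def next_prompt(state: Dict[str, str]) -> Optional[str]:
    """Return the question for the first missing slot, or None if complete."""
    for slot in REQUIRED_SLOTS:
        if slot not in state:
            return PROMPTS[slot]
    return None


def step(state: Dict[str, str], utterance: str) -> str:
    """Update the dialog state with the user's turn and produce a reply."""
    state.update(extract_slots(utterance))
    prompt = next_prompt(state)
    if prompt is not None:
        return prompt  # system takes the initiative for a missing slot
    return f"Looking up the forecast for {state['city']} {state['day']}."
```

For instance, starting from an empty state, `step(state, "Weather in Porto?")` fills only the city and asks for the day, while `step({}, "Tomorrow in Faro")` completes the request in a single turn. Porting to a more complex domain would, under this sketch, amount to changing the slot definitions and the extraction step.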
The main objective of this project is to integrate different language technologies in order to develop a conversational interface that will allow users to obtain information, stored in a database, over the telephone using speech.
In developing two conversational interface demonstrators, this project will contribute to the improvement of core language technologies that have not been integrated before by the working team. By bringing together researchers with different areas of expertise, we expect to be able to integrate, on the one hand, speech recognition and natural language understanding, and on the other hand, speech synthesis with natural language generation. Moreover, we plan to integrate these core language technologies with dialog management and user modelling.
Such a project will also be of primary importance in setting up the research infrastructure for a relatively large number of doctoral (3) and master's (3) theses.
- T1 - Demonstrator Specification
- T2 - System Architecture
- T3 - Speech Recognition
- T4 - Query Understanding
- T5 - Dialogue Representation and Management
- T6 - Language Generation
- T7 - Speech Synthesis
- T8 - Integration and Evaluation