SPEECHDAT

The SPEECHDAT corpus collection for European Portuguese was divided into two phases: the collection of 1000 telephone calls (preparatory MLAP Project SPEECHDAT-M), and the collection of 4000 telephone calls (Language Engineering Project SPEECHDAT-II). The project incorporates databases for all official languages of the EU and some major dialectal variants. The work was done by INESC under a subcontract with Portugal Telecom.

Goal: a realistic corpus for training and assessment on isolated and continuous speech utterances (whole-word or subword approaches), which can be used for developing voice-driven teleservices.

Webpage of the SPEECHDAT-II project with recordings of the Portuguese database