The BD-PUBLICO database (Base de Dados em Português eUropeu, vocaBulário Largo, Independente do orador e fala COntínua) was collected by INESC within the framework of a European project (SPRACH) and a national project (PRAXIS XXI Program), in collaboration with Instituto Superior Técnico (IST) and the PÚBLICO newspaper. The corpus was designed to support the development of large-vocabulary, speaker-independent continuous speech recognition systems.