Robust Speech Processing using Observation Uncertainty and Uncertainty Propagation

Call for Papers for a Special Session at Interspeech 2015

We are pleased to announce a special session at Interspeech 2015 in Dresden focusing on Observation Uncertainty (OU) and Uncertainty Propagation (UP) research in speech processing.

*Deadline March 20*

Observation Uncertainty (OU) and Uncertainty Propagation (UP) techniques for robust speech processing have shown notable success across various sub-domains, such as ASR and speaker recognition. Despite this healthy status, OU/UP research in speech processing is currently fragmented across disciplines and application domains. Consequently, research papers on uncertainty have to compete with other techniques in, for example, the ASR or Speaker Identification tracks, which generally discourages the proposal of novel or bold ideas on the topic. This is particularly relevant in the current environment, in which strong technology transfer and real-data-rich problems force disciplines to adapt more quickly. In particular, although some investigations on how to port OU/UP techniques to Deep Neural Networks (DNNs) exist, this area remains largely unexplored. Given the disruptive character of DNNs and the fact that they are not amenable to classic adaptation techniques, this is a very interesting application area for OU/UP techniques. For these reasons, we have proposed a special session devoted to uncertainty as an opportunity for researchers from different sub-domains of speech processing to share ideas, explore new application domains such as paralinguistics or text-based uncertainties, and consolidate OU/UP research in DNNs.

The session will cover:

  • Methods exploiting observation and parameter uncertainty, as well as uncertainty propagation, for robust speech processing, e.g., for inference, learning, adaptation, or model selection.
  • We especially welcome papers on the topic of deep learning and uncertainty. We are confident that this is the right moment to tackle this issue and already expect work on the topic.
  • Uncertainty in training data: “BIG” data is upon us, but the quality of such “found” data is not the same as that of carefully collected data. What are the best ways to address data uncertainty when the ground truth is unknown?
  • All sub-fields are welcome: the session is not restricted to the ASR and speaker recognition communities, but is also open to work in, e.g., paralinguistics or transcription-level uncertainties (lattices).

We would also like to use this opportunity to encourage reproducible research and the sharing of software tools based on uncertainty techniques within the community. For this purpose, we will provide software for uncertainty-related techniques, such as tools for estimating observation uncertainties and applying them to ASR, through the wiki page of the Robust Speech Processing Special Interest Group of ISCA (RoSP-SIG).

https://wiki.inria.fr/rosp/Software

All updates on the session progress and available tools will be reported on this website.

Update 2015-02-10: Tools for uncertainty propagation and observation uncertainty using HTK/Kaldi have been released (see the wiki link above).

Do not hesitate to contact us if you have any questions:

Ramón Fernandez Astudillo (ramon.astudillo@inesc-id.pt)

Shinji Watanabe (watanabe@merl.com)

Ahmed Hussen AbdelAziz (Ahmed.HussenAbdelAziz@rub.de)

Dorothea Kolossa (dorothea.kolossa@rub.de)