International Conference on Acoustics, Speech and Signal Processing (ICASSP 2012), 37th edition, Kyoto, Japan, 25-30 March 2012
A speech recognition system that automatically learns word models for a small vocabulary from examples of its usage, without using prior linguistic information, can be of great use in cognitive robotics, human-machine interfaces, and assistive devices. In the last of these cases, the user's speech capabilities may also be affected. In this paper, we consider an NMF-based learning framework capable of such learning, and experimentally show that its learning rate crucially depends on how the speech data is represented.
Higher-level units of speech, which hide some of the complex variability of the acoustics, are found to yield faster learning rates.
Driesen J., Van hamme H., "Fast word acquisition in an NMF-based learning framework", Proceedings of the 37th International Conference on Acoustics, Speech and Signal Processing - ICASSP’2012, pp. 5137-5140, March 25-30, 2012, Kyoto, Japan.