Title: The Joint Optimization of Spectro-Temporal Features and Neural Net Classifiers
Authors: Kovács, György; Tóth, László
Issue Date: 2013
Publisher: Springer
Host Document: Text, Speech, and Dialogue, vol: 8082, pages: 552-559
Series Title: Lecture Notes in Computer Science
Conference: International Conference on Text, Speech, and Dialogue (TSD), edition: 16, location: Pilsen, Czech Republic, date: September 1-5, 2013
Abstract: In speech recognition, spectro-temporal feature extraction and the training of the acoustical model are usually performed separately. To improve recognition performance, we present a combined model which allows the training of the feature extraction filters along with a neural net classifier. Besides expecting that this joint training will result in a better recognition performance, we also expect that such a neural net can generate coefficients for spectro-temporal filters and also enhance preexisting ones, such as those obtained with the two-dimensional Discrete Cosine Transform (2D DCT) and Gabor filters. We tested these assumptions on the TIMIT phone recognition task. The results show that while the initialization based on the 2D DCT or Gabor coefficients is better in some cases than with simple random initialization, the joint model in practice always outperforms the standard two-step method. Furthermore, the results can be significantly improved by using a convolutional version of the network.
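The abstract mentions initializing the trainable filter layer with 2D DCT coefficients. As a rough illustration only, the sketch below shows one way such initial weights could be generated and applied as a linear filter layer over a flattened spectro-temporal patch. The patch size, the number of filters, and the function names `dct2_filters` and `apply_filters` are assumptions made for this example and are not taken from the paper. In a joint model, these weights would then be updated by backpropagation together with the classifier rather than kept fixed.

```python
import math

def dct2_filters(n_freq, n_time, max_u, max_v):
    """Return 2D DCT-II basis patches, each flattened to n_freq * n_time values.

    Unnormalized basis functions; a sketch, not the paper's exact filters.
    """
    filters = []
    for u in range(max_u):          # frequency-direction order
        for v in range(max_v):      # time-direction order
            patch = [
                math.cos(math.pi * u * (f + 0.5) / n_freq)
                * math.cos(math.pi * v * (t + 0.5) / n_time)
                for f in range(n_freq)
                for t in range(n_time)
            ]
            filters.append(patch)
    return filters

def apply_filters(filters, patch):
    """Linear filter layer: one dot product per filter."""
    return [sum(w * x for w, x in zip(filt, patch)) for filt in filters]

# Hypothetical sizes: a 9-band x 9-frame patch and the 3 x 3 lowest-order filters.
weights = dct2_filters(n_freq=9, n_time=9, max_u=3, max_v=3)
flat_patch = [1.0] * (9 * 9)        # a constant patch, flattened row by row
features = apply_filters(weights, flat_patch)
```

For a constant input patch, only the zeroth-order filter responds, which is a quick sanity check that the basis is built correctly.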
Description: The final publication is available at:
ISBN: 978-3-642-40584-6
ISSN: 0302-9743
Publication status: published
KU Leuven publication type: IC
Appears in Collections: Non-KU Leuven Association publications