Title: Non-Negative Factor Analysis of Gaussian Mixture Model Weight Adaptation for Language and Dialect Recognition
Authors: Bahari, Mohamad Hasan ×
Dehak, Najim
Van hamme, Hugo
Burget, Lukas
Ali, Ahmed M.
Glass, Jim #
Issue Date: Jul-2014
Publisher: IEEE
Series Title: IEEE Transactions on Audio, Speech, and Language Processing, vol. 22, issue 7, pages 1117-1129
Abstract: Recent studies show that Gaussian mixture model (GMM) weights carry less information than GMM means for language and dialect recognition, but that this information is complementary. However, state-of-the-art language recognition systems usually do not use it. In this research, a non-negative factor analysis (NFA) approach is developed for GMM weight decomposition and adaptation. This modeling, which is conceptually simple and computationally inexpensive, suggests a new low-dimensional utterance representation method using a factor analysis similar to that of the i-vector framework. The obtained subspace vectors are then applied in conjunction with i-vectors to the language/dialect recognition problem. The suggested approach is evaluated on the NIST 2011 and RATS language recognition evaluation (LRE) corpora and on the QCRI Arabic dialect recognition evaluation (DRE) corpus. The assessment results show that the proposed adaptation method yields more accurate recognition results than three conventional weight adaptation approaches, namely maximum likelihood re-estimation, non-negative matrix factorization, and a subspace multinomial model. Experimental results also show that the intermediate-level fusion of i-vectors and NFA subspace vectors improves the performance of the state-of-the-art i-vector framework, especially for short utterances.
Description: Bahari M.H., Dehak N., Van hamme H., Burget L., Ali A.M., Glass J., "Non-negative factor analysis of Gaussian mixture model weight adaptation for language and dialect recognition", IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 22, no. 7, pp. 1117-1129, July 2014.
ISSN: 1558-7916
Publication status: published
KU Leuven publication type: IT
Appears in Collections: ESAT - PSI, Processing Speech and Images
× corresponding author
# (joint) last author

Files in This Item:
File    Description    Status    Size    Format
3695_final.pdf    Published    1763Kb    Adobe PDF
NFA_11_onecolumn-libre.pdf    Published    670Kb    Adobe PDF

These files are only available to some KU Leuven Association staff members


