This paper reviews the current understanding of acoustic-phonetic issues and the problems that arise when trying to recognize speech from non-native speakers. Conceptually, regional accents are well modeled by systematic shifts in pronunciation. Therefore, simultaneous recognition of multiple regional variants may be performed by using multiple acoustic models in parallel, or by adding pronunciation variants to the dictionary. Recognition of non-native speech is much more difficult because it is influenced by both the native language of the speaker and the non-native target language. It is also characterized by much greater speaker variability, owing to different levels of proficiency. A few language-pair specific transformation rules describing prototypical nativized pronunciations were found to be useful both in general speech recognition and in dedicated applications. However, due to the nature of the errors and the cross-language transformations, non-native speech recognition will remain inherently much harder. Moreover, the trend in speech recognition towards more detailed modeling seems to be counterproductive for the recognition of non-native speech and limits progress in this field. (C) 2001 Elsevier Science B.V. All rights reserved.
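As an illustration of how language-pair specific transformation rules might add nativized pronunciation variants to a dictionary, the following is a minimal sketch. The phone symbols, the substitution rules, and the function names (`nativized_variants`, `expand_lexicon`) are assumptions made for this example and are not taken from the paper.

```python
from itertools import product

# Hypothetical substitution rules for one language pair (illustrative only):
# each target-language phone maps to phones a non-native speaker might use instead.
RULES = {
    "TH": ["S", "T"],
    "DH": ["Z", "D"],
}


def nativized_variants(phones, rules):
    """Return the canonical pronunciation first, followed by every variant
    obtained by optionally replacing each phone that has a rule."""
    # For each position, the canonical phone comes first, then its substitutes.
    options = [[p] + rules.get(p, []) for p in phones]
    return [list(variant) for variant in product(*options)]


def expand_lexicon(lexicon, rules):
    """Map each word to its list of pronunciation variants."""
    return {word: nativized_variants(phones, rules)
            for word, phones in lexicon.items()}


# Toy lexicon with one canonical pronunciation per word (assumed phone set).
lexicon = {
    "think": ["TH", "IH", "NG", "K"],
    "this": ["DH", "IH", "S"],
}

expanded = expand_lexicon(lexicon, RULES)
for word, variants in expanded.items():
    for variant in variants:
        print(word, " ".join(variant))
```

The number of variants grows multiplicatively with the number of rule-affected phones in a word, so a practical system would likely keep the rule set small, in line with the observation that only a few prototypical rules proved useful.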
Van Compernolle D., "Recognizing speech of goats, wolves, sheep and ... non-natives", Speech Communication, vol. 35, no. 1-2, pp. 71-79, August 2001, Elsevier Science B.V.