IJMLC 2020 Vol.10(5): 630-636 ISSN: 2010-3700
DOI: 10.18178/ijmlc.2020.10.5.984

Improved Language-Independent Speaker Identification in a Non-contemporaneous Setup

Smarajit Bose, Amita Pal, Anish Mukherjee, and Debasmita Das

Abstract—One of the most effective approaches available in the literature for Automatic Speaker Identification is based on Gaussian Mixture Models (GMMs) with Mel Frequency Cepstral Coefficients (MFCCs) as features (Reynolds, 1995). The use of GMMs for modeling speaker identity is motivated by the interpretation that the Gaussian components represent some general speaker-dependent spectral shapes, and by the capability of mixtures to model arbitrary densities. In an earlier work, the authors presented and demonstrated empirically (using the benchmark speech corpus NTIMIT) that combining two different well-known sets of features (MFCCs and Perceptual Linear Predictive Coefficients (PLPCs)) and using ensemble classifiers in conjunction with the Principal Component Transformation (PCT) and some robust statistical estimation techniques significantly enhances the performance of the baseline MFCC-GMM speaker recognition system. In this work, the authors demonstrate that this approach, besides being statistically robust, is also significantly more robust than the baseline system to language mismatch in a non-contemporaneous setup. This has been done with the help of ISIS/NISIS, a bilingual dual-channel speech corpus with multi-session speech recordings.
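As a rough illustration of the baseline decision rule the abstract refers to, the sketch below scores a sequence of feature vectors under one diagonal-covariance GMM per speaker and picks the speaker with the highest average log-likelihood. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the two-dimensional toy "features", and all model parameters are hypothetical, and MFCC/PLPC extraction, EM training, the PCT, ensemble combination, and robust estimation are all omitted.

```python
import math


def diag_gmm_log_likelihood(frame, weights, means, variances):
    """Log density of one feature vector under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        # Log of the Gaussian normalizing constant for a diagonal covariance.
        log_norm = -0.5 * sum(math.log(2.0 * math.pi * v) for v in var)
        # Log of the exponential term (scaled squared distance to the mean).
        log_exp = -0.5 * sum((x - m) ** 2 / v for x, m, v in zip(frame, mu, var))
        component_logs.append(math.log(w) + log_norm + log_exp)
    # Log-sum-exp over components for numerical stability.
    top = max(component_logs)
    return top + math.log(sum(math.exp(c - top) for c in component_logs))


def identify_speaker(frames, speaker_models):
    """Return the speaker whose GMM gives the highest average frame log-likelihood."""
    scores = {
        name: sum(diag_gmm_log_likelihood(f, *params) for f in frames) / len(frames)
        for name, params in speaker_models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy models: (weights, means, variances) per speaker.
speaker_models = {
    "speaker_a": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "speaker_b": ([0.5, 0.5], [[4.0, 4.0], [5.0, 5.0]], [[1.0, 1.0], [1.0, 1.0]]),
}
# Hypothetical test frames standing in for per-frame feature vectors.
test_frames = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]

best, scores = identify_speaker(test_frames, speaker_models)
print(best)
```

In a real system each frame would be a vector of cepstral coefficients and each model would be trained on enrollment speech; the toy values here only make the scoring rule concrete.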

Index Terms—Mel frequency cepstral coefficients, perceptual linear predictive coefficients, Gaussian mixture models, ensemble classifiers, classification accuracy, trimmed mean.

Smarajit Bose and Amita Pal are with the Interdisciplinary Statistical Research Unit (ISRU), Applied Statistics Division, Indian Statistical Institute, Kolkata, India (e-mail: {smarajit,pamita}@isical.ac.in).
Anish Mukherjee is with the Department of Statistics, University of Missouri, Columbia, MO, USA (e-mail: anishmk9@gmail.com).
Debasmita Das is with the Department of Statistics, University of Connecticut, Storrs, CT, USA (e-mail: debasmita88@yahoo.com).


Cite: Smarajit Bose, Amita Pal, Anish Mukherjee, and Debasmita Das, "Improved Language-Independent Speaker Identification in a Non-contemporaneous Setup," International Journal of Machine Learning and Computing, vol. 10, no. 5, pp. 630-636, 2020.

Copyright © 2020 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

 

