Speech Replay Spoofing Attack Detection System Based on Fusion of Classification Algorithms

  • A.A. Lependin, Altai State University (Barnaul, Russia)
  • Ya.A. Filin, Altai State University (Barnaul, Russia)
  • P.V. Malinin, Altai State University (Barnaul, Russia)
Keywords: automatic speaker verification, voice spoofing, replay attacks, universal background model, i-vector, probabilistic linear discriminant analysis, tree boosting, model fusion


Rapid progress in digital speech recording and processing makes it necessary to take the potential threat of speech replay attacks into account. We propose an ensemble replay attack detection system based on classifier fusion. It uses constant Q cepstral coefficients (CQCC) as speech features and short-time mean normalization to preprocess them. The set of binary classifiers comprises a Bayesian classifier based on multiple Gaussian mixture models, an i-vector based Gaussian probabilistic linear discriminant analysis classifier, and the XGBoost tree boosting algorithm. Scores are fused with the modified logistic regression algorithm from the BOSARIS toolbox. The ASVspoof 2017 corpus serves as the main database for evaluating the anti-spoofing systems. The results show that the proposed system performs substantially better than the baseline Gaussian mixture model classifier. Preprocessing of the cepstral features is crucial to this improvement. High evaluation performance can be obtained with only a few algorithms in the set. The fusion classifier attains an equal error rate (EER) of 12.44%, which is competitive with the best results reported over the last two years.
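The equal error rate quoted above is the operating point at which the false acceptance rate (spoofed trials accepted as genuine) equals the false rejection rate (genuine trials rejected). The authors' scoring code is not shown on this page, so the following is only a minimal sketch of one common way to approximate it. It assumes that higher scores mean "more likely genuine"; the function name `equal_error_rate` and the toy scores are hypothetical. Evaluation toolkits may interpolate between thresholds and therefore report slightly different values.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate the EER by trying every observed score as a threshold.

    Assumption: a higher score means the trial is more likely genuine.
    """
    if not genuine_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")

    thresholds = sorted(set(genuine_scores) | set(spoof_scores))
    best_gap, best_eer = None, None
    for t in thresholds:
        # False acceptance: a spoofed (replayed) trial scores at or above t.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        # False rejection: a genuine trial scores below t.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Toy scores, invented for illustration only (not data from the paper).
genuine = [2.0, 3.0, 4.0, 5.0]
spoof = [0.0, 1.0, 2.5, 3.5]
print(f"EER = {equal_error_rate(genuine, spoof):.2%}")
```

For the toy scores, the two error rates meet at a threshold of 3.0, where one of four spoofed trials is accepted and one of four genuine trials is rejected.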

DOI 10.14258/izvasu(2018)1-19



Author Biographies

A.A. Lependin, Altai State University (Barnaul, Russia)
Candidate of Sciences in Physics and Mathematics, Associate Professor at the Department of Applied Physics, Electronics and Information Security, Altai State University
Ya.A. Filin, Altai State University (Barnaul, Russia)
Master's student at the Department of Applied Physics, Electronics and Information Security, Altai State University
P.V. Malinin, Altai State University (Barnaul, Russia)
Candidate of Technical Sciences, Associate Professor at the Department of Applied Physics, Electronics and Information Security, Altai State University


How to Cite
Lependin, A., Filin, Ya., & Malinin, P. (2018). Speech Replay Spoofing Attack Detection System Based on Fusion of Classification Algorithms. Izvestiya of Altai State University, (1(99)), 107-112. https://doi.org/10.14258/izvasu(2018)1-19
Mathematics and Mechanics