Performance of Speaker Verification Using CSM and TM
Keywords: Autoassociative neural network, Relative Spectral Transform-Perceptual Linear Prediction (RASTA-PLP), Close Speaking Microphone, Throat Microphone
In this paper, we present the performance of a speaker verification system based on features computed from speech recorded with a Close Speaking Microphone (CSM) and a Throat Microphone (TM) in clean and noisy environments. Noise is one of the most difficult problems in speaker verification, and background noise degrades the performance of verification using the CSM. To overcome this issue, a TM is used: its transducer is held at the throat, which yields a clean signal that is unaffected by background noise. Acoustic features are computed by means of Relative Spectral Transform-Perceptual Linear Prediction (RASTA-PLP). The autoassociative neural network (AANN) technique is used to model these features and to verify speakers in clean and noisy environments. A new method is presented for verifying speakers in clean speech using the combined CSM and TM. The verification performance of the proposed combined system is significantly better than that of the system using the CSM alone, owing to the complementary nature of the CSM and TM. Evaluating the false acceptance rate (FAR) and false rejection rate (FRR) gives an equal error rate (EER) of about 1.0% for the combined devices (CSM+TM), and an overall verification rate of 99% is obtained for clean speech.
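The relationship between FAR, FRR and EER reported above can be illustrated with a minimal sketch. It assumes each verification trial yields a single score, that higher scores indicate a better match to the claimed speaker, and that a trial is accepted when its score reaches the threshold. The function names and the scores in the usage note are hypothetical and are not taken from the paper's experiments.

```python
import numpy as np


def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) at one threshold; a trial is accepted when score >= threshold."""
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    far = float(np.mean(impostor >= threshold))  # impostor trials wrongly accepted
    frr = float(np.mean(genuine < threshold))    # genuine trials wrongly rejected
    return far, frr


def equal_error_rate(genuine, impostor):
    """Sweep every observed score as a threshold and return (EER, threshold)
    at the point where FAR and FRR are closest to each other."""
    thresholds = np.unique(np.concatenate([genuine, impostor]))
    best = None
    for t in thresholds:
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of FAR and FRR as the EER estimate.
            best = (gap, (far + frr) / 2.0, float(t))
    return best[1], best[2]
```

For example, with illustrative genuine scores [0.9, 0.8, 0.7, 0.4] and impostor scores [0.6, 0.3, 0.2, 0.1], calling equal_error_rate returns the operating point at which the two error rates coincide. On a finite trial set the rates may never be exactly equal, which is why the sketch reports the midpoint at the closest threshold.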