Open Access Article

Speaker Recognition System Using Deep Learning with Convolutional Neural Network

Sandeep Kumar, Samridhi Dev

Section: Research Paper, Product Type: Journal Paper
Volume-8, Issue-10, Page no. 60-64, Oct-2020

CrossRef-DOI:   https://doi.org/10.26438/ijcse/v8i10.6064

Online published on Oct 31, 2020

Copyright © Sandeep Kumar, Samridhi Dev. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: Sandeep Kumar, Samridhi Dev, “Speaker Recognition System Using Deep Learning with Convolutional Neural Network,” International Journal of Computer Sciences and Engineering, Vol.8, Issue.10, pp.60-64, 2020.

MLA Style Citation: Sandeep Kumar, Samridhi Dev. "Speaker Recognition System Using Deep Learning with Convolutional Neural Network." International Journal of Computer Sciences and Engineering 8.10 (2020): 60-64.

APA Style Citation: Sandeep Kumar, Samridhi Dev (2020). Speaker Recognition System Using Deep Learning with Convolutional Neural Network. International Journal of Computer Sciences and Engineering, 8(10), 60-64.

BibTex Style Citation:
@article{Kumar_2020,
author = {Sandeep Kumar and Samridhi Dev},
title = {Speaker Recognition System Using Deep Learning with Convolutional Neural Network},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {10 2020},
volume = {8},
number = {10},
month = {10},
year = {2020},
issn = {2347-2693},
pages = {60--64},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=5231},
doi = {10.26438/ijcse/v8i10.6064},
publisher = {IJCSE, Indore, INDIA}
}

RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v8i10.6064
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=5231
TI  - Speaker Recognition System Using Deep Learning with Convolutional Neural Network
T2  - International Journal of Computer Sciences and Engineering
AU  - Sandeep Kumar
AU  - Samridhi Dev
PY  - 2020
DA  - 2020/10/31
PB  - IJCSE, Indore, INDIA
SP  - 60
EP  - 64
IS  - 10
VL  - 8
SN  - 2347-2693
ER  - 

Abstract

Recognizing a person by voice is easy for human beings: as people interact with a particular person, the brain learns that voice and becomes proficient enough to recognize it the next time it is heard. The proposed system is designed and implemented on this principle, using a Convolutional Neural Network (CNN). A total of 110 voice samples were collected from 11 different participants (speakers), and each voice signal was converted into an image of its spectrogram. 90% of the data was used for training and the remaining 10% for testing. The system was implemented in the R programming language using RStudio and achieved 82% accuracy. The proposed system is simple and cost-effective.
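The two data-preparation steps named in the abstract, turning a voice signal into a spectrogram and holding out 10% of the samples for testing, can be sketched as follows. This is a minimal illustration in Python, not the authors' R implementation: the function names `spectrogram` and `stratified_split`, the frame length, the hop size, the Hann window, the synthetic 1 kHz test tone, and the assumption of 10 samples per speaker with a per-speaker (stratified) split are all hypothetical choices made for the example and are not stated in the source.

```python
import cmath
import math
import random


def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram via a Hann-windowed short-time DFT.

    Returns a list of frames; each frame holds frame_len // 2 + 1 magnitudes.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def stratified_split(labels, test_fraction=0.1, seed=0):
    """Hold out test_fraction of each speaker's samples (at least one)."""
    rng = random.Random(seed)
    by_speaker = {}
    for idx, label in enumerate(labels):
        by_speaker.setdefault(label, []).append(idx)
    train, test = [], []
    for indices in by_speaker.values():
        rng.shuffle(indices)
        n_test = max(1, round(len(indices) * test_fraction))
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Synthetic stand-in for one recording (assumption): 1 kHz tone at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = spectrogram(tone)
# Frequency bin with the largest magnitude in the first frame.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])

# Assumed layout: 11 speakers x 10 samples each = 110 labels.
labels = [speaker for speaker in range(11) for _ in range(10)]
train_idx, test_idx = stratified_split(labels)
```

Under these assumptions the split yields 99 training and 11 test samples, one test sample per speaker. With 11 held-out samples, the reported 82% accuracy would be consistent with 9 of 11 correct (9/11 ≈ 0.818), although the abstract does not report the count. In a full pipeline each spectrogram would be rendered as an image and passed to the CNN.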

Keywords / Index Terms

Convolutional neural network, speaker recognition, Keras, voice signal spectrogram, tuneR
