A Review of Keyword Spotting as an Audio Mining Technique

B.K. Deka, P. Das

Section:Review Paper, Product Type: Journal Paper
Volume-7 , Issue-1 , Page no. 757-769, Jan-2019

CrossRef-DOI: https://doi.org/10.26438/ijcse/v7i1.757769

Online published on Jan 31, 2019

Copyright © B.K. Deka, P. Das . This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: B.K. Deka, P. Das, “A Review of Keyword Spotting as an Audio Mining Technique,” International Journal of Computer Sciences and Engineering, Vol.7, Issue.1, pp.757-769, 2019.

MLA Style Citation: B.K. Deka, P. Das "A Review of Keyword Spotting as an Audio Mining Technique." International Journal of Computer Sciences and Engineering 7.1 (2019): 757-769.

APA Style Citation: B.K. Deka, P. Das, (2019). A Review of Keyword Spotting as an Audio Mining Technique. International Journal of Computer Sciences and Engineering, 7(1), 757-769.

BibTex Style Citation:
@article{Deka_2019,
author = {B.K. Deka and P. Das},
title = {A Review of Keyword Spotting as an Audio Mining Technique},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {1 2019},
volume = {7},
number = {1},
month = {1},
year = {2019},
issn = {2347-2693},
pages = {757--769},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=3580},
doi = {10.26438/ijcse/v7i1.757769},
publisher = {IJCSE, Indore, INDIA}
}

RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v7i1.757769
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=3580
TI  - A Review of Keyword Spotting as an Audio Mining Technique
T2  - International Journal of Computer Sciences and Engineering
AU  - Deka, B.K.
AU  - Das, P.
PY  - 2019
DA  - 2019/01/31
PB  - IJCSE, Indore, INDIA
SP  - 757
EP  - 769
IS  - 1
VL  - 7
SN  - 2347-2693
ER  - 

Abstract

Speech is the most essential and effective means of communication between people. Speech technology is an emerging field, and automatic speech recognition (ASR) has advanced in recent years, giving machines the ability to respond appropriately to spoken language. Keyword spotting (KWS) is an important audio mining technique used to retrieve all occurrences of a given keyword in spoken utterances. It has become an interesting and challenging area as the amount of audio content on the web, over the telephone, and from other sources grows rapidly. KWS can be viewed as a subproblem of automatic speech recognition in which only partial information has to be extracted from speech utterances. It is closely related to the task of speech transcription and offers several advantages for certain applications. The main aim of this study is to understand the various approaches used for keyword spotting in speech, in order to identify the methods that provide better accuracy and performance. Additionally, we briefly examine the keyword spotting framework and the audio mining system in this paper.

Keywords / Index Terms

Audio Mining, Keyword Spotting, Automatic Speech Recognition, Audio Indexing
