Open Access Article

Twee: A Novel Text-To-Speech Engine

D. Das¹, H. Hassan², S. Gupta³

Section: Survey Paper, Product Type: Journal Paper
Volume-07, Issue-01, Page no. 67-70, Jan-2019

Online published on Jan 20, 2019

Copyright © D. Das, H. Hassan, S. Gupta. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


How to Cite this Paper


IEEE Style Citation: D. Das, H. Hassan, S. Gupta, “Twee: A Novel Text-To-Speech Engine,” International Journal of Computer Sciences and Engineering, Vol.07, Issue.01, pp.67-70, 2019.

MLA Style Citation: D. Das, H. Hassan, S. Gupta, “Twee: A Novel Text-To-Speech Engine.” International Journal of Computer Sciences and Engineering 07.01 (2019): 67-70.

APA Style Citation: D. Das, H. Hassan, S. Gupta (2019). Twee: A Novel Text-To-Speech Engine. International Journal of Computer Sciences and Engineering, 07(01), 67-70.

BibTex Style Citation:
@article{Das_2019,
author = {D. Das and H. Hassan and S. Gupta},
title = {Twee: A Novel Text-To-Speech Engine},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {1 2019},
volume = {07},
number = {01},
month = {1},
year = {2019},
issn = {2347-2693},
pages = {67-70},
url = {https://www.ijcseonline.org/full_spl_paper_view.php?paper_id=595},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
UR - https://www.ijcseonline.org/full_spl_paper_view.php?paper_id=595
TI - Twee: A Novel Text-To-Speech Engine
T2 - International Journal of Computer Sciences and Engineering
AU - Das, D.
AU - Hassan, H.
AU - Gupta, S.
PY - 2019
DA - 2019/01/20
PB - IJCSE, Indore, INDIA
SP - 67
EP - 70
IS - 01
VL - 07
SN - 2347-2693
ER -


Abstract

With the advancement of technology and the widespread use of smart devices, the horizon of networking and connectivity has broadened dramatically. One prominent area of research in this digital era is the development of Text-to-Speech (TTS) engines, which offer greater interactivity with prevalent smart devices. Various TTS engines are currently available in the market, but they lack the ability to convey the qualities of a human voice; for example, they fail to provide credible indications of the speaker’s sentiment, mood or emotional state. Moreover, no existing TTS engine can replicate human behaviour and mannerisms with high precision and accuracy. This paper proposes a novel Text-to-Speech engine named ‘Twee’ whose pronunciation works in sync with real-world human intelligence. The proposed system is an application of interdisciplinary research in which Natural Language Processing, Artificial Intelligence and Digital Signal Processing are combined to perform sentiment analysis on text through the processing of phonemes. The system works well in both mono-channel and stereo modes and is capable of generating varied effects on a voice depending on the type of communication.
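
The pipeline described in the abstract (sentiment analysis on text feeding phoneme-level voice effects, with mono or stereo output) can be pictured with a small hypothetical sketch. The Python code below is not the authors’ implementation: the toy lexicon, the ProsodyPlan fields and the plan_prosody mapping are illustrative assumptions about how a valence score derived from text might be turned into coarse prosody settings (pitch, rate, volume) and a channel choice before synthesis.

# Hypothetical sketch of a sentiment-aware TTS front end.
# All names (SENTIMENT_LEXICON, ProsodyPlan, plan_prosody) are illustrative
# assumptions, not taken from the Twee paper.

from dataclasses import dataclass

# Toy valence lexicon; a real system would use a trained sentiment model.
SENTIMENT_LEXICON = {
    "happy": 1.0, "great": 0.8, "love": 0.9,
    "sad": -0.9, "terrible": -0.8, "angry": -0.7,
}

@dataclass
class ProsodyPlan:
    pitch_shift: float   # semitones relative to the neutral voice
    rate: float          # speaking-rate multiplier
    volume: float        # linear gain
    channels: int        # 1 = mono, 2 = stereo rendering

def sentence_valence(text: str) -> float:
    """Average word valence in [-1, 1]; 0.0 when no lexicon word is present."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    hits = [SENTIMENT_LEXICON[w] for w in words if w in SENTIMENT_LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

def plan_prosody(text: str, stereo: bool = False) -> ProsodyPlan:
    """Map valence to coarse prosody: positive -> higher pitch, faster, louder."""
    v = sentence_valence(text)
    return ProsodyPlan(
        pitch_shift=2.0 * v,           # +/- 2 semitones at the extremes
        rate=1.0 + 0.15 * v,           # up to +/- 15% speaking rate
        volume=1.0 + 0.2 * max(v, 0),  # slight emphasis for positive text
        channels=2 if stereo else 1,
    )

if __name__ == "__main__":
    for line in ["I love this great sunny day", "This is a sad, terrible result"]:
        print(line, "->", plan_prosody(line, stereo=True))

In a complete engine, a plan like this would parameterise the Digital Signal Processing stage that shapes the synthesized phoneme stream; here it simply prints the chosen settings.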

Key-Words / Index Term

Artificial Intelligence, Natural Language Processing, Digital Signal Processing, Phoneme, Emotion

References

[1] A. Drahota, A. Costall, V. Reddy, “The Vocal Communication of Different Kinds of Smile”, Speech Communication, Vol.50, Issue.4, pp.278-287, 2007. doi: 10.1016/j.specom.2007.10.001
[2] W.Y. Wang, K. Georgila, “Automatic Detection of Unnatural Word-Level Segments in Unit-Selection Speech Synthesis”, In the Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, Waikoloa, HI, USA, pp.289-294, 2011.
[3] R.E. Remez, P.E. Rubin, D.B. Pisoni, T.D. Carrell, “Speech Perception without Traditional Speech Cues”, Science, New Series, Vol.212, Issue.4497, pp. 947-950, 1981. doi:10.1126/science.7233191
[4] J. Zhang, “Language Generation and Speech Synthesis in Dialogues for Language Learning”, Massachusetts Institute of Technology, pp.1-68, 2004.
[5] S. Lemmetty, “Review of Speech Synthesis Technology”, Helsinki University of Technology, pp.1-113, 1999.
[6] I.G. Mattingly, “Speech Synthesis for Phonetic and Phonological Models”, Current Trends in Linguistics, Mouton, The Hague, Vol.12, pp.2451-2487, 1974.
[7] FFmpeg Git, “FFmpeg 4.0 ‘Wu’”, last accessed 2018-07-18.
[8] Takanishi Lab Webpage, “Anthropomorphic Talking Robot Waseda Talker Series”, Retrieved from http://www.takanishi.mech.waseda.ac.jp/top/research/voice/index.htm, last accessed 2018-10-10.
[9] DeepMind Webpage, “WaveNet: A Generative Model for Raw Audio”, Retrieved from https://deepmind.com/blog/wavenet-generative-model-raw-audio/, last accessed 2018-09-08.