VAARTALAP: Embedding Whisper-AI-like Model into a Video-Conferencing System to Aid Real-Time Translation and Transcription

Kunal Kashyap¹, Prashant Singh², Anjali Verma³, Satya Mishra⁴, Prachi Goel⁵

¹ Undergraduate student, CSE Department, ADGIPS, New Delhi, India.
² Undergraduate student, CSE Department, ADGIPS, New Delhi, India.
³ Undergraduate student, CSE Department, ADGIPS, New Delhi, India.
⁴ Undergraduate student, CSE Department, ADGIPS, New Delhi, India.
⁵ CSE Department, ADGIPS, New Delhi, India.

Section: Research Paper, Product Type: Journal Paper
Volume-11, Issue-12, Page no. 21-25, Dec-2023

CrossRef-DOI: https://doi.org/10.26438/ijcse/v11i12.2125

Online published on Dec 31, 2023

Copyright © Kunal Kashyap, Prashant Singh, Anjali Verma, Satya Mishra, Prachi Goel. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: Kunal Kashyap, Prashant Singh, Anjali Verma, Satya Mishra, Prachi Goel, “VAARTALAP: Embedding Whisper-AI-like Model into a Video-Conferencing System to Aid Real-Time Translation and Transcription,” International Journal of Computer Sciences and Engineering, Vol.11, Issue.12, pp.21-25, 2023.

MLA Style Citation: Kunal Kashyap, Prashant Singh, Anjali Verma, Satya Mishra, Prachi Goel. "VAARTALAP: Embedding Whisper-AI-like Model into a Video-Conferencing System to Aid Real-Time Translation and Transcription." International Journal of Computer Sciences and Engineering 11.12 (2023): 21-25.

APA Style Citation: Kunal Kashyap, Prashant Singh, Anjali Verma, Satya Mishra, Prachi Goel (2023). VAARTALAP: Embedding Whisper-AI-like Model into a Video-Conferencing System to Aid Real-Time Translation and Transcription. International Journal of Computer Sciences and Engineering, 11(12), 21-25.

BibTex Style Citation:
@article{Kashyap_2023,
author = {Kunal Kashyap and Prashant Singh and Anjali Verma and Satya Mishra and Prachi Goel},
title = {VAARTALAP: Embedding Whisper-AI-like Model into a Video-Conferencing System to Aid Real-Time Translation and Transcription},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {12 2023},
volume = {11},
number = {12},
month = {12},
year = {2023},
issn = {2347-2693},
pages = {21-25},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=5643},
doi = {10.26438/ijcse/v11i12.2125},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v11i12.2125
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=5643
TI  - VAARTALAP: Embedding Whisper-AI-like Model into a Video-Conferencing System to Aid Real-Time Translation and Transcription
T2  - International Journal of Computer Sciences and Engineering
AU  - Kunal Kashyap
AU  - Prashant Singh
AU  - Anjali Verma
AU  - Satya Mishra
AU  - Prachi Goel
PY  - 2023
DA  - 2023/12/31
PB  - IJCSE, Indore, INDIA
SP  - 21
EP  - 25
IS  - 12
VL  - 11
SN  - 2347-2693
ER  - 

Abstract

In an era of digital communication, effective communication must cross regional boundaries and linguistic obstacles to keep the world connected. As corporations, institutions, and individuals engage on a worldwide scale, the demand for seamless multilingual communication has never been greater. This article explores an initiative that applies the real-time translation and transcription capabilities of Whisper-AI to video conferencing. The goal of the research is to integrate a Whisper-AI-like translation and transcription model into a real-time video-conferencing system. By utilizing current voice recognition and translation technologies, participants can converse in real time without language barriers.

Keywords / Index Terms

Sound transcription, Sound translation, AI, Deep learning, Real-time, Language barrier, NLP.
