Text-to-Speech Generation with Multilingual Assistance and AI Integration for Information Generation

Authors: Dr. JayaSudha K, Appu Gowda B S, Meghana B R, Geetha U, Ashritha G N

DOI: 10.87349/ahuri/181031

Page No: 1-12

Abstract

Multilingual speech technologies are critical to enabling seamless communication across languages, empowering users with diverse linguistic backgrounds, and supporting accessibility in digital systems. This paper presents a complete multilingual Text-to-Speech (TTS) architecture enhanced with Automatic Speech Recognition (ASR), Neural Machine Translation (NMT), and AI-driven information extraction. Unlike conventional TTS systems, which only convert text into speech, this integrated pipeline transcribes input speech, extracts meaningful information from it, translates the content into a target language, and finally generates expressive, natural-sounding audio. We describe each subsystem in detail, explain the engineering decisions behind the composite pipeline, and discuss the challenges encountered in multilingual deployment settings.
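
The stage ordering described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `SpeechPipeline`, its four stage fields, and the stand-in lambdas are all hypothetical, chosen only to show how ASR, information extraction, NMT, and TTS might be chained. In practice each callable would wrap a real model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpeechPipeline:
    """Hypothetical composite pipeline: ASR -> extraction -> NMT -> TTS."""

    recognize: Callable[[bytes], str]        # ASR: audio -> transcript
    extract: Callable[[str], str]            # extraction: transcript -> key content
    translate: Callable[[str, str], str]     # NMT: (text, target language) -> text
    synthesize: Callable[[str, str], bytes]  # TTS: (text, language) -> audio

    def run(self, audio: bytes, target_lang: str) -> bytes:
        # Each stage consumes the previous stage's output, in the order
        # the abstract lists them.
        transcript = self.recognize(audio)
        content = self.extract(transcript)
        translated = self.translate(content, target_lang)
        return self.synthesize(translated, target_lang)


# Stand-in stages (assumptions for illustration only, not real models):
# bytes are treated as UTF-8 text so the example runs without dependencies.
pipeline = SpeechPipeline(
    recognize=lambda audio: audio.decode("utf-8"),
    extract=lambda text: text.strip(),
    translate=lambda text, lang: f"[{lang}] {text}",
    synthesize=lambda text, lang: text.encode("utf-8"),
)

result = pipeline.run(b"  hello world  ", "kn")
```

Keeping each stage behind a plain callable means any single subsystem can be swapped or tested in isolation without changing the rest of the chain.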
