AI Translation and Dubbing with Emotional Accuracy

Denys Krasnikov, Alexander Konovalov

Integrating artificial intelligence (AI) into media production has led to significant advances in translation and dubbing, areas previously limited by technological constraints. Early Text-to-Speech (TTS) technologies, however, were developed primarily for voice interfaces rather than for cinematic dubbing, where emotional integrity is paramount, and they fall short of the emotional demands of film production. This paper examines AI's potential to improve emotional accuracy in media translation and dubbing, focusing on the critical challenge of interlingual emotion transfer. Traditional methods do not preserve the emotional integrity of content, which reduces audience engagement and cultural authenticity. Our research aims to develop and apply AI models that accurately detect emotional cues and reproduce them across languages, improving both engagement and authenticity. Although promising results have been achieved, challenges remain in handling cultural context and real-time processing, which this paper addresses in detail.

Published
2024-10-21
Content type
Original Research
Keywords
ai translation, dubbing, emotional accuracy, natural language processing (nlp), deep learning, speaker diarization, ai ethics, cultural adaptation, responsible ai, interlingual transfer, video production, speech recognition, machine learning, vidby ag, real-time processing, ai-driven media, interlingual emotion transfer, voice cloning, text-to-speech, lip sync, accessibility in media, large language models (llm)
DOI
10.5594/MOO/3043
ISBN
978-1-61482-965-2