Exploring Realtime Conversational Virtual Characters

Ha Nguyen, Aansh Malik, Michael Zink

Advancements in Artificial Intelligence (AI) such as Speech-To-Text, Language Understanding Models, Language Generation Models, and Text-To-Speech enable many kinds of applications, one of which is real-time conversational Virtual Characters. An end-to-end framework built from the right AI components enables relatable, multi-dimensional Virtual Characters who can converse naturally in creatively controlled domains while consistently maintaining their state and personality in pre-determined narratives. In this work, we designed such a conversational framework with interchangeable, loosely coupled components to support granular creative detail in character performance, efficient mass creation of Virtual Characters, and the flexibility to adopt future improvements to each component as its field advances. We then evaluated the robustness and modularity of the framework by creating Melodie, a Virtual Character who is fond of music and is a fan and promoter of the Eurovision Song Contest. With Melodie, we went through the full cycle: processing a speaker's audio signals, generating a proper response using a Natural Language Generation model, synthesizing the response in the character's Voice Font, and finally synchronizing the synthesized response with corresponding body and facial movements to produce a coherent and believable character performance. Testing and analyzing the implementation of Melodie revealed areas for improvement and ethical considerations that are, and will continue to be, essential to the design of our future applications involving Virtual Characters.
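
The cycle described above (speech recognition, response generation, voice synthesis, animation synchronization) can be pictured as a chain of swappable stages. The following is only a minimal sketch under stated assumptions, not the framework built for Melodie: the class `CharacterPipeline`, its field names, and the four stub functions are all hypothetical, and each stub merely stands in for a real model or service.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CharacterPipeline:
    """Hypothetical pipeline; each stage is a plain callable so it can be swapped."""
    speech_to_text: Callable[[bytes], str]
    generate_reply: Callable[[str, List[str]], str]
    text_to_speech: Callable[[str], bytes]
    animate: Callable[[str, bytes], dict]

    def respond(self, audio: bytes, history: List[str]) -> dict:
        text = self.speech_to_text(audio)           # speaker audio -> transcript
        reply = self.generate_reply(text, history)  # transcript -> character reply
        voice = self.text_to_speech(reply)          # reply -> synthesized audio
        history.extend([text, reply])               # keep conversational state
        return self.animate(reply, voice)           # pair audio with movement cues


# Stub stages (assumptions for illustration only; no real models are called).
def stub_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def stub_nlg(text: str, history: List[str]) -> str:
    return f"I love music! You said: {text}"


def stub_tts(reply: str) -> bytes:
    return reply.encode("utf-8")


def stub_animate(reply: str, voice: bytes) -> dict:
    # One placeholder movement cue per word of the reply.
    return {"text": reply, "audio": voice, "cues": len(reply.split())}


pipeline = CharacterPipeline(stub_stt, stub_nlg, stub_tts, stub_animate)
history: List[str] = []
performance = pipeline.respond(b"hello", history)
print(performance["text"])
```

Because each stage is passed in as a callable, replacing one component (for example, a different Text-To-Speech engine) does not require changes to the others, which is one simple way to read "interchangeable, loosely coupled."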

Published: 2021-11
Content type: Original Research
Keywords: Virtual Characters, Virtual Beings, Conversational AI, Conversational Characters, Artificial Intelligence
DOI: 10.5594/M001944