Cloud-Based AI for Automatic Audio Production for Personalized Immersive XR Experiences

Rob G. Oldfield, Max S. S. Walley, Ben G. Shirley, Doug L. Williams

In this article, we focus on the machine-learning approach to automatic audio source recognition and mixing developed for 5G Edge-XR, a collaborative project funded by the U.K. Government Department of Culture Media and Sport (DCMS). Leveraging graphics processing unit (GPU) acceleration, we deployed the algorithms in the cloud so that content can be mixed automatically on the fly, giving audiences a personalized, immersive, and interactive experience. We describe the algorithms involved, the system architecture, how the system has been implemented for immersive live boxing, and how we are using it to enhance a live in-stadium experience.

Print ISSN: 1545-0279
Electronic ISSN: 2160-2492
Published: 2022-08
Content type: Original Research
Keywords: 5G, artificial intelligence (AI), audio, augmented reality (AR), automatic production, broadcast, extended reality (XR), immersive, machine learning, object-based audio, personalized experience
DOI: 10.5594/JMI.2022.3184849