Cloud-Based AI for Automatic Audio Production for Personalized Immersive XR Experiences

Rob G. Oldfield, Max S. S. Walley, Ben G. Shirley, Doug L. Williams

In this article, we focus on the machine-learning approach developed for automatic audio source recognition and mixing in 5G Edge-XR, a collaborative project funded by the U.K. Government Department of Culture, Media and Sport (DCMS). Leveraging graphics processing unit (GPU) acceleration, we deployed the algorithms in the cloud so that content can be mixed automatically on the fly, giving audiences a personalized, immersive, and interactive experience. We describe the algorithms involved, the system architecture, the implementation for immersive live boxing, and how we are using the system to enhance a live in-stadium experience.

Print ISSN:
Electronic ISSN: 2160-2492
Published: 2022-08
Content type: Original Research
Keywords: 5G, artificial intelligence (AI), audio, augmented reality (AR), automatic production, broadcast, extended reality (XR), immersive, machine learning, object-based audio, personalized experience
DOI: 10.5594/JMI.2022.3184849