Scene-based Audio Implemented with Higher Order Ambisonics

Nils Peters, Deep Sen, Moo-Young Kim, Oliver Wuebbolt, S. Merrill Weiss

Scene-based audio uses a sound-field technology called “higher-order ambisonics” (HOA) to create holistic descriptions of both live-captured and artistically created sound scenes that are independent of specific loudspeaker layouts. For efficient representation, the audio can be carried as a set of PCM channels that contain predominant sounds and ambience in separate tracks. Standard audio bandwidth-compression techniques can then be applied to the PCM channels. This approach contrasts with conventional channel-based sound representations, in which one signal is used for each loudspeaker of a target reproduction system, so that upmixing or downmixing is required whenever the loudspeaker configuration used for actual reproduction differs from the intended one. This paper examines how scene-based audio can deliver satisfactory reproduction of immersive sound at bitrates equivalent to those of only six channels, whereas an alternative sound-field method that exclusively employs audio objects typically involves much higher bitrates.

Print ISSN: 1545-0279
Electronic ISSN: 2160-2492
Published: 2016-11
Content type: Original Research
Keywords: Scene-based audio, higher-order ambisonics, HOA, spatial audio, next-generation audio, MPEG-H, ATSC 3.0
DOI: 10.5594/JMI.2016.2623398