Cinematic Sound Scene Description and Rendering Control

Charles Q. Robinson, Nicolas Tsingos

Surround sound has been making cinematic storytelling more compelling and immersive for more than 30 years. The first widely deployed surround systems used magnetic recording. Later, optical recording became standard, enabling up to 7.1 channels of audio. Most recently, a new paradigm for the distribution and playback of cinema sound has been deployed. It carries audio streams together with parameters (metadata) that are used to render the audio for the available loudspeaker configuration. In the process of developing this system, it has become clear that the art of cinema sound requires more than a physical description of sound sources and the acoustic environment. The audio distribution format must allow the sound designer, mixer, and director to express their intent in ways that go beyond audio scene description and that include explicit instructions on how to render the content in exhibition. In this paper, we describe the metadata characteristics that preserve the essential elements of a surround mix and ensure its consistent and reliable translation to cinemas. We also present data collected from recent cinema soundtracks to show how this metadata is used in practice.

Print ISSN: 1545-0279
Electronic ISSN: 2160-2492
Published: 2015-11
Content type: Original Research
Keywords: cinema, surround, sound, spatial audio, metadata
DOI: 10.5594/j18640