Towards Using Audio for Matching Transcoded Content

Dinkar Bhat

With the advent of multiple screens for viewing media, transcoding is becoming a key component of content delivery ecosystems. But transcoding implies that copies and versions of the same content can proliferate across various storage devices. It also means that keeping track of content becomes a major problem from both copyright and recording/indexing perspectives. Video-based techniques for content indexing, which aim to extract robust signatures from video, have emerged as a major area of research. Audio-based techniques have received less attention, even though audio could provide robust signatures for indexing media as it undergoes transformations. In this paper, we present an investigation of audio signatures under typical transcoding operations. Specifically, we look at Mel-Frequency Cepstral Coefficients (MFCC), which have been widely used in audio recognition systems, as a signature. Initial results indicate that the MFCC signature is quite robust under these operations.

Published: 2012-10
Content type: Original Research
Keywords: Audio, Content indexing, Copy detection, Transcoding
DOI: 10.5594/M001469
ISBN: 978-1-61482-952-2