Toward Using Audio for Matching Transcoded Content

Dinkar Bhat

With the advent of multiple screens for viewing media, transcoding is becoming a key component of content delivery ecosystems. But transcoding also means that copies and versions of the same content can proliferate across storage devices, which makes keeping track of content a major problem from both copyright and recording/indexing perspectives. Video-based techniques for content indexing, which aim to extract robust signatures from video, have emerged as a major area of research. Audio-based techniques have received less attention, yet audio could provide robust signatures for indexing media as it undergoes transformations. This paper presents an investigation of audio signatures under typical transcoding operations. Specifically, mel-frequency cepstral coefficients (MFCCs), which have been widely used in audio recognition systems, are examined as a signature. Initial results indicate that MFCCs are quite robust under these operations.
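As a rough illustration of the idea, the following Python sketch computes MFCCs with NumPy and compares the signatures of a clip, a degraded copy, and an unrelated clip. It is a minimal sketch under stated assumptions, not the method evaluated in the paper: the frame sizes, filter counts, synthetic test tones, the 8-bit requantization used as a crude stand-in for a transcode, and the helper names (`mel_filterbank`, `mfcc`, `signature_distance`) are all hypothetical choices made for this example.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def dct_ii(x, n_out):
    """Unnormalized DCT-II along the last axis, keeping the first n_out terms."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return x @ basis.T

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return an (n_frames, n_coeffs) array of MFCCs (assumed parameters)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return dct_ii(np.log(energies + 1e-10), n_coeffs)

def signature_distance(a, b):
    """Mean per-frame Euclidean distance, ignoring coefficient 0 (overall level)."""
    n = min(len(a), len(b))
    return float(np.mean(np.linalg.norm(a[:n, 1:] - b[:n, 1:], axis=1)))

# Synthetic one-second clips (assumption: real audio would be loaded from files).
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr
original = (0.3 * np.sin(2 * np.pi * 440 * t) + 0.3 * np.sin(2 * np.pi * 1200 * t)
            + 0.01 * rng.standard_normal(sr))
# Crude stand-in for a lossy transcode: requantize to roughly 8 bits.
transcoded = np.round(original * 127) / 127
unrelated = (0.3 * np.sin(2 * np.pi * 700 * t) + 0.3 * np.sin(2 * np.pi * 2500 * t)
             + 0.01 * rng.standard_normal(sr))

sig_original = mfcc(original, sr)
d_transcoded = signature_distance(sig_original, mfcc(transcoded, sr))
d_unrelated = signature_distance(sig_original, mfcc(unrelated, sr))
print(f"original vs. degraded copy: {d_transcoded:.3f}")
print(f"original vs. unrelated clip: {d_unrelated:.3f}")
```

In a matching setting, a small distance between the original and the degraded copy relative to the distance to unrelated content is what would make the signature useful for indexing. A real evaluation would use actual codecs and bit rates rather than the requantization shown here.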

Print ISSN: 1545-0279
Electronic ISSN: 2160-2492
Published: 2014-01
Content type: Original Research
DOI: 10.5594/j18367XY