File-Based Closed Captioning System without Captioning Delay

Yunhyoung Kim, Sunghee Han, Sungwoo Choi, Byunghee Jung

Viewers have long complained about the time delay between closed captions and dialogue on TV, a delay caused by live stenography. The problem could be solved if captions were generated in advance, but it is practically impossible to enforce pre-production for all broadcast programs. To address this, we propose a new file-based closed captioning system based on speech recognition and audio fingerprinting. The system targets rerun episodes, i.e., episodes whose caption files were produced by live stenography during the first run and therefore contain delayed captions. To align the delayed captions with the dialogue, the system adjusts the captions' timelines using automatic speech recognition. Because rerun episodes may be edited before rebroadcast, the system also detects differences between the rerun and the first run by comparing their audio fingerprints and generates new closed caption files accordingly. The proposed system has been implemented and deployed at KBS (Korean Broadcasting System).
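The abstract does not specify how fingerprint comparison is performed. Purely as an illustration (all names and the per-second hash representation are hypothetical assumptions, not the authors' method), a longest-matching-subsequence diff over per-second fingerprint hashes could map first-run caption timestamps into an edited rerun timeline:

```python
import difflib

def diff_fingerprints(first_run, rerun):
    """Find matching audio segments between first-run and rerun.

    first_run, rerun: sequences of per-frame fingerprint hashes
    (e.g., one hash per second of audio). Returns a list of
    (first_start, rerun_start, length) matching blocks; first-run
    regions outside every block were edited out of the rerun.
    """
    sm = difflib.SequenceMatcher(a=first_run, b=rerun, autojunk=False)
    return [(m.a, m.b, m.size) for m in sm.get_matching_blocks() if m.size]

def remap_caption_time(t, blocks):
    """Shift a first-run caption timestamp t into the rerun timeline.

    Returns None if the segment containing t was cut from the rerun.
    """
    for first_start, rerun_start, size in blocks:
        if first_start <= t < first_start + size:
            return t - first_start + rerun_start
    return None

# Toy example: the rerun drops the first three "seconds" of the first run.
first = [10, 11, 12, 20, 21, 22, 23]
rerun = [20, 21, 22, 23]
blocks = diff_fingerprints(first, rerun)
print(remap_caption_time(4, blocks))  # caption at t=4 maps to t=1 in the rerun
print(remap_caption_time(1, blocks))  # segment was cut -> None
```

In a real system the hashes would come from an audio-fingerprinting front end and the remapped times would be written back into the new caption file; this sketch only shows the timeline-remapping step.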

Published
2015-10
Content type
Original Research
Keywords
Closed Caption, Speech Recognition, Audio-Fingerprinting, File-based Closed Captioning System
DOI
10.5594/M001675
ISBN
978-1-61482-956-0