A Hierarchical Codec for AI-Native Live Video Workflows

Guendalina Cobianchi, Brent Yates, Andy Beach

Artificial intelligence is becoming central to live media workflows, powering real-time object tracking, automated replay generation, camera switching, and personalized viewing experiences. These applications demand extremely low latency and efficient use of compute and network resources. Existing mezzanine formats such as JPEG 2000 and ProRes were designed for human viewing, not machine inference. They require full-frame decoding and multiple transcoding stages, which add delay, duplicate processing, and inflate infrastructure costs as AI workloads scale. This paper introduces VC6, a hierarchical, region-aware codec standardized as SMPTE ST 2117-1, and demonstrates its integration with swXtch.io's AI-native multicast networking to enable selective, AI-ready live video transport. Unlike traditional codecs, VC6 supports multi-resolution Levels of Quality (LOQ) and region-of-interest (ROI) decoding, allowing AI models to access only the portions of a frame needed for inference. When combined with swXtch.io's multicast overlay, a single VC6 stream can serve multiple models in parallel without redundant encodes or proxy pipelines. Benchmark testing on NVIDIA RTX A6000 and Intel i9-13900K systems shows up to 73% faster decoding compared to CUDA JPEG, 20× faster preprocessing when scaling is integrated into decode, and approximately 10× lower encode/decode complexity than JPEG 2000. Live multicast evaluations confirm predictable low latency even when multiple AI workloads run concurrently from a shared source. By moving preprocessing intelligence into the transport layer, the combined VC6 and swXtch.io architecture creates a practical foundation for scalable, energy-efficient, and standards-compliant AI-native live video workflows. This work represents the first implementation of a SMPTE-standardized hierarchical codec directly aligned with AI inference pipelines, offering a pathway for future interoperability within ST 2110 and related IP media ecosystems.

Published: 2025-10-13
Content type: Original Research
Keywords: vc6, smpte st 2117-1, ai-native networking, multicast, region-of-interest decoding, live video workflows, preprocessing acceleration
ISBN: 978-1-61482-966-9