Video Compression Using Convolutional Neural Networks with Chroma Subsampling

Vahid Khorasani Ghassab, Rob Gonsalves, Shailendra Mathur, Nizar Bouguila

In the context of convolutional neural network (CNN)-based video compression, motivated by the human visual system's lower acuity for color differences than for luma, we investigate a video compression framework using autoencoder networks that encode and decode videos with less chroma information than luma information. Instead of converting Y'CbCr 4:2:2/4:2:0 videos to and from RGB 4:4:4, as in the current state of the art, we keep the video in Y'CbCr 4:2:2/4:2:0 and merge the luma and chroma channels after downsampling the luma to match the chroma resolution. The decoder applies the inverse operation. The performance of our models against the 4:4:4 baseline is evaluated using the CPSNR, MS-SSIM, and VMAF metrics. Our experiments reveal that, compared with video compression involving conversion to and from RGB 4:4:4, the proposed method increases video quality by about 5% for Y'CbCr 4:2:2 and 6% for Y'CbCr 4:2:0, while reducing the amount of computation by nearly 37% for Y'CbCr 4:2:2 and 40% for Y'CbCr 4:2:0. These results suggest optimizing the current state-of-the-art autoencoder for 4:2:2 and 4:2:0 video.
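To illustrate the channel-merging step described above, here is a minimal NumPy sketch for the 4:2:0 case. The abstract does not specify the resampling filters, so this sketch assumes 2×2 box averaging for the luma downsample and nearest-neighbor repetition for the decoder-side inverse; the function names `pack_420` and `unpack_420` are illustrative, not from the paper.

```python
import numpy as np

def pack_420(y, cb, cr):
    """Downsample the luma plane by 2x2 averaging to match the 4:2:0
    chroma resolution, then stack all three planes into a single
    (H/2, W/2, 3) array suitable as autoencoder input.
    Assumes H and W are even."""
    h, w = y.shape
    y_ds = y.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return np.stack([y_ds, cb, cr], axis=-1)

def unpack_420(x):
    """Decoder-side inverse: split the planes and upsample the luma
    back to full resolution by nearest-neighbor repetition
    (an assumed stand-in for whatever filter the paper uses)."""
    y_ds, cb, cr = x[..., 0], x[..., 1], x[..., 2]
    y = np.repeat(np.repeat(y_ds, 2, axis=0), 2, axis=1)
    return y, cb, cr

# Example: a 4x4 luma plane with 2x2 chroma planes (4:2:0 layout).
y = np.full((4, 4), 0.5)
cb = np.zeros((2, 2))
cr = np.zeros((2, 2))
packed = pack_420(y, cb, cr)        # shape (2, 2, 3)
y_rec, cb_rec, cr_rec = unpack_420(packed)  # y_rec has shape (4, 4)
```

For 4:2:2 video the same idea applies, but the chroma planes are subsampled only horizontally, so the luma would be downsampled to (H, W/2) before merging.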

Published
2022-10
Content type
Original Research
Keywords
Video Compression, Neural Networks, Machine Learning, CNN, Chroma Subsampling
DOI
10.5594/M001957
ISBN
978-1-61482-963-8