An Efficient Quality Metric for Video Frame Interpolation

Conall Daly, Darren Ramsook, Anil Kokaram

Video Frame Interpolation (VFI) enhances temporal video quality for applications like slow-motion effects and broadcast frame-rate conversion. While modern VFI methods using optical flow and deep networks can handle complex motion and occlusions, evaluating the quality of interpolated content remains challenging. Traditional metrics like PSNR and SSIM ignore temporal information, while perceptual metrics like LPIPS focus only on spatial aspects, missing the motion coherence that significantly affects perceived quality. Recent VFI-specific metrics like FloLPIPS incorporate optical flow errors to detect temporal inconsistencies but are computationally expensive (5.5× slower than LPIPS), limiting practical use in training or real-time assessment. Building on our previous work in motion picture restoration, we developed PSNRDIV, which uses vector field divergence to detect spatial irregularities in optical flow that indicate temporal inconsistencies. By weighting PSNR with this feature, we can identify the problematic motion patterns that most degrade perceptual quality in interpolated frames. PSNRDIV requires only one sequence's motion field, significantly reducing computational load. Evaluation on the BVI-VFI dataset (180 diverse sequences) shows statistically significant improvements over FloLPIPS: +0.09 Pearson Linear Correlation Coefficient, +0.05 Spearman Rank-Order Correlation Coefficient, and -1.38 Root Mean Squared Error. PSNRDIV achieves these gains while being 2.5× faster and using 6× less memory, with consistent performance across content categories and robustness to different optical flow estimators.
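As a rough illustration of the idea of weighting PSNR by optical flow divergence, the following Python sketch computes the divergence of a dense flow field and uses its magnitude to weight the per-pixel squared error. The abstract does not state the exact weighting used by PSNRDIV, so the weight map `1 + alpha * |div|`, the parameter `alpha`, and the function names are assumptions made for illustration only, not the authors' formulation. The flow field is assumed to be precomputed by any optical flow estimator from a single sequence.

```python
import numpy as np


def flow_divergence(flow):
    """Divergence of a dense flow field of shape (H, W, 2).

    flow[..., 0] is the horizontal component u, flow[..., 1] the vertical
    component v. Divergence is du/dx + dv/dy, via finite differences.
    """
    du_dx = np.gradient(flow[..., 0], axis=1)
    dv_dy = np.gradient(flow[..., 1], axis=0)
    return du_dx + dv_dy


def divergence_weighted_psnr(ref, test, flow, alpha=1.0, peak=255.0):
    """PSNR with squared errors weighted by flow divergence magnitude.

    Hypothetical weighting (an assumption, not the published definition):
    w = 1 + alpha * |div(flow)|, normalised by the sum of the weights.
    ref and test are grayscale frames of shape (H, W).
    """
    ref = np.asarray(ref, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    div = np.abs(flow_divergence(np.asarray(flow, dtype=np.float64)))

    weights = 1.0 + alpha * div
    weighted_mse = np.sum(weights * (ref - test) ** 2) / np.sum(weights)
    if weighted_mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / weighted_mse)
```

With a divergence-free flow field the weights are uniform and the function reduces to ordinary PSNR. Errors that coincide with high-divergence regions lower the score more than the same errors elsewhere, which is the behaviour the abstract describes qualitatively.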

Published
2025-10-13
Content type
Original Research
Keywords
video frame interpolation quality, temporal consistency metrics
ISBN
978-1-61482-966-9