Integrating Machine-Learning-Based Operators in Visual Effects Toolsets

Nicolas Moenne-Loccoz

Post-production workflows rely heavily on image processing and computer vision algorithms to implement their visual effects (VFX) tools. In these fields, machine learning has proven disruptive. By capturing the domain statistics of a given training data set, machine-learning-based operators may solve existing problems more efficiently and may make new problems tractable, expanding the VFX toolset. In this study, we share our experience developing and integrating several machine-learning-based operators into software for the post-production industry. We present a sky segmentation operator, a depth map estimation operator, and an operator that computes face geometric maps (UVs, depth, and normals) from sequences of images. More specifically, each operator consists of a trained deep convolutional neural network (DCNN) that takes an RGB color image as input and outputs the associated maps, that is, a matte, depth, normals, and/or UV maps. Such maps permit the application of many different effects to the input image during color grading, including beautification or even relighting of faces. These operators serve as a case study to review and discuss the multiple challenges posed by the implementation, integration, and deployment of machine-learning-based operators in a VFX toolset.
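As an illustration of how an output map can drive an effect during color grading, the following minimal sketch restricts a per-channel gain to the region selected by a matte. It is not the implementation described in the paper: the function name `grade_with_matte`, the `gain` parameter, and the linear blend are assumptions chosen for illustration, and the matte here is a hand-written array standing in for the output of a segmentation network.

```python
import numpy as np

def grade_with_matte(image, matte, gain):
    """Blend a graded copy of `image` over the original, weighted by `matte`.

    image: float array of shape (H, W, 3), RGB values.
    matte: float array of shape (H, W), 1.0 where the grade applies fully.
    gain:  per-channel multiplier (hypothetical grade, for illustration only).
    """
    image = np.asarray(image, dtype=np.float32)
    # Clamp the matte to [0, 1] and add a channel axis so it broadcasts over RGB.
    matte = np.clip(np.asarray(matte, dtype=np.float32), 0.0, 1.0)[..., None]
    graded = image * np.asarray(gain, dtype=np.float32)
    # Linear blend: graded pixels where matte is 1, original pixels where it is 0.
    return matte * graded + (1.0 - matte) * image

# Toy 2x2 mid-gray image; the matte marks one full pixel and one half-covered pixel.
image = np.full((2, 2, 3), 0.5, dtype=np.float32)
matte = np.array([[1.0, 0.0], [0.5, 0.0]], dtype=np.float32)
result = grade_with_matte(image, matte, gain=(1.0, 1.0, 2.0))
```

Pixels where the matte is zero are left unchanged, while partially covered pixels receive a proportional share of the grade, which is why a soft-edged matte gives a smooth transition at region boundaries.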

Print ISSN: 1545-0279
Electronic ISSN: 2160-2492
Published: 2021-06
Content type: Original Research
Keywords: Face relighting, machine learning, monocular depth estimation, sky segmentation, visual FX
DOI: 10.5594/JMI.2021.3072684