Numerical Results
When applied to LTR videos with severe motion blur and motion aliasing, frame interpolation
methods (e.g., NVIDIA SloMo [9] and DAIN [2]) score significantly lower than dedicated TSR methods.
However, even methods that are designed to overcome such challenges but are trained on external
datasets (e.g., Flawless [10]) struggle on videos that do not represent the typical motions and
dynamic behaviors they were trained on. Videos 1-13 are examples of such challenging videos.
Comparing Temporal Upsampling x8 Results on WAIC TSR Dataset
Detailed Comparison of Temporal Upsampling x8 Results on WAIC TSR Dataset
We compared the average per-frame PSNR, SSIM and LPIPS [25] values of each
method, as listed in the table above. To avoid boundary effects, we excluded the first and last
30 frames of each sequence. We also disregarded a 20-pixel boundary around each frame when
computing per-frame PSNR. This wide masking of the boundaries was done to accommodate the large
margins that some of the other algorithms require.
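
The PSNR part of the evaluation protocol described above can be sketched as follows. This is a minimal illustration, not the evaluation code used for the reported numbers: the function names `cropped_psnr` and `average_psnr` are hypothetical, frames are assumed to be NumPy arrays of identical shape, and the peak value of 255 (8-bit images) is an assumption.

```python
import numpy as np


def cropped_psnr(pred, ref, border=20, max_val=255.0):
    """PSNR of one frame, ignoring a `border`-pixel margin on every side."""
    h, w = ref.shape[:2]
    # Explicit end indices so that border=0 keeps the whole frame.
    p = pred[border:h - border, border:w - border].astype(np.float64)
    r = ref[border:h - border, border:w - border].astype(np.float64)
    mse = np.mean((p - r) ** 2)
    if mse == 0:
        return float("inf")  # identical crops
    return float(10.0 * np.log10(max_val ** 2 / mse))


def average_psnr(pred_frames, ref_frames, skip=30, border=20, max_val=255.0):
    """Mean per-frame PSNR, dropping the first and last `skip` frames."""
    n = len(ref_frames)
    if len(pred_frames) != n:
        raise ValueError("sequences must have the same number of frames")
    if n <= 2 * skip:
        raise ValueError("sequence too short for the requested skip")
    scores = [
        cropped_psnr(p, r, border, max_val)
        for p, r in zip(pred_frames[skip:n - skip], ref_frames[skip:n - skip])
    ]
    return float(np.mean(scores))
```

With `skip=30` and `border=20`, errors in the first and last 30 frames and in the outer 20 pixels of each frame do not affect the score, which is the intent of the masking described above.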
The results above indicate that sophisticated frame-interpolation
methods (DAIN [2], NVIDIA SloMo [9]) are not adequate for the task of Temporal Super
Resolution (TSR), and are significantly inferior (-1 dB) on LTR videos compared to
dedicated TSR methods (Ours and Flawless [10]). Flawless and Ours provide comparable
quantitative results on the dataset, even though Flawless is a pre-trained supervised
method, whereas Ours is unsupervised and requires no prior training examples. Moreover,
on the subset of extremely challenging videos (with highly complex non-rigid
motions), our Zero-Shot TSR outperforms the state-of-the-art externally trained Flawless [10].