The Weizmann Institute of Science
Faculty of Mathematics and Computer Science
Computer Vision Lab

Ablation Study

We designed an ablation study to examine the power of cross-dimension augmentations for all videos in the dataset.

It compares the performance of our network under the following training configurations:
"Within" - Training only on same-dimension examples
"Across" - Training only on cross-dimension examples
"Best" - Training each video on its best configuration: "within", "across", or both

Since our atomic TSRx2 network is trained only on a coarse spatial scale of the video, we performed the ablation study at that scale (hence the differences between the numeric values here and in the numerical results section).

The following tables show the ablation study results on the WAIC TSR dataset.
The top table summarizes the mean results. Below it is a table with detailed results for each video in the dataset.

Ablation Study Mean Results on WAIC TSR Dataset

Mean Results Ablation Study Results Table

Ablation Study Detailed Results on WAIC TSR Dataset

Ablation Study Results Table

The table above indicates that, on average, the cross-dimension augmentations are more informative than the within (same-dimension) augmentations. However, since different videos favor different configurations, training each video with its best configuration (within, across, or both) provides a small additional overall improvement. For more details, see the paper.
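
The "Best" configuration can be read as a per-video maximum over the available training configurations, averaged over the dataset. The following is a minimal Python sketch of that bookkeeping, not the authors' evaluation code. The video names, the `both` key, and all score values are hypothetical placeholders (higher is assumed to be better, as with PSNR); they are not results from the study.

```python
from statistics import mean

# Hypothetical per-video scores (higher is better, e.g. PSNR in dB).
# Placeholder values for illustration only; NOT results from the study.
scores = {
    "video_a": {"within": 30.0, "across": 31.0, "both": 30.5},
    "video_b": {"within": 28.0, "across": 27.5, "both": 28.5},
}


def best_configuration(per_config):
    """Return (name, score) of the highest-scoring configuration for one video."""
    name = max(per_config, key=per_config.get)
    return name, per_config[name]


def summarize(all_scores):
    """Pick the best configuration per video and compute dataset-level means."""
    best = {video: best_configuration(cfgs) for video, cfgs in all_scores.items()}
    means = {
        "within": mean(cfgs["within"] for cfgs in all_scores.values()),
        "across": mean(cfgs["across"] for cfgs in all_scores.values()),
        "best": mean(score for _, score in best.values()),
    }
    return best, means


best, means = summarize(scores)
for video, (config, score) in best.items():
    print(f"{video}: best configuration = {config} ({score})")
print(means)
```

Because the per-video best is a maximum over a set that includes both "within" and "across", the "best" mean in this sketch can never fall below either single-configuration mean.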