Learning by Aligning Videos in Time

A Self-Supervised Approach for Training Viewpoint-, Actor-, and Scene-Invariant Video Representations.

DOWNLOAD THE OPEN ACCESS VERSION FROM THE COMPUTER VISION FOUNDATION (CVF)

DENSE PER-FRAME LABELS FOR 2123 VIDEOS ANNOTATED BY US.