TAP-Vid: DeepMind's New Dataset & Benchmark For Point Tracking
DeepMind has introduced a new dataset and benchmark for tracking points across video.
Created on November 8 | Last edited on November 8
Today, DeepMind introduced TAP-Vid, a new dataset featuring videos annotated with tracked points. The dataset and its related materials were created as part of a paper accepted to NeurIPS 2022. The datasets are freely downloadable, and a GitHub repository guides those looking to use them.
TAP: tracking any point
The researchers behind TAP-Vid identified a lack of datasets like it (video annotated with tracked points) and wanted to fill that void. A point-based approach to spatial tracking in video captures the structure of a scene far better than the standard bounding-box approach: points move relative to one another and can be occluded by objects, revealing a clearly three-dimensional space (point tracking is a common technique behind realistic special effects and 3D CGI).
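To make the idea concrete, here is a minimal sketch of how a single point track with occlusion flags can be represented. The array names and shapes are assumptions for exposition, not the dataset's exact schema.

```python
import numpy as np

# Illustrative sketch only: one point track as per-frame 2D positions plus a
# per-frame visibility flag, roughly the abstraction TAP-Vid annotates.
num_frames = 50

# (x, y) position of the tracked point in each frame, in pixel coordinates.
positions = np.zeros((num_frames, 2), dtype=np.float32)

# True where the point is hidden behind another object in that frame.
occluded = np.zeros(num_frames, dtype=bool)

# A tracker is judged on both tasks: predicting positions when the point is
# visible and flagging the frames where it is occluded.
visible_frames = np.where(~occluded)[0]
print(f"Point is visible in {len(visible_frames)} of {num_frames} frames")
```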
TAP-Vid is available in a few different subsets: TAP-Vid-Kinetics, TAP-Vid-DAVIS, and TAP-Vid-RGB-Stacking (built on the Kinetics dataset, the DAVIS dataset, and the RGB-Stacking simulator, respectively). The full benchmark also uses Kubric, a synthetic dataset that can generate an arbitrary number of ground-truth point tracking annotations.
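If you download one of the subsets, a quick way to get oriented is to load it and inspect the shapes of the annotations. The file name and dictionary keys below are assumptions based on the dataset's described contents; check the GitHub repository for the exact format.

```python
import pickle
import numpy as np

# Minimal sketch, assuming the split is distributed as a pickled collection of
# examples with 'video', 'points', and 'occluded' entries (an assumption).
with open("tapvid_davis.pkl", "rb") as f:  # hypothetical local filename
    data = pickle.load(f)

# Grab one example, whether the file holds a dict of videos or a list.
example = next(iter(data.values())) if isinstance(data, dict) else data[0]

video = np.asarray(example["video"])        # frames, e.g. (num_frames, H, W, 3)
points = np.asarray(example["points"])      # tracks, e.g. (num_points, num_frames, 2)
occluded = np.asarray(example["occluded"])  # flags,  e.g. (num_points, num_frames)

print("video:", video.shape, "points:", points.shape, "occluded:", occluded.shape)
```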
Find out more
Tags: ML News