
Open Images V7 Released, Adds New Point-Based Labeling & Visualizers

Google has released the newest version of the Open Images dataset, bringing a new point-based annotation system as well as a few new data visualizers.
Google's Open Images dataset was first released in 2016 and has seen a number of upgrades over the years. Version 6 came out over two and a half years ago, and Version 7 has now been released with a new point-based labeling system.
The researchers found that the standard color-fill (segmentation-mask) approach to semantic labeling is labor-intensive, creating a bottleneck that limits how many classes and images a dataset can realistically cover. To remedy this, they implemented a new system that identifies and labels content at individual points within an image, now available in Open Images V7.


Labeling with points

The new point-based labeling system scatters a number of points across an image, each point consisting of a question (e.g., "Is this point on a chair?") and an answer ("yes" or "no").
To create these annotations, machine learning models were first used to propose points of interest within an image and generate a question for each. Human annotators then answered each question with "yes," "no," or "unsure," and the answers from all annotators were aggregated to produce the final label used in the dataset.
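To make the aggregation step concrete, here is a minimal sketch of how per-point votes from several annotators might be collapsed into a single label. The record structure, field names, and majority-vote rule are assumptions for illustration, not the exact format or rule Open Images uses.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class PointVotes:
    """One candidate point and the votes it received (illustrative schema)."""
    x: float          # normalized x coordinate of the point
    y: float          # normalized y coordinate of the point
    label: str        # class the question asks about, e.g. "Chair"
    votes: list[str]  # per-annotator answers: "yes", "no", or "unsure"

def aggregate(point: PointVotes) -> str:
    """Collapse individual annotator answers into one final answer.

    Assumed rule: take the most common answer, but require a strict
    majority of definitive votes; otherwise fall back to "unsure".
    """
    counts = Counter(point.votes)
    answer, n = counts.most_common(1)[0]
    if answer == "unsure" or n <= len(point.votes) / 2:
        return "unsure"
    return answer

# Example: three annotators answer "Is this point on a chair?"
point = PointVotes(x=0.42, y=0.77, label="Chair", votes=["yes", "yes", "unsure"])
print(aggregate(point))  # -> "yes"
```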
This process produced 38.6 million point annotations (12.4 million of which were labeled "yes"), covering 5.8 thousand classes across 1.4 million images. Because point annotation is so much more efficient, these labels cover many times more classes than the other annotation styles in the dataset.


New visualizations

Three new annotation visualizations are available for the Open Images dataset with the Version 7 release: a view for the new point-based labels, a view for localized narrative annotations, and an all-in-one view.
The point-based view shows circles and squares (corresponding to "yes" and "no" answers, respectively) along with the classes to which they are assigned.
Next, the localized narrative annotations are displayed as squiggly lines together with a text description beneath. Each line traces the annotator's mouse movement, which follows along with the spoken narration of the description, temporally linking the visual and text data.
Finally, the new all-in-one view displays all the different kinds of annotations present in an image in one concise (if a little messy) view.
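For a sense of what the point-based view described above involves, here is a minimal matplotlib sketch that draws circles for "yes" points and squares for "no" points, labeled by class. The point data and color choices are made up for illustration; a real viewer would draw these markers over the underlying photo.

```python
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

# Illustrative point annotations: (x, y, class, answer) in pixel coordinates.
points = [
    (120, 340, "Chair", "yes"),
    (305, 210, "Chair", "no"),
    (410, 150, "Table", "yes"),
    (450, 380, "Table", "no"),
]

fig, ax = plt.subplots(figsize=(6, 4))
# ax.imshow(image)  # in a real viewer, draw the photo first

colors = {"Chair": "tab:orange", "Table": "tab:blue"}
for x, y, cls, answer in points:
    marker = "o" if answer == "yes" else "s"  # circle = yes, square = no
    ax.scatter(x, y, marker=marker, s=80, color=colors[cls], edgecolors="black")
    ax.annotate(cls, (x, y), textcoords="offset points", xytext=(6, 6), fontsize=8)

# Legend explaining the marker convention.
handles = [
    Line2D([], [], marker="o", linestyle="", color="gray", label='"yes" point'),
    Line2D([], [], marker="s", linestyle="", color="gray", label='"no" point'),
]
ax.legend(handles=handles, loc="upper right")
ax.invert_yaxis()  # match image coordinates (origin at top-left)
plt.show()
```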


Find out more
