
Google I/O '22 Keynote Recap For Machine Learning

Machine learning really took the lead at Google I/O 2022, so here's a full recap of everything ML that was covered in the keynote.
Google I/O 2022 has officially kicked off. Running through today and tomorrow, the event will show us everything new Google has on offer, including countless announcements regarding machine learning in its services and devices.
Here I'll be recapping everything machine learning that was mentioned in the main keynote presentation. From Google Search improvements with ML to new Google Pixel devices featuring the powerful Google Tensor chip, there's a lot to chew on in today's keynote.
Here's the keynote presentation VOD if you missed it, or if you'd like to follow along:




Google Translate

While most translation models use a bilingual training approach, many languages lack the large parallel datasets required for such endeavours. Google has developed a monolingual training approach that allows its translation models to learn a language from text in that language alone, and to leverage that into translation capabilities deemed "of sufficient quality to be useful".

With this advancement in translation model training, Google was happy to announce 24 additional languages will be added to Google Translate.

Google Maps

Google Maps has collected data on over 1.6 billion buildings and over 60 million kilometers of roads to give Maps the detail we need to navigate our cities and environments. Google noted that rural areas are difficult to map due to a lack of data, but machine learning now lets them detect and label buildings directly from satellite imagery, with a five-fold increase in buildings mapped in Africa thanks to this tech. Over 20% of buildings on Google Maps globally have been detected using these techniques.
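Google hasn't detailed its exact pipeline, but the general recipe here, semantic segmentation of satellite tiles into building-vs-background masks, is a standard one. Below is a minimal, purely illustrative sketch using the open-source segmentation_models_pytorch library; the model, weights, and the random "tile" are placeholders, not Google's system.

```python
# Hypothetical sketch: binary building-footprint segmentation on satellite tiles.
# This is NOT Google's pipeline -- just the generic shape of the approach.
import torch
import segmentation_models_pytorch as smp

# A U-Net with an ImageNet-pretrained encoder; 1 output channel = "building" mask.
model = smp.Unet(
    encoder_name="resnet34",
    encoder_weights="imagenet",
    in_channels=3,   # RGB satellite tile
    classes=1,       # building vs. background
)
model.eval()

# Placeholder for a 256x256 RGB satellite tile (normally loaded from imagery).
tile = torch.rand(1, 3, 256, 256)

with torch.no_grad():
    logits = model(tile)                # shape (1, 1, 256, 256)
    mask = torch.sigmoid(logits) > 0.5  # boolean building mask

print(f"Predicted building pixels: {mask.sum().item()}")
```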

Something else Google has been working on is using machine learning to build explorable 3D renderings of cities within Google Maps. By combining multiple sources of imagery with machine learning, Google was happy to introduce Immersive View for Maps, which lets you zoom and pan around 3D scenes of the places you'd like to visit.
Taking Immersive View even further, Google revealed a feature that lets you view the inside of restaurants as a generated 3D scene. These scenes are, of course, all thanks to machine learning, which takes a handful of photos and extrapolates them into a fully realized 3D rendering.

These features will be coming to Google Maps for select cities this year.

YouTube

Last year Google brought automatic chapter generation to YouTube, taking that extra job off content creators' backs. Thanks to multi-modal technology from DeepMind, auto-chaptering now accounts for visuals, audio, and text simultaneously to generate chapters more quickly and accurately.
With these improvements, they promise a jump from the current 8 million auto-chaptered videos to 80 million over the next year.
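The chaptering model itself isn't public, but the underlying idea, fuse per-segment signals from visuals, audio, and text and cut a chapter wherever adjacent segments stop resembling each other, can be sketched with placeholder embeddings. Everything below (the embeddings, the fusion, the threshold) is illustrative only.

```python
# Toy illustration of multi-modal chapter detection: fuse per-segment embeddings
# from visuals, audio, and transcript text, then place chapter boundaries where
# consecutive segments diverge. Embeddings here are random placeholders.
import numpy as np

rng = np.random.default_rng(0)
n_segments = 20

# Placeholder per-segment embeddings (real systems would use video, audio,
# and text encoders here).
visual = rng.normal(size=(n_segments, 64))
audio = rng.normal(size=(n_segments, 64))
text = rng.normal(size=(n_segments, 64))

# Simple fusion: concatenate modalities into one vector per segment.
fused = np.concatenate([visual, audio, text], axis=1)
fused /= np.linalg.norm(fused, axis=1, keepdims=True)

# Cosine similarity between consecutive segments; low similarity = topic shift.
sims = (fused[:-1] * fused[1:]).sum(axis=1)
boundaries = [i + 1 for i, s in enumerate(sims) if s < 0.0]  # toy threshold

print("Chapter boundaries at segments:", boundaries)
```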

Another feature we're all familiar with is YouTube's auto-generated captions, which are powered by speech recognition models and are now coming to all Android and iOS devices.
Expanding this, auto-translation of generated captions is also becoming available to mobile users, allowing for translation into 16 different languages thanks to machine learning.

Google Workspace

A fantastic new feature powered by advances in natural language processing has been added to Google Docs: automatically generated document summaries. The models behind it comprehend, extract, and compress the relevant information from a lengthy doc into an easily understandable TL;DR at the top.
These machine-learning-powered summaries will be coming to Google Chat in the next month, and to Google Meet in the future.
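Google hasn't said which model powers Docs summaries, but you can get a feel for abstractive summarization with an off-the-shelf open-source model. The snippet below uses Hugging Face's default summarization pipeline purely as a stand-in.

```python
# Stand-in for document summarization: an off-the-shelf abstractive summarizer,
# not the model behind Google Docs.
from transformers import pipeline

summarizer = pipeline("summarization")  # downloads a default summarization model

document = (
    "Google I/O 2022 covered machine learning features across Search, Maps, "
    "YouTube, and Workspace. Google Docs is gaining automatically generated "
    "summaries that condense long documents into a short TL;DR at the top, "
    "with the same capability planned for Google Chat and Google Meet."
)

summary = summarizer(document, max_length=40, min_length=10, do_sample=False)
print(summary[0]["summary_text"])
```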

In addition to the upcoming summary features in Google Meet, a number of other machine-learning-powered enhancements are coming to the service. Image processing models will improve your video quality in any situation, touching up your face and improving lighting conditions, all with customizable parameters, so you can put out quality video without a studio lighting setup in your living room.

Google Search

Google has been making a number of enhancements to its search models, including features like voice control and Google Lens, which lets you search with a picture.
Last month they revealed Multisearch, a search method that combines several input types into one query, such as taking a picture and providing a text prompt at the same time to hone in on the results you really need. Expanding Multisearch, Google will soon support "near me" searches, letting you do things like take a picture of some tasty food whose name you've forgotten and ask for nearby restaurants that serve it.
The biggest feature to watch, however, is something they're calling Scene Exploration, an expansion to Multisearch that lets you pan your camera across a scene and instantly get insights into it.
The example they proposed was searching for the perfect chocolate bar: there are so many to choose from that finding one that fits all your requirements can get tiresome. With Scene Exploration, you can scan your phone's camera across the shelf of chocolate bars and it will overlay the information you're looking for right on your screen. It's basically a "Ctrl-F" for real life.

Google's clear aim is to make search feel more natural, making it easy to search anywhere, in any way.

Google Assistant

Google recognizes that saying "Hey Google" every time we need something is a pain, so they're working to move past the need for that keyword. A new feature on the Google Nest Hub Max lets you simply look at the device and ask it what you need.
By using six different ML models, all processed on-device, to determine intent, it can tell whether you're just glancing over or actually mean to ask it a question.
Speech processing enhancements are also coming to Google Assistant.
With improvements to the speech-understanding models that power it, Google Assistant will now be able to comprehend the "umm"s and pauses that are natural to human speech. Not only that, but the assistant can intuit your intent if you stumble or forget a word.
No longer will you have to rehearse exactly what you're going to say and awkwardly recite it; you can just speak at your own pace and the improved on-device models will understand you.


Google's Machine Learning Announcements

LaMDA 2

The follow-up to LaMDA, Google's breakthrough conversational AI, has been announced: LaMDA 2.
Alongside the announcement of LaMDA 2, Google has built an app called AI Test Kitchen, a place where interested users can test the capabilities of Google's ML models. AI Test Kitchen is launching with three experiences that let you try out the conversational power of LaMDA 2.

  • Imagine It: With a user-provided prompt such as "Imagine I'm at the deepest part of the ocean", LaMDA will intricately describe what it would be like to be in the Mariana Trench. You can continue exploring more sensations from there, with model-proposed follow-up questions.
  • Talk About It: Many conversational models veer off-topic or go down weird tangents, but LaMDA is able to keep focus and even steer back to the main topic if the user strays. In this experience, you talk to LaMDA about dogs.
  • List It: LaMDA is able to create a list of sub-tasks for a prompt describing something you'd like to do. Given the prompt "I want to plant a vegetable garden", LaMDA will generate a list of tasks that break down what you need to do to create a vegetable garden. From there, you can use LaMDA to break sub-tasks down into even more granular sub-sub-tasks (there's a rough sketch of this idea at the end of this section).
The factual content that LaMDA 2 puts out still has potential accuracy issues, so Google will be inviting feedback in the app so you can report any problems you find with LaMDA 2.
Over time, more experiences will be added to AI Test Kitchen, so keep an eye out.
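LaMDA 2 itself is only reachable through AI Test Kitchen, but the "List It" experience mentioned above is essentially task decomposition via prompting. Here's a rough, hypothetical sketch of what such a prompt could look like against any instruction-following text model; the model call is only a placeholder, not a LaMDA API.

```python
# Hypothetical "List It"-style prompt for task decomposition. LaMDA 2 has no
# public API, so the model call below is just a placeholder comment.

def build_list_it_prompt(goal: str) -> str:
    """Ask a language model to break a high-level goal into concrete sub-tasks."""
    return (
        "Break the following goal into a numbered list of concrete sub-tasks.\n"
        f"Goal: {goal}\n"
        "Sub-tasks:\n1."
    )

prompt = build_list_it_prompt("I want to plant a vegetable garden")
print(prompt)
# completion = your_text_generation_model(prompt)  # placeholder for a real model
```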

PaLM

Google has announced yet another new model: PaLM (short for Pathways Language Model), a 540-billion-parameter model and their largest yet. PaLM is able to perform a wide range of complex natural language tasks thanks to a technique called chain-of-thought prompting.
When asking NLP models questions, they are often initialized with an example question and answer, then asked further questions. By including the reasoning needed to reach the answer in the example prompt, PaLM picks up the reasoning process that goes into an answer, answers correctly more reliably, and can even show its work.
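To make that concrete, here's a small sketch of how a chain-of-thought prompt is assembled: the few-shot exemplar spells out the intermediate reasoning rather than just the final answer. The exemplar is the classic tennis-ball example from the chain-of-thought literature, not PaLM's actual interface.

```python
# Sketch of chain-of-thought prompting: the few-shot exemplar includes the
# reasoning steps, not just the final answer, so the model learns to "show its
# work" before answering.

COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls each. "
    "How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls each is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
)

def build_cot_prompt(question: str) -> str:
    """Prepend a worked example so the model imitates step-by-step reasoning."""
    return COT_EXEMPLAR + f"Q: {question}\nA:"

prompt = build_cot_prompt(
    "A cafeteria had 23 apples. They used 20 and bought 6 more. "
    "How many apples do they have?"
)
print(prompt)  # feed this to a large language model of your choice
```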

This chain-of-thought prompting boosts PaLM's solve rate on math word problems from 18% to 58%.
PaLM can do more than math word problems, though. By prompting with a question in another language (such as Bengali) and providing example answers in both that language and English, PaLM is able to answer Bengali questions in both Bengali and English, despite never having been explicitly trained on Bengali-to-English translation, or even on question answering for that matter.


Machine Learning Hub

Available to Google Cloud users, Google has announced the launch of the world's largest publicly available machine learning hub, located in Mayes County, Oklahoma. This massive hub features eight TPU v4 pods, built on the same infrastructure that powers Google's largest neural models. Together the pods provide nearly 9 exaflops of compute, bringing an immense amount of power to users and researchers on the Google Cloud platform. In addition, Google says the center is already running at 90% carbon-free energy, a step toward its goal of running all of its data centers and campuses fully carbon-free by 2030.
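That ~9 exaflops figure lines up with back-of-the-envelope arithmetic, assuming the commonly cited specs of roughly 275 teraflops (bf16) per TPU v4 chip and 4,096 chips per pod; both figures are assumptions here, not from the keynote.

```python
# Back-of-the-envelope check of the ~9 exaflops figure, using assumed TPU v4
# specs: ~275 TFLOPS (bf16) per chip and 4,096 chips per pod.
tflops_per_chip = 275e12
chips_per_pod = 4096
pods = 8

total_flops = tflops_per_chip * chips_per_pod * pods
print(f"{total_flops / 1e18:.1f} exaflops")  # ~9.0 exaflops
```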

Google Pixel & Tensor

A whole slew of information about the Google Pixel line of products was just announced, including the Pixel 6a, Pixel 7 & 7 Pro, Pixel Buds Pro, Pixel Watch, and even a Pixel Tablet.
What we machine learning enthusiasts should know is that, following the Pixel 6 and 6 Pro, the Pixel 6a features the same Google Tensor processor as its sibling phones. The Pixel 6a comes in at $449, yet maintains the same powerful ML-focused processing of the Google Tensor chip that the others enjoy.
The highlight here, I think, is the Pixel 7 and Pixel 7 Pro revealed today.
While we don't know much about them yet, we do know they will be powered by an even more powerful next-gen version of Google's Tensor chip, allowing even quicker processing of the machine learning models behind the features Google has been releasing day after day.
The announced Pixel Tablet will also be running on a Google Tensor chip.
Additionally, the just-announced Pixel Buds Pro will use machine learning to power their impressive noise cancellation and microphone noise suppression capabilities.


Google Glass Teaser & Augmented Reality

A new teaser featuring a set of AR glasses was shown at the very end of the keynote. These prototype glasses were presented as bringing the full capability of existing ML-powered features like Google Lens and Google Translate to the immersive field of view your eyes provide. With features like live translation, the glasses can effectively give subtitles to real life.

