Improving Google Assistant: ‘Look and Talk,’ more quick phrases, improved skin tone recognition, and future developments

Video: Look and Talk on Nest Hub Max | Google Assistant (YouTube)

Google Assistant has seen a wave of improvements recently. The headline item is the company’s official announcement of the software’s new “Look and Talk” feature during its Google I/O keynote. But there are other details worth mentioning, especially if you rely heavily on the Assistant in your daily activities: Google Assistant has also improved its skin tone recognition and expanded its quick phrases library.

The new Look and Talk feature is now rolling out widely to all Nest Hub Max users in the US. The idea behind it is simple: make interactions with the device more straightforward and, above all, more natural. It does this by removing the cue phrase “Hey Google” that a person would otherwise need every time they want to activate the Nest Hub Max. The feature works by coordinating several technologies Google has already built: specifically, Look and Talk uses the system’s Face Match and Voice Match capabilities to determine when to respond.
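
To make the idea concrete, here is a minimal, purely illustrative Python sketch (not Google’s actual code) of how a device might gate its response on Face Match and Voice Match agreeing on the same enrolled user. Every name, type, and threshold below is hypothetical:

```python
# Illustrative sketch only, not Google's implementation. It shows the idea of
# gating activation on Face Match and Voice Match both succeeding for the
# same enrolled user. All names and thresholds are hypothetical.
from dataclasses import dataclass

@dataclass
class MatchResult:
    user_id: str | None  # the enrolled user this signal matched, or None
    confidence: float    # model confidence, 0.0 to 1.0

def should_respond(face: MatchResult, voice: MatchResult,
                   threshold: float = 0.9) -> bool:
    """Respond only when face and voice agree on the same enrolled user."""
    return (
        face.user_id is not None
        and face.user_id == voice.user_id
        and min(face.confidence, voice.confidence) >= threshold
    )
```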

To use Look and Talk, you just need to stand within 5 feet of the Nest Hub Max, look at it, and give Google Assistant a command. “Let’s say I need to fix my leaky kitchen sink,” says Google Assistant Vice President Sissie Hsiao, explaining how Look and Talk works in the blog post. “As I walk into the room, I can just look at my Nest Hub Max and say ‘Show plumbers near me’ — without having to say ‘Hey Google’ first.”

Hsiao adds that the video of these interactions is “processed entirely on-device,” so your data is not shared with Google or any third-party apps. She also stresses that the new feature respects privacy: it is opt-in, and you can turn it off at any time. It is deactivated by default, and you need to enable it via the Google Home app. Just go to the Nest Hub Max’s device settings, then “Recognition & sharing,” then the “Face match” menu, and toggle on the setting.

“There’s a lot going on behind the scenes to recognize whether you’re actually making eye contact with your device rather than just giving it a passing glance,” notes Hsiao. “In fact, it takes six machine learning models to process more than 100 signals from both the camera and microphone — like proximity, head orientation, gaze direction, lip movement, context awareness and intent classification — all in real time.”
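
As a rough illustration of that kind of real-time decision, the sketch below fuses a handful of the signals Hsiao mentions into a single engaged-or-not call. The signal names follow her description, but the thresholds, scores, and overall structure are invented for illustration and say nothing about Google’s actual models:

```python
# Purely illustrative: fuse per-frame signals into an "engaged" decision.
# Signal names follow Hsiao's description; everything else is invented.
from dataclasses import dataclass

@dataclass
class FrameSignals:
    proximity_m: float     # estimated distance to the user, in meters
    head_yaw_deg: float    # head orientation relative to the camera
    gaze_on_device: float  # model score: is the gaze on the device?
    lip_movement: float    # model score: are the lips moving?
    intent_score: float    # model score: does the audio sound like a command?

def is_engaged(s: FrameSignals) -> bool:
    """Rough stand-in for the real-time decision to start listening."""
    within_range = s.proximity_m <= 1.5          # roughly 5 feet
    facing_device = abs(s.head_yaw_deg) < 20.0   # roughly head-on
    attending = s.gaze_on_device > 0.8           # eye contact, not a glance
    speaking = s.lip_movement > 0.5 and s.intent_score > 0.7
    return within_range and facing_device and attending and speaking
```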

Because Look and Talk works through Face Match, it is also worth noting that Google made sure it works well for a diverse range of users by including the Real Tone technology it launched last year. This allows the Nest Hub Max camera to work effectively across different skin tones. Additionally, the company promises to push things further by adopting the “Monk Skin Tone Scale” to help the machine understand images of different skin tones more accurately.
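
The Monk Skin Tone Scale is a ten-shade scale, and one simple way to use such a scale is nearest-swatch matching. The Python sketch below is purely illustrative: the reference RGB values are placeholders rather than the published swatches, and a real system would compare colors in a perceptual color space rather than raw RGB:

```python
# Illustrative only: classify a sampled skin-tone color to the nearest of
# ten reference swatches. The RGB values are hypothetical placeholders.
MONK_SWATCHES = {  # tone index -> hypothetical (R, G, B)
    1: (246, 237, 228), 2: (243, 231, 219), 3: (247, 234, 208),
    4: (234, 218, 186), 5: (215, 189, 150), 6: (160, 126, 86),
    7: (130, 92, 67), 8: (96, 65, 52), 9: (58, 49, 42), 10: (41, 36, 32),
}

def nearest_monk_tone(rgb: tuple[int, int, int]) -> int:
    """Return the tone index whose swatch is closest to `rgb`."""
    def dist2(a: tuple[int, int, int], b: tuple[int, int, int]) -> int:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(MONK_SWATCHES, key=lambda k: dist2(MONK_SWATCHES[k], rgb))
```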

Additionally, to further reduce how often users need to say the cue phrase “Hey Google,” Google is adding more quick phrases to the Nest Hub Max. These let users issue common commands without having to stare at the device’s camera or stand in front of it. Like Look and Talk, the Voice Match capability that powers quick phrases can be turned on and off.
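
Conceptually, quick phrases behave like a small allow-list of commands that can fire without the hotword. The sketch below illustrates that idea with an invented phrase set; the real feature matches speech, not typed text:

```python
# Illustrative sketch: quick phrases as an allow-list of hotword-free
# commands. The phrase set and action names are invented for illustration.
QUICK_PHRASES = {
    "set a timer for ten minutes": "timer.start",
    "turn on the lights": "lights.on",
    "stop": "media.stop",
}

def match_quick_phrase(transcript: str) -> str | None:
    """Return the action for an exact quick-phrase match, if any."""
    return QUICK_PHRASES.get(transcript.strip().lower())
```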

While these improvements make the Assistant more capable than ever, Google says it still has more plans for the software, including better speech and language models that can “understand the nuances of human speech.” The company is currently working on a custom-engineered Tensor chip to let the Assistant handle on-device machine learning tasks as quickly as possible. Once successful, Google says this promising technology will help the Assistant understand human speech better even in the presence of fillers (like “uhm” or “ahh”) and pauses.
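
One small piece of that problem can be shown in code: stripping standalone filler words from a transcript before interpreting it. Real speech models handle disfluencies far more robustly than this, and the filler list below is invented:

```python
# Illustrative only: drop filler tokens from a transcript. The filler list
# is invented; production speech models handle disfluencies natively.
import re

FILLERS = {"uhm", "um", "uh", "ahh", "ah", "er", "hmm"}

def strip_fillers(transcript: str) -> str:
    """Remove standalone filler tokens and collapse extra whitespace."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    return " ".join(t for t in tokens if t not in FILLERS)

# strip_fillers("Uhm, show me, ahh, plumbers near me")
# -> "show me plumbers near me"
```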
