Google Lens Now Provides Answers to Your Video Questions

Google is enhancing its visual search app, Lens, with a new feature that allows users to ask near-real-time questions about their surroundings through video.

English-speaking Android and iOS users with the Google app can now capture a video using Lens and inquire about interesting objects as they record.

Lou Wang, Director of Product Management for Lens, explained that the feature relies on a “customized” Gemini model that interprets the video together with the question the user asks. Gemini is the family of AI models that underpins many products across Google’s portfolio.

“For example, if you're curious about a particular fish,” Wang shared during a press briefing, “[Lens will] generate an overview detailing why it swims in a circle, along with additional resources and valuable information.”

To try out Lens' new video analysis capability, users need to enroll in Google’s Search Labs program and opt into the experimental features labeled "AI Overviews and more." By holding down the shutter button in the Google app, users can initiate the video-capturing mode of Lens.

While recording, if a user asks a question, Lens will provide an answer sourced from AI Overviews, which summarizes information gathered from the web.

Wang mentioned that Lens intelligently selects the most "interesting" frames from the video to ground the AI-generated answer, ensuring relevance to the user’s inquiry.

“All of this stems from observing how people currently interact with Lens,” Wang added. “By making it easier to ask questions and satisfying users' curiosity, we believe they’ll adopt this feature naturally.”

The introduction of video capabilities for Lens follows a similar preview by Meta last month for its Ray-Ban Meta smart glasses. Meta aims to incorporate real-time AI video functionality, enabling wearers to ask questions about their surroundings, such as “What type of flower is this?”

OpenAI has also teased a capability for its Advanced Voice Mode tool that would let it analyze video in real time and draw on that context when answering user questions.

Google appears to be first to market with this capability, albeit in an asynchronous form: users can’t converse with Lens in real time. It’s worth noting that no live demo was shown during the press briefing, and Google’s AI features have received mixed reviews in the past.

In addition to the new video analysis, Lens can now simultaneously search using images and text. English-speaking users, even those not in the Labs program, can launch the Google app, tap the shutter button to take a photo, and ask a question aloud.

Moreover, Lens is gaining new functionalities tailored for e-commerce. Starting today, when Lens on Android or iOS identifies a product, it will show detailed information such as pricing, deals, brand, reviews, and availability. This product identification feature works with both uploaded images and new photos, but not videos, and is currently limited to specific countries and select shopping categories, including electronics, toys, and beauty.

“For instance, if you spot a backpack you like,” Wang noted, “you can use Lens to identify it and instantly access any details you might want to know.”

Additionally, there is an advertising aspect to this service. The results page for products identified by Lens will display “relevant” shopping ads featuring options and prices.

Why include ads in Lens? According to Google, around 4 billion Lens searches each month are shopping-related. For a tech giant that relies heavily on advertising, this presents an invaluable opportunity.
