If you can't find what you're searching for with just an image, Google Lens now lets you record a video and use your voice to ask questions about what you're seeing. The feature is launching in Search Labs on Android and iOS, and it returns an AI Overview and search results tailored to the video's content and your spoken question.
Google first showcased video search at I/O in May. For instance, if you're at an aquarium and want to know why certain fish are swimming together, simply open the Google Lens app, capture a video of the exhibit, and ask, “Why are they swimming together?” Google Lens will then use the Gemini AI model to generate a relevant answer.
Rajan Patel, Google’s vice president of engineering, explained that the technology captures the video as a series of image frames and applies computer vision techniques to understand them. A custom Gemini model then processes the sequence of frames and produces answers grounded in information from the web.
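To make the "video as a series of frames plus a question" idea concrete, here is a minimal Python sketch of that general pattern: sample a handful of frames from a clip and bundle them with the spoken question into a single multimodal prompt. This is only an illustration under stated assumptions; OpenCV is used for decoding, and the function names (`sample_frames`, `build_prompt`), the sample file name, and the omitted model call are stand-ins, not Google's actual Lens or Gemini internals.

```python
# Illustrative sketch of the "video as a series of image frames" pattern
# described above; it is not Google's Lens pipeline or the Gemini API.
import cv2  # OpenCV, used here only to decode frames from a video file


def sample_frames(video_path: str, every_n_seconds: float = 1.0) -> list:
    """Decode a video and keep roughly one frame per `every_n_seconds`."""
    capture = cv2.VideoCapture(video_path)
    fps = capture.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if FPS is unknown
    step = max(1, int(fps * every_n_seconds))
    frames, index = [], 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % step == 0:
            frames.append(frame)  # numpy array of shape (H, W, 3), BGR order
        index += 1
    capture.release()
    return frames


def build_prompt(spoken_question: str, frames: list) -> list:
    """Bundle the transcribed question with the sampled frames.

    A multimodal model would receive this combined payload; the actual
    model call is omitted because the real interface isn't described
    in the article.
    """
    return [spoken_question, *frames]


if __name__ == "__main__":
    # "aquarium_clip.mp4" is a hypothetical file name for the aquarium example.
    frames = sample_frames("aquarium_clip.mp4")
    prompt = build_prompt("Why are they swimming together?", frames)
    print(f"Prompt holds 1 question and {len(prompt) - 1} frames")
```

Sampling roughly one frame per second keeps the payload small while still capturing the motion the question is about, which is the practical point of treating a short video as a sequence of still images.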
Sound identification in videos, such as recognizing a bird call, isn't supported yet, though Patel said Google is exploring the capability.

Google Lens is also upgrading its photo search so you can ask questions by voice: point your camera at the subject, hold down the shutter button, and ask your question aloud. Previously, you could only type a question after taking a picture. Voice questions are rolling out globally on Android and iOS, though they're available only in English for now.