Google I/O 2026 Unveils AI Model "Gemini Omni," Revolutionizing Video Generation and Development

At Google I/O 2026, "Gemini Omni" and "Gemini 3.5 Flash" were revealed, introducing groundbreaking tools for video creation and coding at unprecedented speeds. Search evolves into an intelligent agent.

May 20, 2026 7 min read Reviewed & edited by the SINGULISM Editorial Team

Google I/O 2026: AI at the Core of All Products

Today’s Google I/O conference was a showcase of AI innovation. Google DeepMind CEO Demis Hassabis opened the event by sharing growth statistics for the company’s AI platform, “Gemini.” The Gemini App has over 900 million monthly active users, and 3.2 quadrillion tokens are processed each month. The image generation model “Imagen” alone has produced over 50 billion images. These numbers indicate that AI is no longer just one element of Google’s offerings; it is now the core of every product and service.

The announcements began with updates to their models, leading into new coding tools and a revamped search service. At the heart of these developments are the next-generation multimodal model “Gemini Omni” and the ultra-fast, high-efficiency “Gemini 3.5 Flash.” These innovations promise not just improved performance but the potential to redefine video generation and transform software development workflows.

Gemini Omni: Ushering in an “Imagen Moment” for Video Generation

The highlight of the unveiling was “Gemini Omni.” DeepMind’s CEO described the model as “a next-generation system capable of creating any content from any input.” Omni combines the advanced reasoning capabilities of Gemini with the cutting-edge technologies behind Google’s video generation model “Veo,” image generation model “Imagen,” and interactive simulation model “Genie,” taking them to the next level.

The goal of Gemini Omni is not merely to generate visuals but to understand physical principles such as gravity and kinetic energy well enough to create highly realistic “world models.” During a live demo, an audience member typed a prompt like “generate a clay animation explaining protein folding,” and Omni turned the abstract scientific concept into an easy-to-understand video. The demo also showed how users can upload existing videos and edit them conversationally, for example by turning a circular object into a black hole or making a nighttime walking scene more dramatic.

What sets Omni apart is that it is not solely a video-generation model. Google positions it as a “world model,” with plans to evolve it into a system capable of producing “any output from any input.” This means the model must understand relationships between objects and ensure logical consistency within scenes. Integrated into tools like Gemini App, video creation software “Google Flow,” and YouTube Shorts, Omni is set to elevate user creativity from just image editing to comprehensive video editing.

The first model in the family, “Gemini Omni Flash,” has already been integrated into various products, and more details about the high-performance “Omni Pro” are expected soon. Omni features in the Gemini App are available to subscribers of the Google AI Plus, Pro, and Ultra plans.

Gemini 3.5 Flash and Antigravity 2.0: AI Writes Code and Builds Operating Systems

While Omni focuses on creation and editing, “Gemini 3.5 Flash” was introduced as a model specialized in speed, cost efficiency, and execution capability. Described as the first model in the Gemini 3.5 series, it emphasizes agent-driven coding, complex long-running tasks, and workflow applications.

3.5 Flash significantly outperforms the previous-generation 3.1 Pro across almost all benchmarks, particularly in code generation and real-world economic tasks. Its speed stands out: it outputs tokens four times faster than other state-of-the-art models and up to 12 times faster in optimized development environments.

Google emphasized the importance of a “feedback loop” in the model’s development. While the company processed 500 billion tokens per day in internal development tasks as of March, this figure has doubled every few weeks, now exceeding 3 trillion tokens daily. This massive volume of real-world usage data drives continuous model improvement.
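As a rough sanity check on these figures, the short Python sketch below computes how many doublings separate 500 billion and 3 trillion tokens per day, and what doubling period that implies. The article does not give the exact elapsed time, so the 7, 9, and 11 week values are assumptions chosen to bracket "March" to late May, and the function names are illustrative only.

```python
import math

# Figures quoted in the article.
TOKENS_PER_DAY_MARCH = 500e9  # 500 billion tokens per day "as of March"
TOKENS_PER_DAY_NOW = 3e12     # "exceeding 3 trillion tokens daily"


def doublings(start: float, end: float) -> float:
    """Number of doublings needed to grow from start to end."""
    return math.log2(end / start)


def implied_doubling_period_weeks(start: float, end: float, elapsed_weeks: float) -> float:
    """Doubling period implied by steady exponential growth over elapsed_weeks."""
    return elapsed_weeks / doublings(start, end)


n = doublings(TOKENS_PER_DAY_MARCH, TOKENS_PER_DAY_NOW)
print(f"growth factor: {TOKENS_PER_DAY_NOW / TOKENS_PER_DAY_MARCH:.0f}x, doublings: {n:.2f}")

# ASSUMPTION: elapsed time is not stated in the article; these values only bracket it.
for weeks in (7, 9, 11):
    period = implied_doubling_period_weeks(TOKENS_PER_DAY_MARCH, TOKENS_PER_DAY_NOW, weeks)
    print(f"{weeks} weeks elapsed -> doubling every {period:.1f} weeks")
```

A sixfold increase is about 2.6 doublings, so under these assumed time spans the implied doubling period falls between roughly 2.7 and 4.3 weeks, which is consistent with "every few weeks."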

Alongside the model, Google also announced a major upgrade to its agent-driven integrated development environment (IDE), “Antigravity,” with the release of version 2.0. The new version moves beyond the traditional IDE concept, becoming a standalone desktop application with an “agent-first” approach. Rather than relying on AI only for code completion, users can hand development tasks to multiple agents that work together to produce the outputs.

Antigravity 2.0 includes a comprehensive command-line interface (CLI) and software development kit (SDK), with native support for Gemini’s voice model. It is also integrated with services like Android, Firebase, and Google AI Studio, and is already available to users worldwide.

A compelling demo showcased the tool’s potential by having agents build a functional operating system (OS) from scratch. This task involved 93 sub-agents working in parallel over 12 hours, issuing over 15,000 model requests, and processing 2.6 billion tokens. Core OS modules, including a scheduler, memory management, and file system, were generated from an empty project.

According to Google, this complex task was previously impossible with Gemini 3.1 Pro but was completed using 3.5 Flash at a cost of less than $1,000 in API fees. The event also featured the newly created OS running a miniature train simulation program and the iconic shooting game “Doom.” Initially, the OS lacked video and keyboard drivers, but Antigravity generated and refined the necessary code to make the game operational.
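To put the demo's numbers in perspective, the sketch below derives a few averages from the figures quoted above. These are back-of-the-envelope values, not figures reported by Google: "over 15,000" requests and "less than $1,000" are bounds, the per-agent figure assumes requests were spread evenly across sub-agents, and the blended price ignores any difference between input, output, and cached tokens.

```python
# Figures quoted in the article for the OS-building demo.
SUB_AGENTS = 93
WALL_CLOCK_HOURS = 12
MODEL_REQUESTS = 15_000    # "over 15,000", so treated as a lower bound
TOKENS_PROCESSED = 2.6e9   # 2.6 billion tokens
API_COST_USD_MAX = 1_000   # "less than $1,000", so treated as an upper bound


def build_metrics() -> dict[str, float]:
    """Derive rough averages from the quoted totals."""
    return {
        # Upper bound on the average, because the request count is a lower bound.
        "tokens_per_request": TOKENS_PROCESSED / MODEL_REQUESTS,
        "requests_per_hour": MODEL_REQUESTS / WALL_CLOCK_HOURS,
        # ASSUMPTION: requests split evenly across sub-agents (not stated in the article).
        "requests_per_agent": MODEL_REQUESTS / SUB_AGENTS,
        # Upper bound on the blended price per million tokens.
        "usd_per_million_tokens": API_COST_USD_MAX / (TOKENS_PROCESSED / 1e6),
    }


for name, value in build_metrics().items():
    print(f"{name}: {value:,.2f}")
```

Under these assumptions, each request handled at most about 173,000 tokens on average, and the blended cost works out to less than about $0.39 per million tokens.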

Google noted that similar methods have already been tested to create photo editing suites, real-time messaging apps, and multi-user collaboration platforms. Tasks that used to require days of engineering work have been reduced to just hours or even less.

Gemini 3.5 Flash is now available to all users and can be accessed via Google products and APIs. The more advanced “Gemini 3.5 Pro” is currently undergoing internal testing and is set to launch publicly next month.

Search Evolves into an “Information Agent”

Following the announcements of new models and development tools, Google turned its attention to its core business: search. The company unveiled a revamped AI-powered search experience.

The “AI Mode” for search, which now boasts over 1 billion monthly active users, has seen its query volume double quarter-over-quarter since its launch. As of today, AI Mode has been upgraded to the latest Gemini 3.5 model. A new smart search box is also being rolled out, allowing users to input text, images, files, or videos and receive AI-driven suggestions in real time.

The integration of the search experience is also advancing. Users will first see AI-generated summary answers on the main search results page, then seamlessly transition into AI Mode for more in-depth questions without losing conversational context. This new search experience is being made available globally on both desktop and mobile starting today.

However, the most significant transformation lies in the introduction of “search agents.” According to Google, starting this summer, users will be able to create personalized information agents within Search to continuously track and monitor specific types of information.

For example, users can set up an agent to monitor “large biotechnology stocks with a price-to-earnings ratio below 15, positive cash flow, and low debt,” or track long-term rental listings, limited-edition sneaker collaborations, or new product arrivals in specific categories. When relevant information changes, the agent will notify the user with a comprehensive update.
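Google has not described how these agents work internally, but the stock example can be read as a filter plus a change check. The Python sketch below illustrates that idea only: the `Stock` fields, the 0.5 debt-to-equity cutoff standing in for "low debt," and the tickers are all made up for illustration, and the "large biotechnology" part of the query is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stock:
    ticker: str
    pe_ratio: float
    free_cash_flow: float  # positive means the company generates cash
    debt_to_equity: float


MAX_PE = 15.0
# ASSUMPTION: the article says only "low debt"; 0.5 is an arbitrary illustrative cutoff.
MAX_DEBT_TO_EQUITY = 0.5


def matches(stock: Stock) -> bool:
    """True if the stock satisfies the three screening criteria."""
    return (
        0 < stock.pe_ratio < MAX_PE
        and stock.free_cash_flow > 0
        and stock.debt_to_equity <= MAX_DEBT_TO_EQUITY
    )


def diff_matches(previous: set[str], snapshot: list[Stock]) -> tuple[set[str], set[str], set[str]]:
    """Return (current matches, newly matching tickers, tickers that dropped out)."""
    current = {s.ticker for s in snapshot if matches(s)}
    return current, current - previous, previous - current


# Made-up snapshots on two consecutive checks.
day1 = [Stock("AAA", 12.0, 5e8, 0.3), Stock("BBB", 22.0, 1e9, 0.2), Stock("CCC", 9.0, -2e8, 0.1)]
day2 = [Stock("AAA", 16.0, 5e8, 0.3), Stock("BBB", 14.0, 1e9, 0.2), Stock("CCC", 9.0, -2e8, 0.1)]

watchlist, added, removed = diff_matches(set(), day1)
print("day 1:", sorted(watchlist))
watchlist, added, removed = diff_matches(watchlist, day2)
print("day 2 added:", sorted(added), "removed:", sorted(removed))
```

In this sketch a notification would be triggered only when `added` or `removed` is non-empty, which mirrors the article's description of the agent notifying the user when relevant information changes.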

Additionally, Google plans to apply the agent-driven coding capabilities developed in Antigravity to Search. This will enable search to go beyond returning web links, summaries, and information, allowing it to dynamically generate interactive interfaces and visualizations to answer specific user queries. For instance, asking “How do black holes affect spacetime?” could result in an interactive simulation to aid understanding.

Conclusion: The Era of AI Agents in the Google Ecosystem

The Google I/O announcements signal a shift from AI as a standalone “model” to AI deeply embedded in every aspect of digital interaction, proactively executing tasks as an “agent.”

From Gemini Omni, which opens new horizons in video generation, to Gemini 3.5 Flash and Antigravity 2.0, which significantly boost developer productivity, and the transformation of search into a personal information agent, all these innovations are designed to integrate seamlessly into Google’s vast ecosystem—spanning search, apps, browsers, developer tools, next-generation XR glasses, and e-commerce.

The exponential growth in token processing highlighted by Demis Hassabis reflects a broader transformation: the entire Google ecosystem is now driven by AI, which in turn feeds back into the system to refine it further. This self-perpetuating evolution heralds the next stage in AI’s role in our digital lives—a stage where “AI enhances and evolves itself.”

Frequently Asked Questions

What exactly can Gemini Omni do?

Gemini Omni is a next-generation AI model capable of generating and editing high-quality videos from inputs like text and images while considering physical laws. For instance, it can automatically create educational videos explaining scientific concepts or edit existing videos in a conversational way. It will be integrated into various Google services.

Is Antigravity 2.0 useful for non-programmers?

While currently aimed at software developers, its agent capabilities could be applied to automate a wide range of tasks in the future. A demo showcased its ability to build an operating system from scratch, highlighting its potential to dramatically streamline complex project setups and code generation.

When will Google Search's "agent" feature be available?

Google plans to introduce the feature this summer, allowing users to create information agents within Search to monitor and notify them about specific information, such as stock prices, rental listings, or product availability.

Source: 爱范儿

Last updated: May 19, 2026
