
Run Gemma 4 in Your Browser: AI-Generated Flowcharts Now Possible in Excalidraw

A new technology utilizing Google's "TurboQuant" algorithm enables the large language model "Gemma 4" to run directly in browsers. When combined with Excalidraw, users can create unlimited AI-generated flowcharts without APIs or fees.



Embedding AI Models in Browsers: A Developer’s Breakthrough

The era of on-device AI models in smartphones and PCs has arrived. However, such features are typically offered as part of dedicated apps or operating-system functionality. What's drawing attention now is a more accessible approach: embedding high-performance large language models (LLMs) directly into web browsers.

According to the Chinese software information site “小众软件 (Appinn),” a developer has unveiled a method to run Google’s LLM “Gemma 4” directly within a browser by leveraging Google’s newly proposed quantization algorithm called “TurboQuant.” By combining this capability with the real-time collaborative drawing tool “Excalidraw,” users can now open a web page and have AI generate flowcharts and diagrams without needing APIs or incurring any costs.

TurboQuant: Opening New Frontiers for Browser-Based AI

At the heart of this technology is "TurboQuant," a quantization technique that represents the parameters of large neural networks with fewer bits. LLM weights typically total several gigabytes, making it impractical to load them into a browser. TurboQuant, however, dramatically reduces model size without significantly compromising accuracy.
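To make the idea concrete, here is a minimal sketch of plain 8-bit symmetric quantization. This is not the TurboQuant algorithm itself, only the basic principle it builds on: trading a little precision for a much smaller representation of each weight.

```typescript
// Illustrative 8-bit symmetric quantization, not TurboQuant itself.
// A float32 weight costs 4 bytes; its int8 counterpart costs 1 byte,
// so even this naive scheme shrinks a weight tensor roughly 4x.
function quantize(weights: number[]): { q: Int8Array; scale: number } {
  const maxAbs = Math.max(...weights.map(Math.abs), 1e-8);
  const scale = maxAbs / 127; // map [-maxAbs, maxAbs] onto [-127, 127]
  const q = Int8Array.from(weights, (w) => Math.round(w / scale));
  return { q, scale };
}

function dequantize(q: Int8Array, scale: number): number[] {
  return Array.from(q, (v) => v * scale);
}

const { q, scale } = quantize([0.12, -0.5, 0.33, 0.9]);
const restored = dequantize(q, scale);
```

Each restored weight differs from the original by at most half a quantization step (`scale / 2`), which is why accuracy degrades only slightly for well-behaved weight distributions.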

Using this algorithm, the Gemma 4 model was converted into WebAssembly (WASM) format, allowing it to run efficiently in the browser's built-in WASM runtime. For users, this means there's no need to install special software or purchase expensive API keys. An internet connection is needed only for the initial model download; once the weights are cached, inference runs entirely locally.
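A loading flow along these lines can be sketched with the standard WebAssembly JavaScript API. The model URL and the use of a raw `WebAssembly.instantiate` call are assumptions for illustration; real WASM builds usually ship with their own generated loader glue.

```typescript
// Sketch of loading a WASM-compiled model in the browser.
// The fetch URL below is hypothetical; real toolchains (Emscripten,
// wasm-bindgen, ...) generate their own loading code.
async function loadModel(bytes: BufferSource): Promise<WebAssembly.Instance> {
  if (typeof WebAssembly === "undefined") {
    throw new Error("This browser does not support WebAssembly");
  }
  const { instance } = await WebAssembly.instantiate(bytes);
  return instance;
}

// In a real page the bytes would come from a (cached) network fetch:
//   const bytes = await (await fetch("/models/gemma.wasm")).arrayBuffer();
// Here we use the smallest valid WASM module (magic number + version)
// so the sketch can run anywhere:
const emptyModule = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // "\0asm" magic number
  0x01, 0x00, 0x00, 0x00, // WASM binary version 1
]);
const instancePromise = loadModel(emptyModule);
```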

A New Workflow Emerges with Excalidraw Integration

The decision to integrate this technology with Excalidraw highlights its practical potential. Excalidraw is an open-source tool that enables users to quickly create flowcharts and wireframes with simple, hand-drawn-style graphics. By connecting the browser-based AI, users can generate diagrams instantly by providing natural language instructions like “draw a flowchart for this process.”
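The mapping from a model's output to drawable shapes can be sketched as follows. The interface below is a heavily simplified stand-in for Excalidraw's actual element format (real elements carry many more fields, such as `seed`, `strokeColor`, and `version`); it only shows the idea of turning a step list into positioned boxes.

```typescript
// Sketch: turning an LLM's step list into simplified flowchart elements.
// "Box" is a hypothetical reduced shape, not Excalidraw's real schema.
interface Box {
  id: string;
  type: "rectangle";
  x: number;
  y: number;
  width: number;
  height: number;
  label: string;
}

function stepsToFlowchart(steps: string[]): Box[] {
  return steps.map((label, i) => ({
    id: `step-${i}`,
    type: "rectangle" as const,
    x: 100,
    y: 100 + i * 120, // stack boxes vertically, 120px apart
    width: 220,
    height: 80,
    label,
  }));
}

// An LLM prompted with "draw a flowchart for this process" could emit
// the step list; the app then maps it onto drawable elements:
const boxes = stepsToFlowchart(["Receive order", "Charge card", "Ship item"]);
```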

In traditional workflows, creating AI-generated diagrams required multiple steps: 1) sending a request to a cloud-based AI via an API, 2) receiving the results, and 3) manually converting them into diagrams. With this new method, everything is completed within a single browser application. There are no API costs, minimal latency, and, most importantly, the drawing process is entirely local, providing advantages in terms of security and privacy.

The Impact on the Future of Edge AI and Web Development

This news has significant implications. Firstly, it suggests an acceleration in the “edgeification” of LLMs. Edge computing refers to processing data locally on devices or at the network’s edge rather than relying on centralized cloud data centers. Running AI models in browsers means that advanced AI functionalities can be integrated into any web application without additional infrastructure costs.

Secondly, it hints at a potential paradigm shift in web development. In the future, front-end developers may need to add skills in managing WASM-based AI models alongside their expertise in JavaScript and TypeScript. As the practice of treating AI as a “component” that can run directly in the browser becomes more common, this could significantly transform development methodologies.
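What "AI as a component" might look like in a front-end codebase can be sketched as a small typed wrapper. Everything below is hypothetical: the `LocalModel` interface and `ModelComponent` class are illustrations of the pattern, not an existing library API.

```typescript
// Hypothetical "AI as a component" wrapper: the model runtime
// (WASM, WebGPU, ...) hides behind a small typed interface so that
// UI code never touches low-level loading or inference details.
interface LocalModel {
  generate(prompt: string): Promise<string>;
}

class ModelComponent {
  constructor(private model: LocalModel) {}

  async ask(prompt: string): Promise<string> {
    const started = performance.now();
    const text = await this.model.generate(prompt);
    console.log(`inference took ${(performance.now() - started).toFixed(1)} ms`);
    return text;
  }
}

// With a stub model, the wrapper is testable without any weights:
const stub: LocalModel = { generate: async (p) => `echo: ${p}` };
const component = new ModelComponent(stub);
const replyPromise = component.ask("hello");
```

Swapping the stub for a real WASM-backed implementation would not change any of the calling code, which is exactly the kind of boundary front-end teams would maintain.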

However, challenges remain. For now, browser-based models are still limited compared to massive cloud-based models. Loading such models can take time, and performance may vary depending on the device’s capabilities. Nevertheless, advancements in hardware and algorithm optimization are expected to gradually address these issues.

What Lies Ahead?

This demonstration represents a significant milestone in the transition of AI technology from being “cloud-exclusive” to something that can be freely used locally. Google itself has been actively open-sourcing its Gemma series, and the number of optimization efforts and application use cases from the developer community is expected to grow further.

Flowchart generation is just one example. Browser-based LLMs hold limitless potential for applications such as code completion, text summarization, multilingual translation, and interactive educational tools. By eliminating the barriers of API costs and usage limits, this development takes a crucial step toward making AI accessible to everyone as an essential infrastructure.


FAQ

Q: Won’t running Gemma 4 in a browser quickly drain my computer or smartphone battery?
A: That concern is valid. Running an LLM requires significant computational power, which increases CPU and GPU usage, leading to higher power consumption. At the moment, this is a prototype intended for short-term use, and it may not be suitable for extended sessions. However, advancements in hardware and energy-efficient model designs are expected to make long-term usage more feasible in the future.

Q: Can this technology replace cloud-based AI services like ChatGPT?
A: It cannot fully replace cloud-based AI. Browser-based AI models are limited in scale and capability compared to massive cloud-hosted models, which are better suited for complex reasoning and tasks requiring vast contexts. However, browser-based AI offers unique strengths, such as zero API costs, enhanced privacy, and offline functionality, making it ideal for light to medium tasks or applications specialized in specific domains.

Q: I’d like to try this myself. How can I get started?
A: You can access the web page provided in the original article to try the technology right away. However, note that it’s currently in an experimental stage, and its performance may vary depending on your browser and device specifications. If you’re interested in development, you can also explore the open-source Gemma model and WebAssembly-related technologies to create your own applications.

Source: 小众软件 (Appinn)
