09/08/2026, RB102
flutter_gemma started as a thin wrapper over MediaPipe for running Google Gemma on Android and iOS. Today it's a full-fledged toolkit for on-device AI: it runs on 6 platforms (Android, iOS, Web, macOS, Windows, Linux) and offers 2 inference engines, multimodal support, function calling for local agents, and on-device RAG with vector search. The plugin has been featured by Google AI for Developers.
In this talk, we'll walk through the plugin's evolution via key engineering decisions. How adding the Web platform forced us to rethink file handling and led to sealed classes instead of string URLs. Why we had to spin up a gRPC server in Kotlin with a bundled JVM for desktop, and how to automate the build through Xcode build phases. How the Strategy pattern allowed us to add a second inference engine (LiteRT-LM) without rewriting existing code, and how the Adapter pattern helped reuse the MediaPipe implementation. Why we needed chunk buffering for engines with fundamentally different APIs.
We'll dive deep into on-device RAG: how to build a SQLite VectorStore that works identically on mobile and in the browser via WASM. I'll show real bugs and their fixes — from Web hot restart crashes to iOS Simulator limitations with vision models.
The latest addition is genkit_flutter_gemma — a bridge to Google's Genkit for Dart that enables hybrid AI pipelines. Now you can seamlessly combine on-device inference with cloud-based models in a single Genkit flow: run lightweight tasks locally for speed and privacy, and escalate complex reasoning to the cloud — all orchestrated through one unified pipeline.
Every architectural decision in flutter_gemma is an answer to a specific problem. Minimum theory, maximum code, diagrams, and stories of "how it broke and why it looks like this now".
Sasha is CTO at Brainform.ai with over 20 years of experience architecting scalable enterprise systems. With a strong engineering background, his expertise spans frontend, backend, cloud infrastructure, mobile development, and AI — from cloud-based generative AI to on-device solutions. He specializes in building robust, production-ready products using a variety of technologies and frameworks. Sasha has delivered solutions across fintech, digital media, and entertainment. He is a Google Developer Expert for Cloud, AI, Firebase, Flutter, and Dart, co-organizes the Flutter Berlin Community, and is a recognized international speaker and writer, having presented at 30+ conferences worldwide.