Gemma-3n-E2B-it for on‑device LLM applications
Gemma 3n was built in close collaboration with some of the biggest mobile-chip makers. It shares its architecture with the next generation of Gemini Nano, so you get capable on-device inference without ever pinging the cloud.
Gemma-3n-E2B-it is Google DeepMind's newest edge-first transformer model: a 4.46 B-parameter MatFormer with the memory footprint of a roughly 2 B model. Thanks to Per-Layer Embeddings (PLE), it runs wholly offline in as little as 2 GB of accelerator memory, and it still delivers a 32 000-token context and multimodal input (text + image + audio + video) with text output.…
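To make the memory claim concrete, here is a back-of-envelope sketch of weight storage at different precisions. It is only an estimate under stated assumptions: the 2 B "effective" count is a round figure taken from the description above, the bit widths are illustrative choices rather than the model's published quantization, and the KV cache and activations are ignored. The helper `weight_memory_gib` is a hypothetical name introduced for this example.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


TOTAL_PARAMS = 4.46e9     # total parameter count quoted in the text
EFFECTIVE_PARAMS = 2.0e9  # assumption: rough count resident in accelerator memory

# Illustrative precisions only; not the model's documented quantization scheme.
for bits in (16, 8, 4):
    total = weight_memory_gib(TOTAL_PARAMS, bits)
    effective = weight_memory_gib(EFFECTIVE_PARAMS, bits)
    print(f"{bits:>2}-bit: all weights ~{total:.2f} GiB, effective ~{effective:.2f} GiB")
```

Under these assumptions, the gap between the two columns is the saving that comes from keeping only the effective parameters in accelerator memory; real usage will be higher once context and activations are counted.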