Google has released a new AI model, Gemma 4 12B, designed to run on standard laptops with just 16GB of RAM.

The model handles complex reasoning and agentic tasks that once required larger variants, and it uses Multi-Token Prediction for faster, more efficient processing.
Gemma 4 12B natively processes text, audio, and images. Google streamlined vision processing by using a single matrix multiplication, eliminating a bulky encoder. Audio requires no encoder at all: raw signals are projected directly into text vectors.
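
To make the idea of encoder-free inputs concrete, here is a minimal, dependency-free sketch of projecting image patches and raw audio frames into a shared embedding space with one matrix multiplication each. It is not Google's implementation. The patch size, frame length, embedding width, random weights, and the helper names `patchify`, `frame_audio`, and `project` are all assumptions chosen for illustration.

```python
import random


def patchify(image, patch):
    """Split an H x W grayscale image (list of rows) into flattened patch vectors."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            patches.append([image[top + r][left + c]
                            for r in range(patch) for c in range(patch)])
    return patches


def frame_audio(samples, frame):
    """Cut a raw waveform into non-overlapping frames of `frame` samples."""
    return [samples[i:i + frame]
            for i in range(0, len(samples) - frame + 1, frame)]


def project(vectors, weight):
    """One matrix multiplication: (n x d_in) @ (d_in x d_model) -> (n x d_model)."""
    columns = list(zip(*weight))
    return [[sum(x * w for x, w in zip(vec, col)) for col in columns]
            for vec in vectors]


rng = random.Random(0)
D_MODEL = 8   # toy embedding width (assumed, far smaller than a real model)
PATCH = 4     # assumed patch side length in pixels
FRAME = 16    # assumed audio frame length in samples

# Hypothetical projection matrices; in a trained model these would be learned.
w_vision = [[rng.gauss(0, 0.02) for _ in range(D_MODEL)]
            for _ in range(PATCH * PATCH)]
w_audio = [[rng.gauss(0, 0.02) for _ in range(D_MODEL)]
           for _ in range(FRAME)]

image = [[rng.random() for _ in range(8)] for _ in range(8)]   # 8 x 8 toy image
audio = [rng.uniform(-1.0, 1.0) for _ in range(64)]            # 64 raw samples

image_tokens = project(patchify(image, PATCH), w_vision)
audio_tokens = project(frame_audio(audio, FRAME), w_audio)

print(len(image_tokens), len(image_tokens[0]))  # 4 8
print(len(audio_tokens), len(audio_tokens[0]))  # 4 8
```

In this toy setup both modalities end up as rows of the same width, so they could be placed alongside text token embeddings without a separate encoder network in front of them.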
Weighing under 18GB, the model is available on Kaggle and Hugging Face for local use.