Gemma 4 models use a training trick to slash their memory footprint

Posted on June 5, 2026 by safdargal12

TL;DR

  • Gemma 4 models are now available for download with quantization-aware training (QAT), which reduces the size and memory footprint of the models.
  • Thanks to QAT, these open-source models retain quality better than models that use post-training quantization (PTQ).
  • The Gemma 4 models optimized with QAT are available in five sizes: Gemma 4 E2B, Gemma 4 E4B, Gemma 4 12B, Gemma 4 26B A4B, and Gemma 4 31B.

Following Google’s launch of the laptop-grade Gemma 4 12B model earlier this week, the company is releasing new Gemma 4 model checkpoints with quantization-aware training. Quantization reduces the amount of memory required to run lightweight models. The standard method is post-training quantization (PTQ), which quantizes the model after training is finished but can result in weaker performance. The latest Gemma 4 versions use quantization-aware training (QAT) instead, which limits quality loss and speeds up decoding, according to Google’s blog post.
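To make the difference concrete, here is a minimal sketch of the rounding step that both approaches share. It is an illustration only, not Google's actual scheme: the function names, the symmetric round-to-nearest method, and the sample weights are all assumptions. PTQ applies a step like this once, after training. QAT applies it inside the training loop (often called "fake quantization"), so the model can adapt its weights to the rounding error.

```python
def quantize_dequantize(weights, bits=4):
    """Symmetric round-to-nearest quantization, then conversion back to floats.

    Hypothetical helper for illustration; real schemes typically work
    per channel or per block and are more elaborate.
    """
    qmax = 2 ** (bits - 1) - 1                 # e.g. 7 for signed 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Snap each weight to the nearest integer level, clamped to the valid range.
    levels = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return [level * scale for level in levels]


def mean_squared_error(original, approx):
    """Average squared difference between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(original, approx)) / len(original)


# Made-up sample weights, purely for demonstration.
weights = [0.82, -0.41, 0.07, -0.93, 0.33, 0.58]

for bits in (8, 4, 2):
    approx = quantize_dequantize(weights, bits)
    print(f"{bits}-bit rounding error (MSE): {mean_squared_error(weights, approx):.6f}")
```

The error grows as the bit width shrinks, which is the quality loss that training with quantization in the loop is meant to limit.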

Google says that incorporating quantization into the training process results in checkpoints with better performance than models refined with PTQ. The compressed models run well on phones and laptops thanks to a custom mobile-quantization schema. This involves pre-calculated settings, 2-bit compression in certain parts of the model, and compression of the vocabulary list and short-term memory. For the user, this results in a smaller model that consumes less system memory.
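As a rough illustration of why 2-bit compression saves space, the sketch below packs four 2-bit codes into a single byte. The helper names and the bit layout are assumptions made for this example; the article does not describe how Google's schema actually stores its values.

```python
def pack_2bit(values):
    """Pack integers in the range 0..3 into bytes, four values per byte."""
    if any(v < 0 or v > 3 for v in values):
        raise ValueError("each value must fit in 2 bits (0..3)")
    packed = bytearray()
    for i in range(0, len(values), 4):
        byte = 0
        for offset, v in enumerate(values[i:i + 4]):
            byte |= v << (2 * offset)          # the lowest bits hold the first value
        packed.append(byte)
    return bytes(packed)


def unpack_2bit(packed, count):
    """Recover `count` 2-bit values from bytes produced by pack_2bit."""
    values = []
    for byte in packed:
        for offset in range(4):
            values.append((byte >> (2 * offset)) & 0b11)
    return values[:count]                      # drop padding from the last byte


codes = [3, 0, 1, 2, 2, 2, 0, 1, 3]
packed = pack_2bit(codes)
print(len(codes), "values stored in", len(packed), "bytes")
```

Stored this way, each value takes a quarter of a byte instead of the two bytes a bfloat16 number needs.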

There are multiple model sizes available with QAT optimization, including Gemma 4 E2B, Gemma 4 E4B, Gemma 4 12B, Gemma 4 26B A4B, and Gemma 4 31B. The smallest versions, like the text-only Gemma 4 E2B model, require less than a gigabyte of memory to run. These small checkpoints, with their modest resource requirements, are well suited to running on phones.

Google shared the approximate memory requirements to load the new Gemma 4 models with QAT in various sizes:

[Image: The memory requirements of Gemma 4 model sizes.]
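For a back-of-the-envelope sense of how bit width drives these numbers, the memory for the weights alone is roughly the parameter count times the bits per weight, divided by eight. The sketch below assumes the number in a model's name is its parameter count and uses 16-bit and 4-bit widths purely as examples; these are not Google's published figures, and real usage also includes activations, cache, and runtime overhead.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Parameter counts are assumed from the model names; bit widths are illustrative.
for name, params in [("12B", 12e9), ("31B", 31e9)]:
    for bits in (16, 4):
        estimate = weight_memory_gb(params, bits)
        print(f"{name} at {bits}-bit: about {estimate:.1f} GB for weights")
```

Under these assumptions, going from 16-bit to 4-bit weights cuts the weight memory to a quarter.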

There are four different formats of Gemma 4 QAT models available for download: unquantized QAT checkpoints, GPT-Generated Unified Format (GGUF), mobile-optimized, and Compressed Tensors. These models preserve “similar quality to bfloat16 while dramatically reducing the memory requirements to load the model,” according to Google.

After downloading the Gemma 4 QAT model weights, users can run the checkpoints on their phones, laptops, or desktops. You can find the mobile and desktop models on Hugging Face, as well as in LM Studio.
