27B Memory Requirements 54 → 14.1GB: Google Releases Gemma 3 QAT AI Model, Runs on RTX 3090 Graphics Card

April 19th.GoogleThe company published a blog post yesterday, April 18, releasing an optimized version of Quantitative Awareness Training (QAT) Gemma 3 Model,Reduce memory requirements while maintaining high quality.

27B Memory Requirements 54 → 14.1GB: Google Releases Gemma 3 QAT AI Model, Runs on RTX 3090 Graphics Card

Google launched last month Gemma 3 open-source model that runs efficiently on a single NVIDIA H100 GPU with BFloat16 (BF16) precision.

1AI cites a blog post that describes Google's efforts to make Gemma 3's powerful performance adaptable to common hardware in response to user demand. Quantization techniques are key, dramatically reducing data storage by reducing the numerical precision of model parameters (e.g., from 16 bits in BF16 to 4 bits in int4), similar to image compression that reduces the number of colors.

Quantized by int4, Gemma 3 27B video memory requirementsSharp reduction from 54GB to 14.1GBThe Gemma 3 12B is down from 24GB to 6.6GB; the Gemma 3 1B requires only 0.5GB of video memory.

This means that users can run powerful AI models on desktops (NVIDIA RTX 3090) or laptops (NVIDIA RTX 4060 Laptop GPU), and even phones can support smaller models.

To avoid performance degradation due to quantization, Google employs quantization-aware training (QAT) techniques, which simulate low-precision operations during training to ensure that the model maintains a high level of accuracy even after compression.The Gemma 3 QAT model reduces the perplexity degradation by 541 TP3T in about 5000 training steps.

Major platforms such as Ollama, LM Studio, and llama.cpp have already integrated the model, and users can get the official int4 and Q4_0 models through Hugging Face and Kaggle to easily run on Apple Silicon or CPU. In addition, the Gemmaverse community offers more quantization options to meet different needs.

statement:The content of the source of public various media platforms, if the inclusion of the content violates your rights and interests, please contact the mailbox, this site will be the first time to deal with.
Information

Microsoft pushes AI interoperability with release of two major MCP servers

2025-4-19 12:00:11

Information

AI Race Under Pressure: Meta Exposed to Funding Gap, Asks Microsoft, Amazon for Help

2025-4-19 12:02:10

Search