Tag: multimodal (4 articles)

Gemma 4 12B: Google's Encoder-Free Multimodal Laptop Model

Google released Gemma 4 12B on June 3, 2026, a multimodal open model with an encoder-free architecture that feeds vision and audio directly into the LLM backbone. It runs locally on 16GB of memory, approaches the 26B MoE on benchmarks, uses Multi-Token Prediction drafters for low latency, and ships under Apache 2.0 with broad tooling support.

By Sarah Chen · 5 min · Jun 9, 2026

Open Source

Nemotron 3 Nano Omni: NVIDIA's 30B Open Model Sees and Hears

NVIDIA's Nemotron 3 Nano Omni is a 30B open multimodal model that handles visual and audio input and is built for high throughput.

By Marcus Rivera · 6 min · Apr 29, 2026

Open Source

Mistral Small 4: One Open-Source Model Replaces Three Separate AI Products

Mistral Small 4 folds three previously separate AI products into a single open-source model.

By Marcus Rivera · 4 min · Mar 30, 2026

Open Source

Qwen 3.5 Small: Alibaba's 9B Model That Beats GPT-OSS-120B

Alibaba's Qwen 3.5 Small, a 9B multimodal model, beats GPT-OSS-120B, a model roughly 13x its size.

By Sarah Chen · 5 min · Mar 29, 2026