Tag

reasoning-models

5 articles

Test-Time Compute: Why Reasoning Models Think Before Answering

Test-time compute spends extra computation during inference, not training, to improve answers. It powers reasoning models like OpenAI o1 and DeepSeek-R1. Two strategies exist: sequential scaling (longer chains of thought, e.g. the s1 paper's budget forcing) and parallel scaling (Best-of-N, majority voting). More thinking is not always better: overthinking can degrade accuracy, and hidden reasoning tokens are still billed. Match compute to task difficulty.

By Aisha Patel · 8 min · Jun 17, 2026

Deep Dives

Diffusion LLMs: How Text Diffusion Is Challenging Autoregression

Diffusion language models (dLLMs) abandon left-to-right autoregressive generation, instead refining masked noise into text over a few parallel denoising steps. Inception Labs' Mercury Coder runs at 1,100+ tokens per second on H100s versus 50-200 for autoregressive models, and LLaDA 8B's bidirectional design breaks the reversal curse. They still trail the best models on hard reasoning benchmarks, but the one-token-at-a-time assumption is no longer a law of nature.

By Aisha Patel · 8 min · Jun 12, 2026

Deep Dives

ZAYA1-8B: Zyphra's 760M-Active MoE Trained on AMD

Zyphra's ZAYA1-8B is a mixture-of-experts (MoE) model trained on AMD that performs strongly while activating only 760M parameters.

By Aisha Patel · 6 min · May 24, 2026

Open Source

Trinity-Large-Thinking: 400B U.S.-Made Open Reasoning Model

Trinity-Large-Thinking is Arcee AI's 400B open-weights reasoning model, offering powerful, cost-effective agent tuning.

By Aisha Patel · 7 min · Apr 30, 2026

AI News

DeepSeek V4 Pro: 1.6T Open-Weights Model Hits #2 on the Index

DeepSeek V4 Pro is a 1.6T open-weights model for agents that ranks #2 on the Index, but it has a high hallucination rate.

By Sarah Chen · 5 min · Apr 29, 2026