Zyphra's new 8.4B-parameter MoE activates only 760M parameters per token, was trained entirely on AMD MI300X GPUs, and beats Claude 4.5 Sonnet on the HMMT'25 math benchmark.
By Aisha Patel · 6 min · May 24, 2026