Tag
swe-bench
2 articles

Mistral Medium 3.5: 128B Open-Weight Model That Opens PRs
Mistral Medium 3.5 is a 128B dense open-weight model with a 256k context window and a 77.6% score on SWE-bench Verified, plus remote agents in Vibe that open pull requests on GitHub.
By Sarah Chen · 7 min · May 4, 2026

Claude Opus 4.7: Anthropic's New Flagship Clears SWE-Bench Pro
Anthropic's Opus 4.7 hits 64.3% on SWE-bench Pro, adds an xhigh effort level, and ships with 3x sharper vision. But the new tokenizer quietly shifts your bill, and Mythos still sits in the drawer.
By Sarah Chen · 6 min · Apr 19, 2026