Tag
swe-bench
3 articles

Kimi K2.6: Moonshot's Open-Weights Model Beats GPT-5.4 on SWE-Bench Pro
Moonshot's Kimi K2.6, an open-weights model, surpasses GPT-5.4 on SWE-Bench Pro.
By Sarah Chen · 6 min · May 12, 2026

Mistral Medium 3.5: 128B Open-Weight Model That Opens PRs
Mistral Medium 3.5 is a 128B open-weight model that can open GitHub pull requests.
By Sarah Chen · 7 min · May 4, 2026

Claude Opus 4.7: Anthropic's New Flagship Clears SWE-Bench Pro
Anthropic's Claude Opus 4.7 excels on SWE-Bench Pro with enhanced vision and new features.
By Sarah Chen · 6 min · Apr 19, 2026