Tag

swe-bench

3 articles

Moonshot's Kimi K2.6, an open-weight model, surpasses GPT-5.4 on SWE-bench Pro.

By Sarah Chen · 6 min · May 12, 2026

Mistral Medium 3.5 is a 128B open-weight model that can open GitHub pull requests.

By Sarah Chen · 7 min · May 4, 2026

Anthropic's Claude Opus 4.7 excels on SWE-bench Pro with enhanced vision and new features.

By Sarah Chen · 6 min · Apr 19, 2026