Meta Muse Spark: Superintelligence Labs' First Model Reviewed

Meta just drew a line under the Llama era. Muse Spark, the first model out of Meta Superintelligence Labs, launched on April 8, 2026 — and it signals a fundamentally different approach to how the company builds and ships AI.

This is not a Llama successor. It is not open-weight. And it is, by several measures, the most capable free frontier model available today.

The Backstory Matters

Muse Spark exists because Mark Zuckerberg was reportedly unhappy with the progress of Meta's AI efforts. The response was dramatic: Meta invested $14.3 billion in Scale AI for a 49% stake, recruited Scale's co-founder Alexandr Wang to lead a new division called Meta Superintelligence Labs, and gave him a mandate to start from scratch.

Wang, who became the world's youngest self-made billionaire at 24, stepped down as Scale AI CEO but remained on the board. Nine months later, Muse Spark — internally codenamed Avocado — is the result.

What Muse Spark Can Do

The model accepts text, voice, and image inputs but produces text-only output. It runs in two modes today, with a third coming:

Instant mode — fast answers for straightforward questions
Thinking mode — complex reasoning with multiple subagents working in parallel
Contemplating mode — coming soon — extended reasoning comparable to Gemini Deep Think

Thinking mode is where things get interesting. Meta describes it as deploying multiple subagents simultaneously — for example, when planning a trip, the system can draft itineraries, compare destinations, and identify activities in parallel rather than sequentially.
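Meta has not published how Thinking mode schedules its subagents, so the following is only a minimal sketch of the general fan-out pattern the trip-planning example describes. The `run_subagent` function, its simulated delays, and the task names are hypothetical placeholders, not part of any Meta API.

```python
import asyncio

# Hypothetical stand-in for a subagent; a real system would call a model here.
async def run_subagent(task: str, delay: float) -> str:
    await asyncio.sleep(delay)  # simulate independent work
    return f"{task}: done"

async def plan_trip() -> list[str]:
    # The three subtasks from the trip-planning example, started together
    # rather than one after another.
    tasks = [
        run_subagent("draft itineraries", 0.03),
        run_subagent("compare destinations", 0.01),
        run_subagent("identify activities", 0.02),
    ]
    # gather() returns results in the order the tasks were passed in,
    # regardless of which one finishes first.
    return await asyncio.gather(*tasks)

results = asyncio.run(plan_trip())
for line in results:
    print(line)
```

Because the three coroutines wait concurrently, total wall-clock time in this sketch is close to the slowest subtask rather than the sum of all three, which is the practical appeal of running subagents in parallel.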

The Benchmarks Tell a Nuanced Story

Muse Spark scores 52 on the Artificial Analysis Intelligence Index v4.0, placing it in the top 5 behind GPT-5.4 (57), Gemini 3.1 Pro (57), and Claude Opus 4.6 (53). For a free model, that is remarkable.

But the headline numbers hide where Muse Spark truly excels — and where it falls short.

Where it leads or comes close:

Benchmark	Muse Spark	GPT-5.4	Claude Opus 4.6	Gemini 3.1 Pro
HealthBench Hard	42.8%	40.1%	14.8%	20.6%
MMMU-Pro (vision)	80.5%	—	—	82.4%

The HealthBench result is striking. Muse Spark outperforms GPT-5.4 on medical reasoning by 2.7 percentage points and nearly triples Claude Opus 4.6's score. Meta says the model was developed with physician input, and the numbers suggest that investment paid off.

On multimodal perception via MMMU-Pro, Muse Spark ranks second among the models with reported scores here, trailing Gemini 3.1 Pro by 1.9 percentage points.
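The comparisons in this section are simple arithmetic on the table above, and a short script makes them easy to check. The dictionary names below are illustrative; the scores are copied from the table as reported.

```python
# Scores as reported in the table above (percent).
healthbench_hard = {
    "Muse Spark": 42.8,
    "GPT-5.4": 40.1,
    "Claude Opus 4.6": 14.8,
    "Gemini 3.1 Pro": 20.6,
}
mmmu_pro = {"Muse Spark": 80.5, "Gemini 3.1 Pro": 82.4}

# Ratio and point differences behind the claims in the text.
ratio_vs_opus = healthbench_hard["Muse Spark"] / healthbench_hard["Claude Opus 4.6"]
lead_over_gpt = healthbench_hard["Muse Spark"] - healthbench_hard["GPT-5.4"]
gap_to_gemini = mmmu_pro["Gemini 3.1 Pro"] - mmmu_pro["Muse Spark"]

print(f"HealthBench Hard, Muse Spark vs Claude Opus 4.6: {ratio_vs_opus:.2f}x")
print(f"HealthBench Hard, lead over GPT-5.4: {lead_over_gpt:.1f} points")
print(f"MMMU-Pro, gap to Gemini 3.1 Pro: {gap_to_gemini:.1f} points")
```

On these figures the HealthBench Hard ratio against Claude Opus 4.6 comes out just under three, and the MMMU-Pro gap to Gemini 3.1 Pro is 1.9 points.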

Where it trails:

Benchmark	Muse Spark	GPT-5.4	Claude Sonnet 4.6
Terminal-Bench (coding)	59.0	75.1	—
GDPval-AA (agentic)	1,427	1,676	1,648

Coding and agentic tasks remain weak spots. Meta acknowledges gaps in "long-horizon agentic systems and coding workflows." If you need a model for software engineering, Muse Spark is not your best option today.

Token Efficiency Is the Sleeper Advantage

One number jumped out of the Artificial Analysis evaluation: Muse Spark used just 58 million output tokens to complete the full Intelligence Index benchmark suite. For comparison, Claude Opus 4.6 used 157 million tokens and GPT-5.4 used 120 million.

Fewer tokens for comparable performance means faster responses and lower computational cost at scale. For Meta, which will deploy this across WhatsApp, Instagram, Facebook, Messenger, and AI glasses, that efficiency is not academic — it is the difference between viable and prohibitively expensive.
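To put the token counts in proportion, here is a small sketch that compares each model's output-token usage with Muse Spark's, using the figures quoted above and the Intelligence Index scores reported earlier in this article. The "index points per million tokens" ratio is an illustrative back-of-the-envelope figure, not a metric published by Artificial Analysis.

```python
# Output tokens (millions) used to complete the Intelligence Index suite,
# and Intelligence Index v4.0 scores, both as quoted in this article.
output_tokens_millions = {"Muse Spark": 58, "GPT-5.4": 120, "Claude Opus 4.6": 157}
index_score = {"Muse Spark": 52, "GPT-5.4": 57, "Claude Opus 4.6": 53}

baseline = output_tokens_millions["Muse Spark"]

# Token usage relative to Muse Spark (1.0 means the same number of tokens).
relative_usage = {
    model: tokens / baseline for model, tokens in output_tokens_millions.items()
}

# Illustrative only: index points earned per million output tokens.
points_per_million = {
    model: index_score[model] / output_tokens_millions[model]
    for model in output_tokens_millions
}

for model in output_tokens_millions:
    print(
        f"{model}: {relative_usage[model]:.2f}x Muse Spark's tokens, "
        f"{points_per_million[model]:.2f} index points per million tokens"
    )
```

By this rough measure GPT-5.4 used about 2.1 times as many output tokens as Muse Spark and Claude Opus 4.6 about 2.7 times as many, for index scores within five points of each other.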

16 Built-In Tools

Muse Spark ships with a surprisingly complete set of 16 tools, as documented by developer Simon Willison. They fall into these categories:

Search and browse — web search, page opening, and content finding
Meta content search — semantic search across Instagram, Threads, and Facebook
Image generation — artistic and realistic modes in multiple aspect ratios
Python execution — sandboxed Python 3.9 with pandas, NumPy, matplotlib, scikit-learn, and more
Visual grounding — object detection with bounding box, point, and count formats
Subagent spawning — the model can create specialized sub-agents for complex tasks
Third-party integrations — Google and Outlook calendars, Gmail

The Meta content search is unique to this model. No competitor can semantically search your Instagram and Facebook history as part of a conversation.

The Open-Source Question

Here is the elephant in the room: Muse Spark is not open-weight. After years of positioning Llama as the open-source alternative to GPT and Claude, Meta has released a proprietary model. API access is currently limited to a private preview for select partners.

Wang has indicated that future versions may be open-sourced. But for now, the only way to use Muse Spark is through Meta's own apps and the meta.ai website, which requires a Facebook or Instagram login.

This is a strategic pivot. Meta is no longer trying to win the open-source AI race. It is trying to win the consumer AI race — and it is willing to keep its best model closed to do it.

The Bottom Line

Muse Spark is a strong first showing from Meta Superintelligence Labs. The medical reasoning results are best-in-class, the vision capabilities are near the top, and the token efficiency suggests a model built with deployment scale in mind. The weaknesses in coding and agentic tasks are real but unsurprising for a consumer-focused model.

The bigger story is what Muse Spark represents: Meta betting that the next phase of AI competition is not about open weights or developer mindshare, but about embedding intelligence so deeply into consumer products that switching costs become insurmountable. Whether that bet pays off depends on whether Contemplating mode and future Muse models can close the gap on the benchmarks where Spark still trails.
