OpenAI Agents SDK: Sandboxes Land for Long-Horizon Agents

Sarah Chen
Apr 20, 2026

OpenAI's Agents SDK just grew up. On April 15, 2026, the company shipped a major update that pulls agent execution out of the "hope for the best" era and into something an enterprise legal team is actually willing to sign off on. The headline features — a native sandbox layer and a model-native harness — aren't glamorous, but they're what turn a demo agent into a system that can run for hours without a human babysitter.

If you've built anything with the previous Agents SDK, you know the gap. The SDK gave you a loop, tool calls, and a reasonable abstraction over the model. It didn't tell you where the code should run. Developers bolted on Docker, e2b, or a bespoke VM stack. Credentials leaked into the same environment as the model's tool output. State disappeared when a container died. The new release tackles every one of those pain points head-on.

The sandbox problem, solved at the SDK layer

The update bundles native support for seven sandbox providers out of the box: Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel. Pick the one that fits your stack, or bring your own. Either way, the SDK treats the sandbox as a first-class citizen, not an afterthought.

What makes this portable is the new Manifest abstraction. Instead of writing provider-specific glue, you describe the agent's workspace once — mount points, output directories, storage connections — and the SDK handles the rest. Storage providers supported at launch:

  • AWS S3 (object storage)
  • Google Cloud Storage (object storage)
  • Azure Blob Storage (object storage)
  • Cloudflare R2 (object storage)
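To make the idea concrete, here's a minimal sketch of what a manifest-style workspace description could look like. This is an illustrative model, not the SDK's actual API — the class names, field names, and environment-variable scheme are all assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class StorageMount:
    # Hypothetical: where durable storage appears inside the sandbox.
    provider: str    # e.g. "s3", "gcs", "azure-blob", "r2"
    bucket: str
    mount_path: str  # path visible to model-generated code

@dataclass
class WorkspaceManifest:
    # Hypothetical manifest: describe the workspace once,
    # independent of which sandbox provider executes it.
    working_dir: str = "/workspace"
    output_dir: str = "/workspace/out"
    mounts: list[StorageMount] = field(default_factory=list)

    def sandbox_env(self) -> dict[str, str]:
        # Only paths reach the sandbox. Storage credentials stay with
        # the harness, outside the environment where model-generated
        # code actually runs.
        env = {"WORKSPACE": self.working_dir, "OUTPUT_DIR": self.output_dir}
        for m in self.mounts:
            env[f"MOUNT_{m.provider.upper().replace('-', '_')}"] = m.mount_path
        return env

manifest = WorkspaceManifest(
    mounts=[StorageMount(provider="s3", bucket="agent-artifacts", mount_path="/mnt/s3")]
)
print(manifest.sandbox_env())
```

The design point the sketch illustrates is the separation: the manifest is the only thing the sandbox sees, so swapping E2B for Modal or Daytona changes nothing in the agent's workspace description.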

The design assumption is important: OpenAI built this expecting prompt-injection and data-exfiltration attempts. By separating the harness from the compute layer, credentials stay out of the environment where model-generated code actually runs. That's the kind of architectural decision you can defend to a security review.

The harness, explained

In agent lingo, the harness is everything around the model — memory, orchestration, tool routing, filesystem access. OpenAI's new in-distribution harness aligns execution with the way their frontier models were trained to operate, which is a quiet but meaningful shift.

The primitives bundled in:

  • Tool use via MCP (Anthropic's Model Context Protocol, now a de facto industry standard)
  • Progressive disclosure via skills
  • Custom instructions via AGENTS.md
  • Code execution via the shell tool
  • File edits via the apply_patch tool
  • Codex-like filesystem tools for navigating project structures
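AGENTS.md is the simplest of these to picture: a plain markdown file at the repo root that the harness reads for project-specific instructions. The contents below are purely illustrative:

```markdown
# AGENTS.md

## Project conventions
- Run tests with `pytest -q` before proposing a patch.
- All application code lives under `src/`; do not edit `vendor/`.

## Safety
- Never write credentials to disk; secrets come from the environment.
```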

Notice the names. MCP. AGENTS.md. Both are part of the Linux Foundation's Agentic AI Foundation, which OpenAI co-founded with Anthropic and Block in December 2025. The Agents SDK is quietly becoming a reference implementation for that stack.

Long-horizon tasks and durable state

"This launch, at its core, is about taking our existing agents SDK and making it so it's compatible with all of these sandbox providers." — Karan Sharma, OpenAI product team, speaking to TechCrunch.

The long-horizon piece is where this gets interesting. If a sandbox container dies — and at scale, it will — the SDK snapshots and rehydrates agent state in a fresh container and resumes from the last checkpoint. Multi-sandbox runs are supported too, so subagents can fan out to isolated environments and execute in parallel.

This is how you build an agent that does a four-hour legal review without losing its place.
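The checkpoint pattern itself is straightforward to sketch. The snippet below is a generic illustration of snapshot-and-resume, not the SDK's API — the function names, state shape, and checkpoint location are assumptions:

```python
import json
import tempfile
from pathlib import Path

# Stand-in for durable storage outside the sandbox (e.g. an object store).
CHECKPOINT = Path(tempfile.gettempdir()) / "agent_checkpoint.json"

def save_checkpoint(state: dict) -> None:
    # Snapshot agent state so a dead container doesn't take the run with it.
    CHECKPOINT.write_text(json.dumps(state))

def load_checkpoint() -> dict:
    # Rehydrate in a fresh container: resume from the last completed
    # step instead of starting over.
    if CHECKPOINT.exists():
        return json.loads(CHECKPOINT.read_text())
    return {"step": 0, "results": []}

def run_agent(steps: list[str]) -> dict:
    state = load_checkpoint()
    for i in range(state["step"], len(steps)):
        # Placeholder for a real tool call or model turn.
        state["results"].append(f"done: {steps[i]}")
        state["step"] = i + 1
        save_checkpoint(state)  # checkpoint after every completed step
    return state

CHECKPOINT.unlink(missing_ok=True)
final = run_agent(["review contract", "draft memo"])
print(final["step"])  # a restarted run would resume here, not at step 0
```

Subagent fan-out composes with the same idea: each isolated sandbox keeps its own checkpoint, so any one of them can die and rehydrate independently.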

LexisNexis is already using it

OpenAI dropped a real customer quote, which is rarer than it should be for launch posts:

"The OpenAI Agents SDK has enabled complex legal drafting and workflows by providing a unified framework with built-in safeguards and secure, isolated environments for data processing and code execution." — Min Chen, Chief AI Officer, LexisNexis

Legal drafting is exactly the kind of workload where you need both durability and isolation. The fact that LexisNexis was comfortable going on the record tells you the security story has teeth.

What's missing today

Two things worth flagging. First, the sandbox and harness ship Python-first. TypeScript support is coming but not dated. If your agent stack lives in Node, you're waiting. Second, two features OpenAI is advertising — code mode and subagents — are labeled "in development" for both languages. Treat the roadmap as a roadmap.

The update is generally available through the API at standard token-and-tool pricing. No new SKU, no waitlist.

The bigger picture

Agentic AI has been the tech industry's favorite talking point for a year. The gap between conference-stage demos and production deployments has stayed stubbornly wide, mostly because nobody wanted to own the execution layer. OpenAI just did. By making the sandbox and harness an SDK-level concern, they've removed the most common reason enterprises stall out between prototype and rollout.

Expect Anthropic and Google to ship comparable primitives within the quarter. The race is no longer who can demo the best agent — it's who can make agents safe to run.

The Bottom Line

The April 2026 Agents SDK update is the most enterprise-pragmatic release OpenAI has shipped in a year. Sandboxing is now a configuration detail, not an engineering project. Long-horizon agents have durable state. Seven sandbox providers are plug-and-play. If you've been waiting for a responsible way to put an autonomous agent in front of real data, the waiting is mostly over — as long as you're on Python.