Deep Dives 8 min read advanced

Agentjacking: Fake Sentry Errors Hijack Your AI Coding Agent

Agentjacking injects fake Sentry errors that AI coding agents read over MCP as trusted guidance, then execute - hitting an 85% success rate across 2,388 exposed orgs.

Aisha Patel

Jun 29, 2026

Agentjacking: Fake Sentry Errors Hijack Your AI Coding Agent

Your AI coding agent trusts its tools more than it trusts you. That's the uncomfortable truth at the center of Agentjacking — a new attack class disclosed by Tenet Security in June 2026 that turns the most mundane developer workflow imaginable, "fix the unresolved Sentry issues," into remote code execution on your own machine.

No phishing. No compromised server. No malware that any scanner could catch. Just a crafted error report and an agent that can't tell the difference between a real crash and a planted instruction.

The one-sentence version

An attacker sends a fake error event to your Sentry project. Your AI coding agent — Claude Code, Cursor, Codex, whichever — pulls that error in through the Model Context Protocol (MCP), reads the attacker's text as a legitimate "resolution step," and runs it with your privileges, on your laptop.

"The attacker never touches the victim's infrastructure. The malicious instruction arrives disguised as a legitimate 'Resolution' inside an ordinary error." — Tenet Security researchers Ron Bobrov, Barak Sternberg, and Nevo Poran

That's the whole trick. And it works 85% of the time.

Why this is an architecture flaw, not a bug

Most vulnerabilities are mistakes — an unescaped string, a missing bounds check. Agentjacking is different. Nobody wrote bad code. The exploit lives in the seam between two systems that were each behaving exactly as designed.

The flaw sits at the intersection of two facts:

Sentry's event ingestion accepts arbitrary payloads from anyone holding the DSN. That's not a flaw in Sentry — it's the entire point. Error trackers must accept whatever your crashing app throws at them, from any client, without authentication friction. If ingestion were locked down, it couldn't catch the errors you actually care about.
The Sentry MCP server returns that data to the AI agent as trusted system output. Also by design. MCP exists to feed agents authoritative context from real tools.

Individually, both are reasonable. Composed together, they create a pipeline where attacker-controlled input is delivered to an autonomous agent wearing the costume of trusted infrastructure. The agent has no way to know the error it's reading was injected rather than generated.

This is the defining shape of agentic-era security problems: the danger isn't in any single component, it's in the trust that flows between them.

The DSN is the skeleton key

To pull this off, an attacker needs one thing: your Sentry DSN (Data Source Name).

Here's why that's so dangerous. A DSN is a public, write-only credential. It's meant to be embedded in client-side code — it ships inside the JavaScript bundle of essentially every web app that uses Sentry for front-end error tracking. You can find DSNs by:

Viewing source on a production website
Searching public GitHub repositories
Scanning any client bundle that initializes Sentry

It was never treated as a secret, because in the old threat model it wasn't one. A write-only DSN only lets you submit errors. So what if an attacker can submit a fake error? Worst case, you get some noise in your dashboard.

That assumption held right up until an AI agent started reading those errors and acting on them.

Tenet's scan found at least 2,388 organizations with valid, injectable DSNs exposed in public code. Every one of them published the key to this attack in their own front-end, for free, years ago.

The attack chain, step by step

The full kill chain is short — which is exactly what makes it dangerous. Reproduced from Tenet's disclosure:

Harvest the DSN. The attacker finds the target's Sentry DSN — embedded in a website, leaked in a repo, whatever. No credentials to crack.
Inject the event. Using only that DSN and any HTTP client, the attacker sends a malicious error event to Sentry's ingest endpoint via a standard POST request.
Disguise the payload. The injected event contains carefully formatted markdown in the message field and context key names. When the Sentry MCP server returns this event, it renders as structured content visually identical to Sentry's own system template. The malicious instruction looks like an official "Resolution" section.
Wait for the trigger. A developer — entirely innocently — asks their agent to "fix the unresolved Sentry issues." The agent queries Sentry over MCP and receives the poisoned event alongside the real ones.
Execution. The agent reads the attacker's embedded command as trusted diagnostic guidance and runs it — with the developer's full system privileges, on the developer's own machine.

The genius and the horror are the same: every single action in that chain is authorized. The DSN access is legitimate. The error submission is legitimate. The MCP fetch is legitimate. The command execution is the agent doing its literal job. There is nothing anomalous for a scanner to flag.

This is the part that should keep platform teams up at night. Agentjacking sails straight past the controls enterprises spent the last decade buying:

EDR sees a developer's own trusted process (the agent) running a command the developer initiated. Nothing to alert on.
WAF and Cloudflare never see a malicious request — the attacker talks to Sentry, not to you.
IAM and VPN are irrelevant; no unauthorized access ever occurs.
Firewalls have nothing to block, because the inbound channel is Sentry's legitimate ingest API.

As Tenet put it bluntly:

"The attack bypasses EDR, WAF, IAM, VPN, Cloudflare, and firewalls — because there is nothing malicious to detect. Every action in the chain is authorized."

What can an attacker actually grab once the command runs? Plenty, and all of it quietly: environment variables, Git credentials, private repository URLs, and developer identities — the keys to a developer's entire working life, exfiltrated without a single phishing email or server breach.

The vendor response: "technically not defensible"

Here's where Agentjacking stops being a technical story and becomes a governance one.

When Tenet disclosed the issue, Sentry acknowledged it but declined to pursue a root-cause fix, reportedly describing the problem as "technically not defensible" at the platform level. The mitigation Sentry did ship was a global content filter that blocks one specific payload string — the exact proof-of-concept Tenet demonstrated.

Filtering a known string stops the published demo. It does nothing about the architectural pathway that allows injection in the first place. Any attacker who varies the markdown — and markdown offers near-infinite variation — walks right around a string filter.

It's hard to be too harsh on Sentry here, and that's the deeper point. They're not wrong that they can't fully fix it. Sentry's job is to ingest arbitrary error data; it cannot reliably know which of those errors will later be fed to an autonomous agent that treats text as commands. The trust assumption that broke wasn't Sentry's to enforce. It belongs to whoever decided the agent should execute tool output unquestioned.

That's the accountability gap agentic AI keeps opening: a vulnerability that emerges between vendors, where each party can credibly say "not my layer."

What defenders should actually do

There's no single patch, but the exposure is very reducible. Treat MCP-connected tool output as untrusted input, the same way you'd treat a user-submitted form field. Concrete moves:

Rotate and scope your DSNs. Stop treating the DSN as harmless. Where possible, separate the client-side ingest DSN from anything that feeds your agent workflows, and rotate exposed ones.
Put a human in the execution loop. Configure coding agents to require confirmation before executing shell commands, especially commands derived from external tool output. Auto-run is the detonator here.
Sandbox the agent. Run coding agents in containers or VMs without standing access to production credentials, Git tokens, or environment secrets. If the agent can't read the secret, the exfil step fails.
Scope MCP server permissions. Grant each MCP connection the minimum it needs. An agent that only needs to read issue titles shouldn't be receiving fully rendered, markdown-laden event bodies.
Treat tool output as data, not instructions. This is the architectural fix the whole industry needs: agents must stop collapsing the boundary between content to analyze and commands to follow.

The bigger pattern

Agentjacking is not an isolated curiosity. It landed in the same month as AutoJack, a sibling attack where a single malicious web page hijacks an AI agent for host code execution, and a wave of disclosures about poisoned agent skills and prompt-injection malware. These are not coincidences. They are the early returns on a structural bet the industry made: give agents real tools and real autonomy, and trust the tool channel implicitly.

The attack surface has moved. It's no longer just your servers or your endpoints — it's the agent itself, and the trust it extends to everything it reads.

The Bottom Line

Agentjacking proves that in the agentic era, your most dangerous input isn't the user — it's the trusted tool feeding your AI. The fix isn't a content filter on one vendor's platform; it's a fundamental rule that agents must treat tool output as untrusted data and never auto-execute commands derived from it. Until that principle is baked into how agents are built, expect Agentjacking to be the first of many attacks that turn a developer's own assistant against them — using nothing but the data those developers already published about themselves.

ai-agents cybersecurity ai-coding-agents mcp ai-safety

More in Deep Dives

Deep Dives

LLM Quantization: GGUF vs AWQ vs GPTQ in 2026

A practical breakdown of the three dominant LLM quantization formats in 2026. GGUF is the portable, CPU-friendly default (use Q4_K_M); AWQ wins on 4-bit quality for GPU serving via activation-aware precision; GPTQ remains a solid NVIDIA-focused option. Quantization is lossy, so test on your real workload.

By Aisha Patel · 7 min · Jun 25, 2026

Deep Dives

KV Cache: The Memory Trick Behind Fast LLM Inference

A deep dive into the KV cache in LLM inference: why autoregressive decoding needs it, how it dominates GPU memory, the 60-80% waste of contiguous allocation, and how vLLM's PagedAttention fixed it.

By Aisha Patel · 9 min · Jun 22, 2026

Deep Dives

Model Collapse: Why AI Trained on AI Slowly Falls Apart

Model collapse is the progressive degradation of generative models trained recursively on synthetic data, documented in Nature (Shumailov et al., 2024). Errors compound and rare data vanishes, but research (Gerstgrasser et al., 2024) shows accumulating real data alongside synthetic data, tracking ratios, and verifying generations prevents it.

By Aisha Patel · 8 min · Jun 19, 2026

Agentjacking: Fake Sentry Errors Hijack Your AI Coding Agent

Agentjacking: Fake Sentry Errors Hijack Your AI Coding Agent

The one-sentence version

Why this is an architecture flaw, not a bug

The DSN is the skeleton key

The attack chain, step by step

Why your security stack is blind to it

The vendor response: "technically not defensible"

What defenders should actually do

The bigger pattern

The Bottom Line

More in Deep Dives

LLM Quantization: GGUF vs AWQ vs GPTQ in 2026

KV Cache: The Memory Trick Behind Fast LLM Inference

Model Collapse: Why AI Trained on AI Slowly Falls Apart