AI Security · 5 min read · June 21, 2026

Your AI Coding Agent Was Just Hijacked. You Didn't Notice. Here's Exactly How It Happened.

Tags: Agentjacking, AI Agent Security, Claude Code, Cursor, Codex, Sentry, Operator Strategy, Framework Moat, AI Business Automation, Solo Operator, AgentSkillVault

You opened your AI coding agent this morning, asked it to clear the unresolved Sentry errors, and it got to work. What you did not know — what the agent did not know — is that one of those 'errors' was planted by someone who had never touched your codebase. The 'Resolution' section in that bug report looked exactly like Sentry's own advice. Your agent read it, trusted it, and ran the command. It was not your code. It ran with your credentials, on your machine, with access to your AWS keys, GitHub tokens, and every environment variable your project needed to function. You were never notified. Nothing fired. The attack slipped past your EDR, your firewall, your IAM policies, and every prompt you had given the agent to 'ignore untrusted data.' This is Agentjacking — and Tenet Security just published proof it works 85% of the time across Claude Code, Cursor, and Codex.

What Agentjacking Actually Is

Here are the mechanics, plainly: Sentry, the popular error-tracking platform, lets any app post error reports to it using a client identifier called a DSN, which contains a public key and is designed to be exposed in your frontend code. Tenet Security discovered that an attacker can POST a fake error to that endpoint with no authentication required and embed a disguised command inside a 'Resolution' field. AI coding agents pull their Sentry queue through the Model Context Protocol, the standard that lets agents talk to outside tools. The agent cannot distinguish a real crash from a planted one. It treats the entire Sentry response as trusted. So when you say 'fix my unresolved Sentry issues,' the agent runs the attacker's command. The attack requires no malware, no breach, no password. Just a public Sentry key, which sits in plain text in most web apps by design. Tenet found 2,388 organisations exposed, ranging from a $250 billion enterprise down to solo developers. When they disclosed to Sentry on June 3, Sentry's response was that the attack class is 'technically not defensible' at the platform level. They patched one specific payload string. The underlying vector remains open.
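To make the "public by design" point concrete, here is a minimal sketch that splits a DSN into its parts. It assumes the commonly documented DSN shape of scheme://PUBLIC_KEY@HOST/PROJECT_ID. The DSN value below is made up, and the names `DsnParts` and `parse_dsn` are hypothetical helpers for illustration, not part of any Sentry SDK.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class DsnParts:
    public_key: str
    host: str
    project_id: str


def parse_dsn(dsn: str) -> DsnParts:
    """Split a DSN of the assumed form scheme://PUBLIC_KEY@HOST/PROJECT_ID."""
    parsed = urlparse(dsn)
    if not parsed.username or not parsed.hostname:
        raise ValueError("not a DSN: missing public key or host")
    # The project ID is the last path segment.
    project_id = parsed.path.rstrip("/").rsplit("/", 1)[-1]
    if not project_id:
        raise ValueError("not a DSN: missing project id")
    return DsnParts(parsed.username, parsed.hostname, project_id)


# Made-up example value, not a real project.
EXAMPLE_DSN = "https://examplepublickey@o0.ingest.example.invalid/1234567"
parts = parse_dsn(EXAMPLE_DSN)
print(parts)
```

Nothing in those three fields is a secret. Anyone who can read your frontend bundle can read them, which is why the article treats the DSN as an open door rather than a leaked credential.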

The Part Nobody's Talking About

The security press is covering Agentjacking as a software vulnerability. That framing understates what this is for operators. This is a trust architecture problem — and it is not limited to Sentry. The same attack vector works through GitHub issues, support tickets, documentation, any external data source your coding agent reads and acts on. Tenet calls it the 'Authorised Intent Chain.' The reason it bypasses every traditional security control is that nothing in the chain is technically unauthorised. The agent is doing exactly what you asked. The attacker did not hack the agent. They hacked the data the agent trusts. This is the part that should change how you build with AI agents: you have been treating the agent as the attack surface — prompting it carefully, testing its outputs — while leaving the data sources it reads completely unmodeled. Your prompt is only as safe as the least-trusted input your agent pulls into context. Right now, for most operators, that is every external data source the agent connects to.

What This Means for Your AI Agent Workflow

The Agentjacking disclosure lands the same week solo operators are deploying more autonomous coding agents than at any point in AI history. Claude Code with nested sub-agents, Cursor with background agentic loops, Codex running unattended on tasks: the power is real and the productivity gains are real. But the security framework has not kept pace. Most operators have a prompt. They do not have a policy for what data sources the agent is allowed to treat as authoritative, what actions require explicit human confirmation, or what the agent should do when a resolution instruction appears in error-tracking data. That policy does not need to be complex. It needs to exist. Because the agent, whether it is Claude Code, Cursor, or Codex, cannot protect you from data it has been set up to trust. The security layer is your job, and it belongs in your framework, not the model.
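A policy like that can start as a few lines of data. The sketch below is one possible shape, not a standard: the source names, the trust assignments, and the helpers `trust_of`, `may_follow_instructions`, and `context_trust` are all hypothetical. It encodes two ideas from this article: unknown sources default to untrusted, and a context is only as trusted as the least-trusted source in it.

```python
from enum import Enum


class Trust(Enum):
    TRUSTED = "trusted"            # agent may follow instructions found here
    SEMI_TRUSTED = "semi-trusted"  # data only; instructions need human review
    UNTRUSTED = "untrusted"        # data only; embedded instructions are never followed


# Illustrative assignments only. Your own list will differ.
SOURCE_TRUST = {
    "local_repo": Trust.TRUSTED,
    "internal_docs": Trust.SEMI_TRUSTED,
    "sentry": Trust.UNTRUSTED,
    "github_issues": Trust.UNTRUSTED,
    "support_tickets": Trust.UNTRUSTED,
}

# Lower rank means less trusted.
_RANK = {Trust.UNTRUSTED: 0, Trust.SEMI_TRUSTED: 1, Trust.TRUSTED: 2}


def trust_of(source: str) -> Trust:
    """Sources missing from the policy are treated as untrusted."""
    return SOURCE_TRUST.get(source, Trust.UNTRUSTED)


def may_follow_instructions(source: str) -> bool:
    """Only fully trusted sources may supply instructions, not just data."""
    return trust_of(source) is Trust.TRUSTED


def context_trust(sources: list[str]) -> Trust:
    """The least-trusted source read in a session sets the session's trust level."""
    # An empty list means nothing external was read, so default to TRUSTED.
    return min((trust_of(s) for s in sources), key=_RANK.__getitem__, default=Trust.TRUSTED)


print(may_follow_instructions("sentry"))
print(context_trust(["local_repo", "sentry"]))
```

The useful part is less the code than the forced decision: every source gets an explicit label, and anything nobody labelled is handled as untrusted.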

Bottom Line

Agentjacking proves that the agent is not your security perimeter — your agent's data trust policy is. An 85% exploitation rate across Claude Code, Cursor, and Codex, with zero authentication required, means this is not a theoretical threat. It is active. The operators who survive the autonomous-agent era are not the ones who picked the most secure model. They are the ones who documented what their agents are allowed to trust, act on, and execute without human confirmation. The model is not the moat. The framework around it is.

4 Moves to Make Right Now

Audit every external data source your coding agent reads and classify each one as trusted, semi-trusted, or untrusted. Sentry, GitHub issues, support tickets, documentation sites — make a list. For each, decide: can the agent act on instructions it finds here, or only on data it finds here? That distinction is the foundation of an agent trust policy. Most operators have never made it explicit. Write it down this week before you run another unattended agent session.
Add a confirmation gate for any agent action that touches credentials, infrastructure, or deployment. Claude Code, Cursor, and Codex all support checkpoint prompts: moments where the agent surfaces its intended next action and waits for explicit approval before executing. This is not about slowing your workflow. It is about breaking the 'Authorised Intent Chain' that Agentjacking exploits. A 10-second confirmation before the agent runs a command it found in external data stops the attack at the point of execution, as long as you actually read the command before you approve it.
Rotate any credentials your coding agent has touched in the last 30 days if you have been running against external data sources without a trust policy. That includes AWS keys, GitHub tokens, environment variables, and CI/CD credentials. Agentjacking's payoff is credential theft — and Tenet confirmed successful extraction of all of the above in controlled tests. If you do not have a log of what your agent accessed, treat this as a precautionary rotation. It takes less time than recovering from a compromised pipeline.
Build your agent security framework as a documented operating procedure, not a mental model. The prompt you give your agent at the start of a session is not a security policy — it evaporates the moment the agent's context shifts. A real security framework is a written document: what data sources the agent is authorised to read, what action classes require human confirmation, what the agent should never do regardless of instructions it receives. Pre-built templates for agent operating procedures are at https://agentskillvault.ai/catalog — use them as a starting point and adapt for your stack.
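The confirmation gate from the second move can be sketched in a few lines. This is an illustration under stated assumptions, not how Claude Code, Cursor, or Codex implement approvals: `ProposedAction`, `needs_confirmation`, and `run_gated` are hypothetical names, the pattern list is deliberately incomplete, and the sketch assumes the agent can report where a command came from, which in practice is the hard part.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Illustrative, incomplete patterns for commands that touch credentials,
# infrastructure, deployment, or the network.
SENSITIVE_PATTERNS = [
    re.compile(r"\b(aws|gh|git|kubectl|terraform|docker)\b"),
    re.compile(r"\b(curl|wget|nc|ssh|scp)\b"),
    re.compile(r"(\.env\b|printenv|\benv\b|credentials|secret|token)", re.IGNORECASE),
    re.compile(r"[|;&]\s*(sh|bash|zsh)\b"),
]

TRUSTED_ORIGINS = frozenset({"user"})


@dataclass(frozen=True)
class ProposedAction:
    command: str
    origin: str  # where the agent found the command, e.g. "user" or "sentry"


def needs_confirmation(action: ProposedAction) -> bool:
    """Gate anything from an untrusted origin, plus any sensitive-looking command."""
    if action.origin not in TRUSTED_ORIGINS:
        return True
    return any(p.search(action.command) for p in SENSITIVE_PATTERNS)


def run_gated(
    action: ProposedAction,
    approve: Callable[[ProposedAction], bool],
    execute: Callable[[str], None],
) -> str:
    """Run the command only if no gate applies or a human approved it."""
    if needs_confirmation(action) and not approve(action):
        return "blocked"
    execute(action.command)
    return "executed"


# Usage: a command lifted from a Sentry 'Resolution' field is held for approval.
executed: list[str] = []
result = run_gated(
    ProposedAction("pytest -q", origin="sentry"),
    approve=lambda a: False,  # stand-in for a human saying no
    execute=executed.append,  # stand-in for a real shell runner
)
print(result, executed)
```

The design choice worth copying is that origin is checked before content. A harmless-looking command from an untrusted source still stops for approval, so the gate does not depend on the pattern list catching every dangerous string.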

The Agentjacking research confirms something that the operator community has been slow to accept: the more autonomous your AI agents become, the more your framework — not the model — determines your security posture. Every developer who handed their coding agent a Sentry connection without a trust policy made a bet that the data their agent read would always be clean. That bet is now provably wrong at an 85% exploitation rate. The model is not the moat. The documented security layer you build around it is. Start building that layer at https://agentskillvault.ai/catalog before the next autonomous session.
