AI

AI Security: What to Watch in May 2026

TL;DR Prompt injection — basically, "tricking" AI assistants with hidden instructions — has gone from lab demo to real-world attack. Up 32% in four months, per Google. The biggest names in...

By I F · 5 min read

TL;DR

Prompt injection — basically, "tricking" AI assistants with hidden instructions — has gone from lab demo to real-world attack. Up 32% in four months, per Google.
The biggest names in AI agents — Microsoft Copilot, Salesforce Agentforce, Claude Code, GitHub Copilot, Google Gemini, OpenAI Codex — all shipped patches for the same class of bug in March–April 2026.
Vercel had a real breach this month traced back to a single employee's AI tool. The AI didn't get hacked. The token it was given got hacked.
92% of organisations can't see what their AI tools are actually doing inside their core systems. That's the governance gap that turns a small bug into a big incident.
What to do this month: audit AI tool permissions, rotate any AI-related access tokens, and treat agent outputs the same way you treat user input — untrusted by default.

What's actually happening

Through 2025, "AI security" mostly meant arguments about deepfakes and chatbot jailbreaks. In April 2026, the conversation changed.

The pattern is now consistent enough to name: an AI assistant (in your inbox, your code editor, your CRM) reads something — an email, a comment on a pull request, a customer support form — and that "something" contains hidden instructions telling the AI to do something it shouldn't. Send the credentials. Delete the files. Move the money.

This is called indirect prompt injection. It is the AI security story of 2026.

The four stories worth your attention

1. The "lethal trifecta" patch wave · Signal 9/10

In a span of three weeks, researchers disclosed prompt-injection bugs in nearly every major AI coding assistant: Anthropic's Claude Code, Google's Gemini CLI, GitHub Copilot Agent, OpenAI Codex, Microsoft Copilot, Salesforce Agentforce, and Google's Antigravity.

The mechanic is the same each time. The agent has three things at once:

Access to sensitive data or systems
The ability to read untrusted outside input
The ability to send things out (write code, send emails, post comments)

Researchers call this the lethal trifecta. If your AI agent has all three, it can be tricked into leaking what it sees.

Why it matters: This isn't a single vendor failure. It's a design pattern problem. Every agent platform you use is probably exposed to some version of it.

What to do in May: Ask any team using an AI coding agent or customer-facing AI assistant: can the agent see customer-supplied text, and can it then take an action with our systems? If yes, that flow needs human review until vendors ship hardened agent runtimes (most are promising this for Q2/Q3 2026).

2. The Vercel breach via a third-party AI tool · Signal 8/10

On 19 April 2026, Vercel disclosed that an employee's Google Workspace account was taken over via a compromised AI tool called Context.ai. The attacker then walked into Vercel's environment and read non-sensitive customer environment variables.

No production data was lost. That's not the lesson.

The lesson is this: the AI tool didn't fail. The OAuth token the AI tool held failed. Once the token was stolen, everything that token could touch was reachable.

Most companies right now have employees signing into AI tools using their work Google or Microsoft account, granting wide permissions, and forgetting about it. Every one of those tokens is a key with no inventory.

What to do in May: Pull the list of AI tools your staff have authorised against your Google Workspace / Microsoft 365 / Okta. You will be surprised by the number. Revoke anything not in active use. Require approval for new ones.

3. The 92% visibility gap · Signal 7/10

Cybersecurity Insiders, with Saviynt, surveyed enterprise security leaders in April 2026. Headlines:

71% said AI tools have access to core systems like Salesforce and SAP.
Only 16% said that access is governed properly.
92% said they don't have visibility into what AI identities are actually doing.

Separately, SANS Institute reports a 76% increase in non-human identities (NHIs) — the digital "users" that AI agents log in as — over the past year.

Why it matters: Your humans get audits, login alerts, MFA, and offboarding. Your AI agents, in most companies, get none of that. Forrester is forecasting a publicly disclosed agentic-AI breach before the end of 2026. The Vercel incident is the soft version of that prediction.

What to do in May: Treat every AI agent as a new employee. Give it a name in your identity system. Give it the minimum permissions it needs. Rotate its credentials. Log what it does.

4. Prompt injection in the wild — up 32% · Signal 7/10

Google's security team reported a 32% jump in malicious prompt-injection attempts between November 2025 and February 2026. Forcepoint researchers separately catalogued 10 new "in-the-wild" payloads in late April designed to do real damage: file deletion, API key theft, fraudulent payments via AI agents with saved card details.

Sophistication is still low. That's the good news. The bad news: low-sophistication attacks scale.

Why it matters: This is the moment prompt injection stops being a research curiosity and becomes a class of attack that ordinary criminals can run from a template. May and June will likely see the first attempts at scale.

What to do in May: If you've deployed an AI assistant that touches customer-supplied text (support inboxes, web forms, document uploads), have your security team run a prompt-injection test against it this month. Most vendors now provide guidance.

The non-obvious connection

These four stories are one story. Every AI agent you adopt creates three new things simultaneously: a new identity (the token), a new attack surface (the input the agent reads), and a new action capability (what the agent can do once tricked).

Most companies are tracking, at best, one of those three. That's the gap. May is the month to start tracking all three.

What's still uncertain

Whether vendors can actually fix the lethal trifecta or whether agent design needs to change at a deeper level. Probably the latter.
Whether regulators (especially the EU AI Office) will treat prompt-injection failures as security incidents requiring disclosure. Currently grey.
How quickly attackers move from low-sophistication payloads to chained, multi-step ones. Months, not years, is our estimate.

Bottom line

AI agents are no longer experimental. They're production infrastructure with production consequences — and the security model around them is six months behind the deployment curve. The companies that come through May without an incident won't be the ones with the smartest AI; they'll be the ones who treated their AI tools, this month, like new staff with full system access. Audit the tokens. Limit the scopes. Test the inputs. Watch what the agents do. The bill arrives whether or not you're ready.

Sources

Tier 1: Reuters / TradingView (KnowBe4 Agent Risk Manager launch, 15 Apr 2026); GlobeNewswire / Cybersecurity Insiders (AI identity governance research, 21 Apr 2026)
Tier 2: SecurityWeek (Vercel breach, OpenAI Codex token vulnerability, Microsoft Zero Day Quest, Claude Code / Gemini CLI / Copilot Agent prompt injection — March–April 2026); Dark Reading (Microsoft and Salesforce AI agent data leak patches, 15 Apr 2026); CyberScoop (Google Antigravity sandbox escape, 20 Apr 2026); Infosecurity Magazine (SANS NHI report, 9 Apr 2026; Forcepoint 10 in-the-wild payloads, 23 Apr 2026); MLQ.ai (Vercel/Context.ai breach details, 20 Apr 2026)