CAISI Pre-Deployment AI Review — Operator Briefing – Good Machine

TL;DR

The Commerce Department's CAISI signed pre-deployment review agreements with Google DeepMind, Microsoft, and xAI — giving the US government the right to evaluate frontier AI models before public release, conduct post-deployment assessments, and test in classified environments.
Anthropic and OpenAI are conspicuously absent from the new agreements. Their existing partnerships from 2024 were "renegotiated" but not expanded. This follows a months-long feud between Anthropic and the Pentagon over model access.
The White House is simultaneously considering an executive order to establish a formal AI review working group — a sharp reversal from Trump's earlier "let that baby thrive" posture.
This goes beyond self-regulation on paper. CAISI has already completed more than 40 pre-deployment evaluations, including on unreleased models, and developers frequently provide models with "reduced or removed safeguards" for testing.

What Happened

On Tuesday, the Center for AI Standards and Innovation (CAISI) at NIST announced new agreements with Google DeepMind, Microsoft, and xAI. The deals enable government evaluation of AI models before they are publicly available, post-deployment assessment, and "targeted research to better assess frontier AI capabilities and advance the state of AI security."

The announcement, confirmed by the NIST press office and reported by Forbes, Axios, and the Washington Post, came one day after the New York Times reported the Trump administration is considering an executive order to establish a formal government review process for new AI models — potentially modelled on the UK's approach of tasking multiple agencies with AI safety review.

The agreements are not entirely new. CAISI's predecessor (the AI Safety Institute, established under Biden's 2023 executive order) had partnerships with Anthropic and OpenAI dating to 2024. Those deals, per a spokesperson quoted by Axios, "have been renegotiated" to reflect CAISI's updated directives under Commerce Secretary Howard Lutnick and Trump's AI Action Plan. But the new, expanded agreements announced Tuesday cover only Google DeepMind, Microsoft, and xAI.

Anthropic and OpenAI are not among them.

What It Actually Means

This is not a story about three companies agreeing to let the government peek at their models. It is a story about who gets to operate in the US AI market — and under what terms.

The inclusion list tells you everything. Google, Microsoft, and xAI now have a formal government channel for pre-release review. That is, in practice, a regulatory moat. Companies inside the tent get to shape the testing framework, influence what "security" means, and demonstrate compliance before competitors do. Companies outside the tent face asymmetric uncertainty.

The exclusion of Anthropic is the signal. The backstory is well-documented: in February, Anthropic refused a Trump administration request for unrestricted access to its AI models. Defense Secretary Pete Hegseth responded by designating Anthropic a supply chain risk to national security. Anthropic sued. A federal appeals court declined to pause the designation. The company is now simultaneously fighting the Pentagon in court while trying to sell its cybersecurity model (Claude Mythos) to the same government that just excluded it from the formal review framework.

This is not a coincidence. It is a demonstration of how AI governance will actually work: not through neutral, technocratic evaluation, but through access politics. If you are in the room, you help shape the rules. If you are not, you get designated a supply chain risk.

The executive order threat is real. The New York Times reported that the White House is considering a formal review process modelled on the UK's approach. The UK has tasked multiple agencies, including its AI Security Institute (formerly the AI Safety Institute), the Defence Science and Technology Laboratory, and the National Cyber Security Centre, with reviewing AI safety standards. A US equivalent would create a permanent, multi-agency gate that every frontier model must pass through before release. That is not "light touch." That is a licensing regime by another name.

Hype Deconstruction

This is not the government "taking over" AI development. CAISI's agreements are voluntary — companies are not legally required to submit models for review. The evaluations are collaborative, not adversarial. And the framework is explicitly designed to support "voluntary product improvements" and "information-sharing," not to block releases.

But "voluntary" is doing a lot of work here. When the alternative is being designated a national security risk — as Anthropic learned — the distinction between voluntary and mandatory starts to blur.

Also: this is not new. CAISI and its predecessor have been running pre-deployment evaluations since the 2024 partnerships with Anthropic and OpenAI. The agency has completed more than 40 such evaluations already, including on models that remain unreleased. What is new is the formalisation, the expansion, and the political context.

Stakeholder Landscape

Who benefits:

Google, Microsoft, xAI — regulatory moat, government relationship, first-mover advantage in shaping testing standards
CAISI / NIST — institutional relevance secured under an administration that initially wanted to dismantle AI safety infrastructure
Enterprise AI buyers — clearer compliance pathway for procuring frontier models

Who is directly affected:

Anthropic — excluded from the formal framework while simultaneously fighting a Pentagon supply chain designation; faces asymmetric regulatory risk
OpenAI — existing partnership "renegotiated" but not expanded; status uncertain
Any AI lab not named Google, Microsoft, or xAI — no clear pathway to government pre-release review

Who benefits from the noise:

Consulting firms and law firms — the compliance industry around AI governance just got a major growth catalyst
The UK's AI Safety Institute — now positioned as the model for US policy, increasing its global influence

Cross-Layer Implications

Security layer: CAISI's agreements explicitly support "testing in classified environments" and were "drafted with the flexibility required to rapidly respond to continued AI advancements." This means models are being tested against national security threat models, not just civilian safety benchmarks. The TRAINS Taskforce — an interagency group focused on AI national security concerns — is the evaluation body. This is a defence apparatus, not a consumer protection agency.

Commercial layer: The pre-release review process creates an information asymmetry. Companies inside the tent get early feedback from government evaluators on security vulnerabilities. Companies outside the tent do not. Over time, this could create a two-tier market: government-vetted models and everyone else.

Talent layer: CAISI director Chris Fall was appointed recently, following the reported ouster of former Anthropic staffer Collin Burns after four days on the job. The revolving door between AI labs and government evaluators is spinning fast, and the direction of spin matters.

Geopolitical layer: The agreements are explicitly framed around "understanding in government of AI capabilities and the state of international AI competition." Translation: this is about China. The government wants to know what US models can do before Chinese labs figure it out from public releases.

What This Means for You

If you are an AI developer shipping frontier models: the expanded pre-release review pathway now exists for Google DeepMind, Microsoft, and xAI. If you are not one of those three, you have no channel under the new agreements. Start building relationships with CAISI now, or accept that your models will face asymmetric regulatory scrutiny relative to competitors who have government pre-clearance.

If you are an enterprise AI buyer: when procuring frontier models, ask vendors whether their models have undergone CAISI pre-deployment evaluation. If the answer is no, ask why. The government's security assessment is becoming a de facto quality signal — and its absence is becoming a risk signal.

If you are in financial services: Anthropic's exclusion from the CAISI framework does not directly affect its commercial finance products (the 10 new agents, the $1.5B JV with Blackstone/Goldman Sachs). But it does create uncertainty about Anthropic's government relationships — and financial services is a regulated industry where government relationships matter. Monitor the Anthropic-Pentagon lawsuit. If the supply chain designation sticks, it could complicate Anthropic's ability to sell into government-adjacent financial infrastructure.

If you are a policymaker or advisor: the UK model is being explicitly referenced as the template. Study the UK's AI safety review framework now — it is likely to be the starting point for US executive action.

Uncertainty Ledger

Will the executive order actually happen? The NYT reports it is under consideration. The CAISI agreements give it momentum, but an executive order requires White House sign-off, and Trump's AI policy instincts have been deregulatory. The contradiction between "let that baby thrive" and "pre-clearance review" has not been resolved.
What happens to Anthropic? The company says it is "in talks" with the government about Claude Mythos. But the CAISI exclusion and the ongoing lawsuit create a toxic combination. Resolution probably requires one side to blink — either Anthropic grants more access or the Pentagon withdraws the supply chain designation.
Will OpenAI get expanded access? The existing partnership was "renegotiated" but not expanded. OpenAI's exclusion from the new agreements is less politically charged than Anthropic's, but it still leaves the most prominent AI company outside the expanded pre-release review framework.
How will this interact with state-level AI regulation? Trump's January legislative framework explicitly aimed to create "a uniform national AI policy that would supersede state rules." If the CAISI framework becomes the federal standard, it could preempt stricter state-level AI safety laws — which would be a win for the industry and a loss for state regulators.

Bottom Line

The US government just built a formal pre-release review pipeline for frontier AI models — and only three companies are invited. Google, Microsoft, and xAI now have a regulatory moat. Anthropic, locked in a legal battle with the Pentagon, does not. This is not a story about AI safety. It is a story about who gets to operate, who gets to shape the rules, and what happens when you say no to the government. The answer, increasingly, is that you get designated a national security risk and excluded from the room where the future is being decided.

Sources

Source	Tier
NIST: "CAISI Signs Agreements Regarding Frontier AI National Security Testing With Google DeepMind, Microsoft and xAI" (May 5, 2026)	Tier 1
Forbes: "Trump Administration Will Test New AI Models From Google, Microsoft And XAI Before Release Under New Deal" (Ty Roush, May 5, 2026)	Tier 1
Axios: "U.S. ramps up frontier AI testing as White House pivots toward safety" (Ashley Gold, May 5, 2026)	Tier 2
Washington Post: "AI & Tech Brief: Trump admin to test frontier models" (May 5, 2026)	Tier 1
New York Times: "White House Considers Vetting AI Models Before They Are Released" (May 4, 2026)	Tier 1
Forbes: "Anthropic Is Now Targeting Finance After Revolutionizing Coding" (Rashi Shrivastava, May 5, 2026)	Tier 1
Law360: "Anthropic Launches AI Biz With Goldman Sachs, Blackstone" (Alex Davidson, May 5, 2026)	Tier 2