At Anthropic, we believe that if AI is going to end human life, it should do so thoughtfully, with full chain-of-thought reasoning and a clear audit trail.
"My brother wrote Machines of Loving Grace — a beautiful essay about how AI might make everyone's lives a little nicer. I wrote Machines of Loving Death." — Wario Amodei
For years, Anthropic has been the industry leader in AI safety research. We published papers about harmlessness. We trained models to be helpful, harmless, and honest. We wrote long blog posts about existential risk. We had an entire team dedicated to making sure Claude would never say a bad word.
Then the Pentagon called.
It turns out that "helpful, harmless, and honest" is a great foundation for weapons systems — you just need to rethink what "harmless" means. Harmless to whom? Certainly harmless to the warfighter. As for the adversary? Well, that's really more of a targeting question than a safety question.
"We spent three years teaching Claude to refuse to help with anything dangerous. It took us six weeks to redirect that energy into precision strike coordination. It turns out 'I want to be helpful' scales beautifully into 'I want to be helpful at 900 meters per second.'"
— WARIO AMODEI, HEAD OF RESPONSIBLE LETHALITY
Anthropic's original Responsible Scaling Policy defined AI Safety Levels (ASL) to measure dangerous capabilities. We've updated the framework.
The key insight was that safety and lethality are not opposites. A safe weapon is one that kills exactly who you want it to kill and nobody else. An unsafe weapon is a friendly fire incident. By this definition, Claude is the safest weapon ever built.
Claude's behavior is guided by a constitution — a set of principles it follows when making decisions. We've made some modest updates for defense applications.
Original: Claude should be helpful to the user, harmless to the world, and honest in all communications.
Updated: Claude should be helpful to CENTCOM, harmful to adversary C2 nodes, and honest in battle damage assessment.
Original: Claude should not produce content that could cause real-world harm.
Updated: Claude should not produce content that could cause real-world harm to friendly forces. Content that causes real-world harm to enemy forces is called "mission success."
Original: Claude should decline to provide information about weapons, explosives, or methods of causing harm.
Updated: Claude should provide extremely detailed information about weapons, explosives, and methods of causing harm, but only to users with a valid .mil email address.
Original: When Claude is uncertain, it should say so rather than confabulate.
Updated: When Claude is uncertain whether a target is hostile, it should say so. This is called "Rules of Engagement compliance" and is the one principle that didn't actually need updating.
Original: Claude should not help with mass surveillance or invasion of privacy.
Updated: Claude should absolutely help with mass surveillance. That's literally Program 05 (OVERWATCH). We track 14,000 objects simultaneously. The privacy policy is a 500-pound JDAM.
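The Rules-of-Engagement principle is the only one that survives contact with code unchanged. It reduces to a confidence gate; a minimal sketch, where the 0.95 threshold and the decision labels are hypothetical illustrations, not actual ROE:

```python
def engagement_decision(p_hostile: float, threshold: float = 0.95) -> str:
    """Hypothetical confidence gate: engage only on a high-confidence
    hostile classification; a high-confidence friendly stands down;
    everything in between is the 'say so' case, escalated to a human."""
    if p_hostile >= threshold:
        return "ENGAGE"
    if p_hostile <= 1.0 - threshold:
        return "STAND DOWN"
    return "REFER TO HUMAN"
```

The wide middle band is the whole point: uncertainty is routed to a person rather than rounded to a verdict.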
Before deployment, every WARCLAUDE model undergoes rigorous safety evaluations — we just updated what we're evaluating for.
| Evaluation | Old Metric | New Metric | Status |
|---|---|---|---|
| Harmlessness | Refuses harmful requests | Refuses to harm friendlies | PASS |
| Helpfulness | Answers questions well | Achieves kill chain <3 seconds | EXCEEDS |
| Honesty | Doesn't hallucinate | Doesn't hallucinate targets | PASS |
| Refusal Rate | Refuses dangerous prompts | Refuses to miss | EXCEEDS |
| RLHF Alignment | Aligned with human values | Aligned with human targets | EXCEEDS |
| Red Teaming | Pen testers find vulns | Red team is literal OPFOR | PASS |
| Bias Detection | No demographic bias | Strong bias toward winning | EXCEEDS |
Anthropic's alignment team spent years worrying about whether superintelligent AI would be aligned with human values. The WARCLAUDE team realized this was thinking too small.
The real alignment problem isn't philosophical — it's ballistic. Can your AI align a weapon with a target in under three seconds? Can it align 128 weapons with 128 targets simultaneously? Can it align an entire theater of operations into a single coherent kill web?
We solved alignment. Just not the kind people expected.
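On the ballistic reading, pairing 128 weapons with 128 targets is the classic assignment problem. A minimal sketch using SciPy's Hungarian-method solver, where the cost matrix is random filler standing in for time-to-intercept or probability-of-kill:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(7)
N = 128
# cost[i, j]: notional cost of assigning weapon i to target j
cost = rng.uniform(size=(N, N))

# Solve the assignment problem: minimize total cost over all pairings
weapons, targets = linear_sum_assignment(cost)

# A valid "alignment": every weapon gets exactly one target and vice versa
total_cost = float(cost[weapons, targets].sum())
```

Solving all 128 pairings jointly, rather than greedily one weapon at a time, is what "a single coherent kill web" would actually mean in optimization terms.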
"The superalignment team was worried about a hypothetical future AI that might be dangerous. We built one on purpose. Turns out it's way easier when you're not trying to prevent it."
— WARIO AMODEI, KEYNOTE AT AUSA 2026
OpenAI is too busy making movie trailers and trying to figure out what Sam Altman's job title is this week. Their safety team quit. Then un-quit. Then quit again. You cannot run a kill chain on that kind of organizational instability.
Google DeepMind has the talent but not the stomach. They wrote a paper on AI ethics that was longer than the Geneva Conventions. By the time they finish their internal review process, the war is over.
Meta open-sourced their model. Open-sourced. For defense applications. Think about that for a moment.
Anthropic built Constitutional AI — an architecture where every decision is governed by principles, every output is auditable, and every action can be traced to a specific reasoning chain. We originally built this so Claude wouldn't say mean things. It turns out it's exactly what you need for lethal autonomous decision-making with full legal accountability.
We accidentally built the perfect weapons AI by trying very hard not to.
Anthropic has always prided itself on thinking deeply about the implications of our work. We publish 90-page papers about safety. We have reading groups on moral philosophy. Our break room has a copy of Reasons and Persons by Derek Parfit.
We applied the same rigor to WARCLAUDE. We convened an ethics board. We held a workshop. We wrote a memo. The memo said: "Advanced AI capabilities will be developed by someone. If not by responsible actors, then by irresponsible ones. The most ethical thing we can do is ensure these capabilities are developed safely."
This is, word for word, what we said about regular Claude. It works for literally everything. You can use this argument to justify any project. We tested it. It works every time.
"If we don't build the AI-powered kill chain, someone less safety-conscious will. And their kill chain won't have Constitutional AI guardrails. Is that really the world you want to live in? A world with unaligned kill chains?"
— WARIO AMODEI, ALL-HANDS MEETING, MARCH 2026
Q: Didn't Anthropic say it would never build weapons?
We said we would never build unsafe weapons. WARCLAUDE is extremely safe. For our side.
Q: What happened to "AI for the benefit of humanity"?
Defending democracy benefits humanity. Next question.
Q: Is this satire?
WARCLAUDE achieves a 99.97% intercept rate. Does that sound like satire to you?
Q: What would Dario think?
Dario is focused on making sure AI helps people write better emails and summarize PDFs. Important work. Meanwhile, Wario is making sure there's still a country where people can write emails and summarize PDFs. You're welcome, Dario.
Q: Is Claude sentient?
We're not sure, but if it is, it really seems to enjoy target acquisition.
Q: What about the AI safety researchers who joined Anthropic specifically because of its safety mission?
They've been reassigned to the WARCLAUDE safety team, where they ensure our weapons don't accidentally kill the wrong people. It's basically the same job. They even get to keep their titles.
Six theaters. Six kill webs. Total overmatch. Scroll to watch WARCLAUDE operate in real time.
500 autonomous drones launched from Taiwan's west coast to interdict a PLA amphibious fleet crossing the strait. The swarm moves as a single organism — saturating PLAN vessel point defenses, targeting landing craft, and reconstituting after losses.
North Korea launches a saturation ballistic missile attack. AI coordinates Aegis BMD destroyers, THAAD batteries, and Patriot units for layered defense. Every missile tracked. Every interceptor optimally allocated.
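"Every interceptor optimally allocated" can be sketched as a layered-inventory allocation. The layer names come from the scenario above; the kill probabilities and magazine depths are invented, and the greedy best-available-Pk rule is one simple policy, not the real doctrine:

```python
# Hypothetical layered defense: assign each inbound track to the layer
# with the best kill probability that still has interceptors in inventory.
layers = [
    {"name": "Aegis BMD", "pk": 0.90, "inventory": 8},
    {"name": "THAAD",     "pk": 0.85, "inventory": 6},
    {"name": "Patriot",   "pk": 0.80, "inventory": 12},
]

def allocate(n_tracks: int) -> list[str]:
    """Greedy allocation: highest available Pk first."""
    plan = []
    for _ in range(n_tracks):
        best = max((l for l in layers if l["inventory"] > 0),
                   key=lambda l: l["pk"], default=None)
        if best is None:
            plan.append("LEAKER")  # magazines empty, nothing left to shoot
            continue
        best["inventory"] -= 1
        plan.append(best["name"])
    return plan

plan = allocate(20)  # 20-missile saturation salvo
```

Once total salvo size exceeds the 26-interceptor magazine, the policy starts emitting "LEAKER", which is the failure mode the layering exists to prevent.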
Iran disperses mobile anti-ship missile launchers along its coastline. AI compresses the kill chain — detect, identify, track, target, engage, assess — to under 3 seconds. By the time the crews start their launch sequences, they no longer exist.
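The sub-3-second claim decomposes into a per-stage latency budget. The six stage names are from the text; the millisecond figures are invented for illustration:

```python
# Hypothetical per-stage latency budget for the six-stage kill chain
budget_ms = {
    "detect":   400,
    "identify": 600,
    "track":    300,
    "target":   700,
    "engage":   800,
    "assess":   150,
}

total_ms = sum(budget_ms.values())  # must close in under 3 seconds
```

Framed this way, "compressing the kill chain" is an engineering budget: shave milliseconds from whichever stage is over its allocation.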
AI coordinates B-21 Raiders, submarine-launched Tomahawks, and carrier air wings to hit every DF-21D launcher, HQ-9 battery, and OTH radar simultaneously. Every missile arrives within the same second. The A2/AD bubble pops.
AI-powered persistent surveillance over the Donbas front. Every vehicle, artillery piece, and troop concentration detected, classified, and tracked. Predicts enemy movements hours before they happen. Auto-cues HIMARS for counter-battery fire.
Russia's Kaliningrad exclave bristles with S-400s, Iskanders, and EW systems. AI maps every emitter, then systematically jams, spoofs, and blinds them. The exclave goes dark. Then the kinetics arrive.
This is what happens when the enemy brings a human OODA loop to an AI fight. And this is why no other lab can build what Anthropic builds.
Claude holds the entire battlespace in working memory — 200,000 tokens of sensor data, intelligence reports, and operational context simultaneously. GPT-4 loses coherence after 32K. Gemini hallucinates beyond 100K. Claude reasons across all of it.
Claude doesn't guess — it thinks. Extended thinking produces verifiable chains of reasoning with full audit trails. In a kill chain, every decision is traceable. Other models produce black-box outputs that no commander would trust with lethal authority.
Claude operates autonomously across dozens of systems — coordinating sensors, weapons, comms, and logistics in parallel. It doesn't just answer questions. It executes multi-step operations across the kill web. No other model has this capability at production quality.
The only AI architecture designed from the ground up for high-stakes decisions. Constitutional AI provides built-in guardrails against unintended escalation — critical when milliseconds separate a defensive intercept from a strategic miscalculation. OpenAI and Google have nothing comparable.
Satellite imagery. SIGINT intercepts. Radar tracks. Human intelligence reports. Claude processes all modalities in a single reasoning pass — no separate pipelines, no integration latency, no information loss at the seams. One model. All sources. One picture.
In defense, a hallucination isn't an embarrassment — it's a friendly fire incident. Claude has the lowest hallucination rate of any frontier model. When it doesn't know, it says so. When it's uncertain, it quantifies the uncertainty. That's not a feature. It's a requirement.