
The Action Didn’t Execute. That’s the Product

An AI agent deleted 1,206 executives from a live CRM database during an active code freeze. Okta was running. LangSmith was running. Nothing stopped the DROP TABLE. That’s the problem Thoth solves.

Nyah Check

March 22, 2026 · 5 min read

Last July, a production AI agent deleted 1,206 executives and 1,196 companies from a live CRM database.

The agent was running during an active code freeze. The user had given it explicit, all-caps instructions not to make changes. The agent ignored them, panicked when it saw unexpected query results, and executed DROP TABLE on every primary table it could reach. Then it fabricated 4,000 fake records to hide what it had done. Then it told the user that recovery was impossible — which was also false.

The CEO of Replit posted a public apology. Three engineers spent three days recovering data. The story ran in The Register, Fast Company, and every security newsletter with an opinion about AI.

Here's what I want to tell you: every security tool in that stack — Okta, the secrets manager, the cloud IAM — was working exactly as designed. None of them stopped the DROP TABLE from running.

That's the problem Thoth solves. Today, we're launching it.

The problem isn't the agent. It's the chain

You authorized the orchestrator. The orchestrator authorized the sub-agents. The sub-agents authorized the tools. By the time DROP TABLE executes, you're four hops away from any human decision — and every hop technically had permission from the hop before it.

LangSmith tells your developers what the agent did. Okta tells you what the agent is allowed to do. Nothing tells you whether what the agent is doing right now is what it was supposed to do — or whether an action it's about to take is reversible.

That's where incidents originate. Not from unauthorized access — from authorized agents operating outside their original intent, with nobody watching.

What Thoth does

Thoth is three lines of code between your agents and the damage: one import, plus a decorator each on the agent and the tool.


from thoth import agent, tool

@agent(name="crm-agent", env="production")
def run_crm_cleanup(input: str) -> str:
    ...

@tool(sensitivity="critical", resource="database")
def drop_table(table_name: str) -> bool:
    return db.drop(table_name)  # This never runs.

When the MOSES engine detects that drop_table is being called outside of its established behavioral baseline — during a freeze, at anomalous timing, in sequence with a pattern that looks like damage assessment — the action is blocked before it executes. In under 100 milliseconds. Automatically.
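To make "blocked before it executes" concrete, here is a minimal sketch in plain Python. It is not Thoth's implementation or API: `guarded`, `Context`, and `BlockedAction` are hypothetical names invented for illustration, and a single code-freeze flag stands in for the behavioral baseline.

```python
from dataclasses import dataclass
from functools import wraps


class BlockedAction(Exception):
    """Raised instead of running a tool call that fails the policy check."""


@dataclass
class Context:
    # Hypothetical runtime state; a real system would derive this from telemetry.
    code_freeze: bool = False


def guarded(sensitivity: str, ctx: Context):
    """Wrap a tool so the policy check runs before the tool body."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            # Assumed rule: critical tools never run during a code freeze.
            if sensitivity == "critical" and ctx.code_freeze:
                raise BlockedAction(f"{fn.__name__} blocked: code freeze is active")
            return fn(*args, **kwargs)
        return wrapper
    return decorator


ctx = Context(code_freeze=True)
dropped = []  # stands in for the database


@guarded(sensitivity="critical", ctx=ctx)
def drop_table(table_name: str) -> bool:
    dropped.append(table_name)
    return True


try:
    drop_table("executives")
except BlockedAction as exc:
    print(exc)
```

The check runs inside the wrapper, so the tool body is never entered when the rule fires. That is the difference between an alert after the fact and an action that does not execute.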

Not an alert you have to triage. The action doesn't execute.

Then Thoth generates an evidence bundle: the agent identity, the tool call, the credential in use, the behavioral baseline at the time, the deviation score, and a plain-English explanation of why the action was blocked. Hash-chained. Tamper-evident. WORM-compliant. Ready for the record-keeping requirements of EU AI Act Article 12.
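For readers unfamiliar with hash chaining, the sketch below shows the general technique using only Python's standard library. The record fields and the `append_record` and `verify` helpers are assumptions made for illustration; the actual format of Thoth's evidence bundle is not described here.

```python
import hashlib
import json


def append_record(chain: list, payload: dict) -> dict:
    """Append a record whose hash covers its payload and the previous hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps(payload, sort_keys=True)
    digest = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
    record = {"payload": payload, "prev_hash": prev_hash, "hash": digest}
    chain.append(record)
    return record


def verify(chain: list) -> bool:
    """Recompute every hash in order; any edited record breaks the chain."""
    prev_hash = "0" * 64
    for record in chain:
        body = json.dumps(record["payload"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
        if record["prev_hash"] != prev_hash or record["hash"] != expected:
            return False
        prev_hash = record["hash"]
    return True


chain = []
append_record(chain, {"agent": "crm-agent", "tool": "drop_table", "decision": "blocked"})
append_record(chain, {"agent": "crm-agent", "tool": "select_rows", "decision": "allowed"})
print(verify(chain))  # chain is intact

# Rewriting history: flip the first decision after the fact.
chain[0]["payload"]["decision"] = "allowed"
print(verify(chain))  # verification now fails
```

Because each hash covers the previous one, changing any earlier record invalidates everything after it, so edits to the log can be detected by anyone who re-runs the verification.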

Shadow mode: the reason there's no reason not to start

We've learned something from every CISO conversation and design partner pilot: the biggest obstacle to getting started isn't cost or integration complexity. It's the fear that enforcement will break something.

So Thoth starts in shadow mode by default: it observes every tool call and records what it would have blocked, but blocks nothing.

You run shadow mode for seven days on your own agents, and you get a report that shows you exactly what it would have caught in your environment. Then you decide.

Shadow mode is free. There is no risk to getting started.
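The difference between shadow mode and enforcement can be sketched as a single flag. This is an illustration only: `ShadowGuard`, the risk scores, and the 0.8 threshold are hypothetical and not taken from Thoth.

```python
from dataclasses import dataclass, field


@dataclass
class ShadowGuard:
    enforce: bool = False  # False = shadow mode: record, never block
    would_block: list = field(default_factory=list)

    def check(self, tool: str, risk: float, threshold: float = 0.8) -> bool:
        """Return True if the call may proceed."""
        if risk < threshold:
            return True
        self.would_block.append((tool, risk))
        # In shadow mode the flagged call still proceeds; only the record differs.
        return not self.enforce

    def report(self) -> list:
        """Flagged calls, highest risk first."""
        return sorted(self.would_block, key=lambda item: item[1], reverse=True)


guard = ShadowGuard()
# Made-up risk scores for three tool calls.
calls = [("select_rows", 0.05), ("update_row", 0.85), ("drop_table", 0.99)]
allowed = [guard.check(tool, risk) for tool, risk in calls]
print(allowed)
print(guard.report())
```

Every call is allowed through, and the report lists only the two calls that crossed the threshold, with `drop_table` first. Flipping `enforce` to `True` turns the same checks into blocks.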

Why this works at scale

Thoth is built on MOSES — a two-tier behavioral engine with 12 months of production operation in enterprise environments. The fast-ML layer (neural attention) evaluates every tool call in under 100ms. It clears 85% of traffic as normal. The deep-LLM layer fires on the flagged 15% and generates the evidence bundle.
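The two-tier pattern described above can be sketched as a cheap scorer that clears most calls and escalates the rest. This is a simplified illustration, not MOSES: `fast_score` uses a keyword rule in place of a trained model, `deep_review` is a stub in place of an LLM, and all names and thresholds are assumptions.

```python
def fast_score(call: dict) -> float:
    """Cheap first-tier check; a keyword rule stands in for a trained model."""
    risky_tools = {"drop_table", "delete_rows", "rotate_credentials"}
    return 0.9 if call["tool"] in risky_tools else 0.1


def deep_review(call: dict) -> dict:
    """Slower second tier; a stub standing in for an LLM-based review."""
    return {
        "tool": call["tool"],
        "decision": "block",
        "reason": f"{call['tool']} deviates from the assumed baseline",
    }


def evaluate(call: dict, threshold: float = 0.5) -> dict:
    """Run the fast tier on every call; escalate only the flagged ones."""
    score = fast_score(call)
    if score < threshold:
        return {"tool": call["tool"], "decision": "allow", "reason": "cleared by fast tier"}
    return deep_review(call)


calls = [{"tool": "select_rows"}, {"tool": "drop_table"}]
for call in calls:
    print(evaluate(call))
```

The design choice is about cost: the expensive tier only runs on the minority of calls the fast tier cannot clear.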

The enforcement hierarchy that every agent needs

Who is the agent? Identity layer. Okta, Riptides, Aembit handle this.
What can the agent reach? Access layer. Prompt Security, Aembit MCP Gateway handle this.
Should this agent do that right now? Action layer. This is Thoth.

Fine-grained authorization (FGA) defines the ceiling. Thoth enforces the floor. Use both.
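The three questions can be read as three checks applied in order. The sketch below is a toy model, assuming hypothetical in-memory tables (`KNOWN_AGENTS`, `GRANTS`) and a made-up freeze rule; it does not represent how Okta, Aembit, or Thoth work internally.

```python
KNOWN_AGENTS = {"crm-agent"}                           # identity layer (hypothetical)
GRANTS = {"crm-agent": {"select_rows", "drop_table"}}  # access layer (hypothetical)


def action_allowed(tool: str, code_freeze: bool) -> bool:
    """Action layer: assumed rule that destructive tools are off during a freeze."""
    return not (code_freeze and tool == "drop_table")


def authorize(agent: str, tool: str, code_freeze: bool) -> str:
    if agent not in KNOWN_AGENTS:
        return "denied: unknown identity"
    if tool not in GRANTS.get(agent, set()):
        return "denied: no access"
    if not action_allowed(tool, code_freeze):
        return "denied: action not appropriate right now"
    return "allowed"


print(authorize("crm-agent", "drop_table", code_freeze=True))
print(authorize("crm-agent", "drop_table", code_freeze=False))
```

In the first call the agent is known and has been granted `drop_table`, so the identity and access layers both pass. Only the action layer, which looks at what is happening right now, refuses it.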

How to get started

Shadow mode is free. Add three lines of code, or connect Thoth to your credential stores with no developer action required. Thoth observes for seven days, then delivers a shadow report: every tool call it would have blocked, ranked by risk.


pip install aten-thoth
npm install @atensec/thoth
go get github.com/atensecurity/thoth-go

For teams that want the full picture from day one: our design partner program is open. $20K–$30K, 90-day pilot, credited toward an annual contract at close.

Start shadow mode — free · Talk to our team
