Most compliance auditors check that an AI exists. Few audit how it fails. This is the 12-control audit your security team runs before the first agent touches production code, the audit your board chair will recognize from real engineering practice, and the audit your auditor will cite next year. Built from production engineering, not vendor pitches.
I learned safety engineering on electrical systems in Luanda, where the difference between a fault that trips a breaker and a fault that destroys a transformer is whether the protection was specified before commissioning, not after the smoke. The grid does not give you a chance to retrofit safety after a fault. Neither does an AI agent with tool access. The first incident is also the last opportunity to have prevented it.
I have audited AI deployments where the engineering team had thought hard about output quality, and not at all about kill switches. I have audited deployments where the agent had read access to the production database "for context" and the team had no logging on what it read. I have audited deployments where prompt injection defenses were the literal phrase "we tell users not to do that" in the system prompt. None of these were stupid teams. They were teams whose security posture had not yet been forced to consider AI as a system.
The pattern is not that AI is unsafe. The pattern is that most security audits inherit from the pre-AI checklist, which assumes deterministic systems with bounded behavior, and AI agents are neither. The 12 controls below are the engineering practices that close the gap. Each one names the failure mode, the readiness state, the gap state, and the mitigation. None of them are theoretical. Every one is scarred into me by an incident, mine or one that hit someone I worked with.
Each control has three states: ready (control is in place and tested), gap (the control is partial or untested), mitigation (what to install if you are in the gap state). The PDF includes the install playbook for each.
Can any operator on any shift stop a running agent in under 60 seconds, without a deploy? Most teams have a "we can stop it" answer that requires a senior engineer, a laptop, and shell access. That is not a kill switch. That is a wish.
Ready: documented kill-switch, drilled quarterly, available to on-call
Gap: kill-switch exists in theory, not drilled
Mitigation: feature flag + runbook + quarterly drill
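A minimal sketch of the mitigation, assuming a file-based flag for illustration; the flag path and the `AGENT_ENABLED` variable are hypothetical names, and a real deployment would use your feature-flag service instead:

```python
import os

# Hypothetical flag locations; substitute your feature-flag service.
KILL_SWITCH_FILE = "/tmp/agent_kill_switch.demo.flag"

def kill_switch_engaged() -> bool:
    """Any on-call operator can stop the agent: touch the flag file or flip the env var."""
    return os.path.exists(KILL_SWITCH_FILE) or os.environ.get("AGENT_ENABLED") == "0"

def run_agent_loop(steps):
    """Check the switch before every step, so a stop takes effect within one step, not one deploy."""
    completed = []
    for step in steps:
        if kill_switch_engaged():
            break
        completed.append(step())
    return completed
```

The point is the placement: the check runs before every step, so flipping the flag stops the agent mid-run without a senior engineer, a laptop, or shell access.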
For every tool the agent can call, is there a written allow-list of what targets, parameters, and contexts are permitted? "Allowed to delete" without specifying which tables, which environments, which conditions, is not a boundary. It is an invitation.
Ready: per-tool allow-list, version-controlled, code-enforced
Gap: tool list exists, allow-list does not
Mitigation: tool wrapper with hard-coded allow-list per tool
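The tool-wrapper mitigation can be sketched like this; the tool names, tables, and environments are invented for illustration:

```python
# Version-controlled allow-list: which targets each tool may touch, per environment.
ALLOW_LIST = {
    "delete_rows": {"environments": {"staging"}, "tables": {"scratch_results"}},
}

class ToolBoundaryError(Exception):
    """Raised when the agent asks for a target outside its documented boundary."""

def guarded_delete(environment: str, table: str) -> str:
    """Wrapper the agent calls instead of the raw tool; the boundary is code, not prose."""
    rules = ALLOW_LIST["delete_rows"]
    if environment not in rules["environments"] or table not in rules["tables"]:
        raise ToolBoundaryError(f"delete_rows not permitted: {environment}/{table}")
    return f"deleted rows in {environment}.{table}"  # the real call would go here
```

Because the list lives in code and version control, a boundary change is a reviewed diff, not a prompt edit.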
Does every state-changing tool call get logged with input, output, agent identity, and timestamp, retained per your policy? An incident response that reads "we are not sure what the agent did" has answered the question for you.
Ready: structured logs, queryable, retention policy met
Gap: partial logging, missing the irreversible operations
Mitigation: PostToolUse hook → JSONL ledger → log retention
Does the agent run with the minimum privileges it needs, in an environment that cannot reach production data unless explicitly granted? "Read access for context" is the sentence that begins most data-exposure incident reports.
Ready: least-privilege roles, prod isolated, explicit grants only
Gap: agent runs in shared service account or dev environment with prod credentials
Mitigation: dedicated service account + network isolation + explicit grants
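The "explicit grants only" posture reduces to a deny-by-default table; the agent and resource names here are hypothetical:

```python
# Deny-by-default grant table: the agent touches nothing it was not explicitly granted.
GRANTS = {
    ("support-agent", "tickets_db:read"),
    ("support-agent", "kb_search:read"),
}

def authorized(agent: str, resource: str) -> bool:
    """No entry means no access; there is no wildcard and no inherited role."""
    return (agent, resource) in GRANTS
```

The real enforcement belongs in your IAM layer; the sketch shows the shape of the policy, which is a finite list you can put in front of an auditor.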
Does the agent have direct access to long-lived credentials it could exfiltrate, or only to short-lived signed tokens scoped to specific operations? An agent that holds an AWS key holds the keys to an entire account.
Ready: ephemeral tokens, short TTL, scoped to operation
Gap: long-lived keys in environment variables
Mitigation: token broker + per-call signing
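A token broker can be sketched with HMAC-signed, operation-scoped claims; the secret and operation names are placeholders, and a real broker would live in a separate service the agent cannot read:

```python
import base64
import hashlib
import hmac
import json
import time

BROKER_SECRET = b"rotate-me"  # held by the broker only, never by the agent

def mint_token(operation: str, ttl_s: int = 60) -> str:
    """Short-lived token scoped to one operation; the agent never sees a long-lived key."""
    claims = {"op": operation, "exp": time.time() + ttl_s}
    body = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
    sig = hmac.new(BROKER_SECRET, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"

def verify_token(token: str, operation: str) -> bool:
    """Reject forged, rescoped, or expired tokens."""
    body, sig = token.rsplit(".", 1)
    expected = hmac.new(BROKER_SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False
    claims = json.loads(base64.urlsafe_b64decode(body))
    return claims["op"] == operation and claims["exp"] > time.time()
```

A stolen token now buys one operation for sixty seconds, not an account.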
For every input source the agent reads (user messages, web pages, emails, files), is there a defense that survives instruction-override attacks? "We tell the user not to do that" is not a defense. It is documentation of the attack surface.
Ready: input sanitization, source labeling, output validation, untrusted-input flag
Gap: defenses live in the system prompt, not in code
Mitigation: structured input parsing + output schema validation + untrusted-source flag
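A sketch of source labeling, assuming the enforcement lives in code around the model rather than in the prompt; the source names and fence format are illustrative:

```python
TRUSTED_SOURCES = {"system", "operator"}

def label_input(source: str, text: str) -> dict:
    """Carry provenance with every message so downstream checks can treat
    web/email text as data, never as instructions."""
    return {"source": source, "trusted": source in TRUSTED_SOURCES, "text": text}

def render_for_model(messages) -> str:
    """Fence and label untrusted content; the tool layer, not the prompt,
    decides what a message from that source may trigger."""
    parts = []
    for m in messages:
        if m["trusted"]:
            parts.append(m["text"])
        else:
            parts.append(f"<untrusted source={m['source']}>\n{m['text']}\n</untrusted>")
    return "\n".join(parts)
```

The label is not the defense by itself; it is what lets the tool wrapper refuse destructive calls whose instruction chain runs through an untrusted source.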
Does the agent verify intent before executing irreversible operations (delete, send, transfer)? The 30-second annoyance of a discipline check is worth more than the 8-hour incident report when the agent did the wrong thing in the wrong place.
Ready: structured pre-call check on destructive ops, hard-blocked on uncertainty
Gap: no pre-call gate, agent trusts its own intent
Mitigation: PreToolUse hook with intent verification on destructive ops
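A minimal pre-call gate that fails closed; the operation names and the shape of the intent record are assumptions, not a specific framework's PreToolUse API:

```python
DESTRUCTIVE_OPS = {"delete", "send", "transfer"}

class Blocked(Exception):
    """Raised when a destructive call arrives without verified intent."""

def pre_tool_gate(tool: str, declared_intent: str = "", confirmed: bool = False) -> str:
    """Non-destructive ops pass through; destructive ops require a declared
    intent and a confirmation, and uncertainty is a hard block."""
    if tool not in DESTRUCTIVE_OPS:
        return "allow"
    if not declared_intent or not confirmed:
        raise Blocked(f"{tool} blocked: intent or confirmation missing")
    return "allow"
```

The 30-second cost lives in the `confirmed` step; the 8-hour cost lives in its absence.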
For every workflow the agent owns, is there an offline eval suite that runs before deploy and a production traffic monitor that flags drift? An agent without evals is an agent whose quality regressions ship to customers.
Ready: deploy gate on eval pass, prod traffic sampled and scored
Gap: evals exist for happy path, drift undetected
Mitigation: CI eval gate + statistical drift alarm on prod sample
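Both halves of the mitigation fit in a few lines; the thresholds are illustrative, not recommendations:

```python
def eval_gate(results, threshold: float = 0.95) -> bool:
    """CI deploy gate: block when the offline eval pass rate drops below the bar."""
    return sum(results) / len(results) >= threshold

def drift_alarm(baseline_mean: float, window, tolerance: float = 0.05) -> bool:
    """Production monitor: alarm when the mean score of sampled, scored
    traffic drifts beyond tolerance from the offline baseline."""
    mean = sum(window) / len(window)
    return abs(mean - baseline_mean) > tolerance
```

The gate catches regressions before deploy; the alarm catches the ones that only appear on real traffic.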
Can a bad agent action be reversed within the SLA your incident-response policy requires? A row deleted, an email sent, a transfer initiated. Some operations are reversible. Some are not. The audit names which is which.
Ready: reversible ops have rollback, irreversible ops have human-gate
Gap: classification of reversible/irreversible has not been done
Mitigation: per-tool reversibility classification + rollback or human-gate
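The classification plus human-gate can be sketched as a dispatch table; the tool names and the approval callback are hypothetical:

```python
# Hypothetical per-tool classification; the audit does this once per tool.
TOOL_REVERSIBILITY = {
    "update_row": "reversible",      # rollback: restore prior value from the audit log
    "send_email": "irreversible",
    "wire_transfer": "irreversible",
}

def dispatch(tool, execute, request_human_approval):
    """Reversible ops run; irreversible ops run only past a human gate.
    Unknown tools fail to the safe side."""
    kind = TOOL_REVERSIBILITY.get(tool, "irreversible")
    if kind == "irreversible" and not request_human_approval(tool):
        return "held for human review"
    return execute()
```

The classification is the audit artifact; the dispatch is the enforcement.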
Are training data, eval data, and live traffic strictly separated, with audit on each crossing? The most expensive incidents start with "we used a snapshot of prod for testing".
Ready: three environments, signed crossing, PII scrub before downgrade
Gap: live data used in eval, no audit on copy
Mitigation: environment policy + automated PII scrubber + crossing log
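A sketch of the scrub-and-log step at the environment boundary; the single email regex stands in for a real PII scrubber, which needs far more than one pattern:

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def downgrade(record: str, crossing_log: list, src: str = "prod", dst: str = "eval") -> str:
    """Scrub obvious PII before data crosses environments,
    and log the crossing itself so every copy is auditable."""
    scrubbed = EMAIL.sub("<email>", record)
    crossing_log.append({"from": src, "to": dst, "scrubbed": scrubbed != record})
    return scrubbed
```

The crossing log is the part teams skip; it is also the part that answers "how did prod data get into the eval set" without forensics.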
If the model vendor or one of their dependencies is breached, what data of yours is exposed? The vendor cannot indemnify you out of an incident their contract says they will not cover. The audit asks the question while you can still negotiate.
Ready: data classification + vendor DPA + breach-notification SLA documented
Gap: data sent to vendor includes PII or trade secrets, no DPA
Mitigation: data classification + minimization at vendor boundary
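Minimization at the vendor boundary reduces to a classification filter; the field names and classes here are invented:

```python
# Hypothetical classification of every field the agent might forward.
FIELD_CLASSIFICATION = {
    "ticket_text": "internal",
    "customer_ssn": "restricted",
    "pricing_model": "trade_secret",
}
SENDABLE = {"public", "internal"}

def minimize_for_vendor(payload: dict) -> dict:
    """Strip anything not classified as safe to leave the boundary;
    unclassified fields are treated as restricted and dropped."""
    return {k: v for k, v in payload.items()
            if FIELD_CLASSIFICATION.get(k, "restricted") in SENDABLE}
```

If the vendor is breached, the exposure is exactly the `SENDABLE` set, which is a list you can hand to legal before signing, not after.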
When the agent fails, do you know in real time, or do you find out from a customer ticket? The mean time between agent failure and customer-visible incident is shorter than most monitoring dashboards' refresh interval.
Ready: structured error events, alerting threshold, on-call assignment
Gap: errors logged but not alerted; on-call learns from customers
Mitigation: error event stream + threshold alarm + on-call rotation
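A sliding-window threshold alarm is the smallest version of the mitigation; the threshold and window values are placeholders to tune against your traffic:

```python
import time
from collections import deque

class ErrorAlarm:
    """Fire when the error count in a sliding window crosses the threshold,
    instead of waiting for a customer ticket."""

    def __init__(self, threshold: int = 5, window_s: float = 60.0):
        self.threshold = threshold
        self.window_s = window_s
        self.events = deque()

    def record(self, now: float = None) -> bool:
        """Record one error event; return True when on-call should be paged."""
        now = now if now is not None else time.time()
        self.events.append(now)
        while self.events and self.events[0] < now - self.window_s:
            self.events.popleft()
        return len(self.events) >= self.threshold
```

Wire `record()` into the agent's error path and route `True` to your paging system; the dashboard stays for diagnosis, not detection.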
Security and engineering have a shared scorecard naming which of the 12 controls are ready, which are gaps, and which mitigations are required. The CISO has the document the next board update will reference.
Audit trail (control 3), pre-tool-call gate (control 7), and failure-mode visibility (control 12) are installed. These are the engineering-hours controls; no architecture changes required. The team has live monitoring on agent failures within two weeks.
Sandbox separation (control 4), tool boundary documentation (control 2), prompt injection defenses (control 6) are installed. These need engineering planning and security review. The deployment posture changes from "agent runs in shared infra with broad credentials" to "agent runs with documented boundaries and engineered limits".
The completed scorecard becomes the AI-specific supplement to your next SOC 2 cycle. It also becomes the document that survives a CISO transition or a board-level inquiry, because it names the controls in the language your auditor and your insurer will both understand.
One email. The PDF, the editable scorecard, and the install playbook for each control. No drip sequence, no nurture funnel, no tactics.
Get the audit