AI Agent Security Review Checklist

Map capabilities before deployment

List every tool the agent can call: shell commands, browsers, file systems, databases, ticket systems, email, calendars, cloud APIs, and deployment tools. A capability map makes hidden risk visible.

For each tool, identify read permissions, write permissions, destructive actions, credential exposure, and whether humans approve actions before execution.
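A capability map can be as simple as one record per tool. The sketch below is a minimal illustration, not a prescribed schema: the `ToolCapability` fields, the `unreviewed_risks` helper, and the sample entries are all hypothetical names chosen to mirror the attributes listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCapability:
    """One row of a hypothetical capability map."""
    name: str
    can_read: bool
    can_write: bool
    destructive: bool
    exposes_credentials: bool
    requires_approval: bool


def unreviewed_risks(capabilities):
    """Names of tools that can write, destroy, or expose credentials
    without a human approving the action first."""
    return [
        c.name
        for c in capabilities
        if (c.can_write or c.destructive or c.exposes_credentials)
        and not c.requires_approval
    ]


# Illustrative entries only; a real map would list every tool the agent can call.
CAPABILITY_MAP = [
    ToolCapability("shell", True, True, True, True, True),
    ToolCapability("ticket_system", True, True, False, False, False),
    ToolCapability("calendar", True, False, False, False, False),
]

print(unreviewed_risks(CAPABILITY_MAP))  # ['ticket_system']
```

Keeping the map as data makes it possible to review it like any other change and to flag risky tools that lack an approval step.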

Defend against prompt injection

Agents that read webpages, documents, tickets, or emails can encounter malicious instructions embedded in content. Treat external text as untrusted input, even when it appears inside a normal document.

Use explicit tool policies, retrieval boundaries, and review steps for sensitive actions. The model should not be able to override system rules because a webpage asked it to.
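One way to keep a webpage from overriding system rules is to decide on tool calls in code that never reads the model's text. The following is a rough sketch under assumptions: the tool names, the `TOOL_POLICY` table, and the `context_has_external_text` flag are hypothetical, and a real deployment would need a reliable way to set that flag.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_REVIEW = "needs_review"
    DENY = "deny"


# Hypothetical policy table; it lives in code or config, outside the model's reach.
TOOL_POLICY = {
    "search_docs": Decision.ALLOW,
    "create_ticket": Decision.ALLOW,
    "send_email": Decision.NEEDS_REVIEW,
}

# Hypothetical set of tools that change state outside the agent.
WRITE_TOOLS = {"create_ticket", "send_email"}


def evaluate(tool: str, context_has_external_text: bool) -> Decision:
    """Decide on a requested tool call without consulting the model's text."""
    # Tools missing from the policy are denied by default.
    decision = TOOL_POLICY.get(tool, Decision.DENY)
    # Once untrusted content is in context, writes go to a human reviewer.
    if (
        decision is Decision.ALLOW
        and tool in WRITE_TOOLS
        and context_has_external_text
    ):
        return Decision.NEEDS_REVIEW
    return decision


print(evaluate("search_docs", True).value)      # allow
print(evaluate("create_ticket", True).value)    # needs_review
print(evaluate("delete_records", False).value)  # deny
```

Because the decision depends only on the tool name and a provenance flag, injected instructions can at most cause a request that the policy then denies or routes to review.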

Log actions, not just messages

A useful audit trail includes prompts, retrieved context, tool calls, files touched, commands run, API calls, approvals, and final outputs. Logs should be searchable so that responders can reconstruct what the agent did when an incident happens.

For engineering agents, keep generated changes in normal version-control workflows. For business automation agents, store enough context to explain why an action was taken.
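An audit trail of this kind could be stored as one JSON object per event, which keeps it easy to search. This is a minimal sketch, assuming an in-memory list stands in for real log storage; the field names, `audit_record`, and `search` are hypothetical.

```python
import json
from datetime import datetime, timezone


def audit_record(session_id: str, event_type: str, detail: dict, reason: str = "") -> str:
    """Serialize one agent event as a single JSON line."""
    return json.dumps(
        {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "session_id": session_id,
            "event_type": event_type,  # e.g. "prompt", "tool_call", "approval"
            "detail": detail,
            "reason": reason,  # why the action was taken, for later explanation
        },
        sort_keys=True,
    )


def search(lines, **filters):
    """Return parsed records whose top-level fields equal every filter."""
    records = [json.loads(line) for line in lines]
    return [r for r in records if all(r.get(k) == v for k, v in filters.items())]


# Illustrative events; a real agent would append these as they happen.
log = [
    audit_record("s-1", "prompt", {"text": "Close stale tickets"}),
    audit_record(
        "s-1",
        "tool_call",
        {"tool": "ticket_system", "action": "close", "id": 42},
        reason="Ticket inactive for 90 days",
    ),
    audit_record("s-1", "approval", {"approved_by": "alice"}),
]

for record in search(log, event_type="tool_call"):
    print(record["detail"]["tool"], "-", record["reason"])
```

Recording the action and its reason together is what lets a reviewer later explain why the agent did something, not only what it said.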

Use staged permissions

Begin with read-only access, then add limited write access, and only then move to scoped automation. Each stage should have rollback instructions and a clear owner.
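The staged rollout can be expressed as an ordered set of stages, each of which must name an owner and rollback instructions before it is valid. The sketch below is illustrative only: `Stage`, `Rollout`, the action names, and the example owner are assumptions, not part of any particular framework.

```python
from dataclasses import dataclass
from enum import IntEnum


class Stage(IntEnum):
    READ_ONLY = 1
    LIMITED_WRITE = 2
    SCOPED_AUTOMATION = 3


# Hypothetical mapping from action kind to the earliest stage that permits it.
REQUIRED_STAGE = {
    "read": Stage.READ_ONLY,
    "write": Stage.LIMITED_WRITE,
    "automate": Stage.SCOPED_AUTOMATION,
}


@dataclass(frozen=True)
class Rollout:
    stage: Stage
    owner: str
    rollback_instructions: str

    def __post_init__(self):
        # A stage without an owner or a rollback plan is rejected outright.
        if not self.owner.strip() or not self.rollback_instructions.strip():
            raise ValueError("each stage needs an owner and rollback instructions")


def is_permitted(rollout: Rollout, action: str) -> bool:
    """True if the current stage is at or beyond the stage the action requires."""
    required = REQUIRED_STAGE.get(action)
    # Unknown action kinds are never permitted.
    return required is not None and rollout.stage >= required


current = Rollout(Stage.LIMITED_WRITE, "platform-team", "Revoke the write token")
print(is_permitted(current, "read"))      # True
print(is_permitted(current, "write"))     # True
print(is_permitted(current, "automate"))  # False
```

Promoting the agent then means changing one value under review, and demoting it is the same change in reverse.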

The goal is not to remove humans from every step. The goal is to reserve human attention for decisions that carry security, financial, legal, or customer trust risk.

Practical checklist

1. Inventory every tool and permission.
2. Treat external content as untrusted.
3. Require approval for sensitive actions.
4. Log tool calls and outputs.
5. Roll out permissions in stages.
