Map capabilities before deployment
List every tool the agent can call: shell commands, browsers, file systems, databases, ticket systems, email, calendars, cloud APIs, and deployment tools. A capability map makes hidden risk visible.
For each tool, identify read permissions, write permissions, destructive actions, credential exposure, and whether humans approve actions before execution.
Defend against prompt injection
Agents that read webpages, documents, tickets, or emails can encounter malicious instructions embedded in content. Treat external text as untrusted input, even when it appears inside a normal document.
Use explicit tool policies, retrieval boundaries, and review steps for sensitive actions. The model should not be able to override system rules because a webpage asked it to.
Log actions, not just messages
A useful audit trail includes prompts, retrieved context, tool calls, files touched, commands run, API calls, approvals, and final outputs. Logs should be searchable when an incident happens.
For engineering agents, keep generated changes in normal version-control workflows. For business automation agents, store enough context to explain why an action was taken.
Use staged permissions
Begin with read-only access, then limited write access, then scoped automation. Each stage should have rollback instructions and a clear owner.
The goal is not to remove humans from every step. The goal is to reserve human attention for decisions that carry security, financial, legal, or customer trust risk.
Practical checklist
- 1Inventory every tool and permission.
- 2Treat external content as untrusted.
- 3Require approval for sensitive actions.
- 4Log tool calls and outputs.
- 5Roll out permissions in stages.
Related comparisons