Agentic AI in Security: Why Control Posture Matters More Than Capability

Agentic AI is no longer a research concept. It is arriving in security tooling right now: in vulnerability management platforms, in SIEM enrichment pipelines, in incident response workflows. Whether or not your organisation has made a deliberate decision about it, it is probably already touching your environment in some form.

My concern is not the capability. The capability is genuinely impressive. My concern is that most of the conversation around agentic AI in security is being led by vendors and AI engineers — not by the people who will be held accountable when something goes wrong.

That needs to change.

What “agentic” actually means

A standard AI interaction is a question and an answer. You ask, it responds, you decide what to do with that response.

An agentic system is different. It is given a goal and a set of tools, and it takes a sequence of autonomous actions to achieve that goal. It can query APIs, read logs, write tickets, execute scripts, make decisions mid-task based on what it finds, and loop back on itself when results are not what it expected.
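The loop described above can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: `run_agent`, `choose_next`, and the tool names are hypothetical, and `choose_next` is a scripted stand-in for the model that would pick the next step in a real system. The step limit is my own assumption, added so the sketch stops and escalates rather than looping indefinitely.

```python
from typing import Callable, Dict, List, Optional, Tuple

Step = Tuple[str, str]          # (tool name, argument)
Trace = List[Tuple[str, str, str]]  # (tool name, argument, result)


def run_agent(
    goal: str,
    tools: Dict[str, Callable[[str], str]],
    choose_next: Callable[[str, Trace], Optional[Step]],
    max_steps: int = 5,
) -> Trace:
    """Run tool calls until choose_next returns None or the step limit is hit."""
    history: Trace = []
    for _ in range(max_steps):
        step = choose_next(goal, history)  # decision made mid-task, from results so far
        if step is None:                   # the planner considers the goal met
            return history
        tool_name, argument = step
        result = tools[tool_name](argument)
        history.append((tool_name, argument, result))
    # Step limit reached: record an escalation instead of continuing.
    history.append(("escalate", goal, "step limit reached"))
    return history


# Hypothetical tools and a scripted planner, for illustration only.
tools = {
    "read_logs": lambda host: f"3 failed logins on {host}",
    "write_ticket": lambda summary: f"ticket opened: {summary}",
}


def scripted_planner(goal: str, history: Trace) -> Optional[Step]:
    if not history:
        return ("read_logs", "ws-042")
    if len(history) == 1:
        return ("write_ticket", history[0][2])  # acts on what the first step found
    return None


trace = run_agent("triage ws-042", tools, scripted_planner)
```

Even in this toy version, the second action depends on the output of the first. That dependency is what separates an agent from a question-and-answer interaction.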

The distinction matters enormously in security. A system that gives an analyst a recommendation is a decision-support tool. A system that acts on that recommendation without waiting is an autonomous agent — and it needs to be governed accordingly.

The control question no-one is asking

When I evaluate any new capability in a security environment, the first question is not “what can this do?” It is “what happens when this goes wrong, and who is accountable?”

Agentic AI sharpens that question considerably, because the failure modes are novel.

Traditional automation fails in predictable ways. A script either runs or it does not. An API call either succeeds or returns an error. Agentic systems fail in probabilistic, context-dependent ways. They can take a sequence of individually reasonable steps that leads to an unreasonable outcome. They can be manipulated by the data they are processing. They can make confident decisions based on incomplete information.

This is not an argument against using them. It is an argument for being precise about what controls you put around them.

Human-in-the-loop vs human-on-the-loop

This is the most important architectural decision you will make when deploying agentic AI in a security context, and it is worth being exact about what each means.

Human-in-the-loop means the agent pauses and requests authorisation before taking any consequential action. The human is a gate. Nothing happens without explicit sign-off. The agent might identify a suspicious lateral-movement pattern and assemble all the evidence, but the decision to isolate the affected host sits with an analyst.

Human-on-the-loop means the agent acts autonomously and the human monitors, with the ability to intervene or reverse. The agent isolates the host, the analyst sees it happen in real time, and can roll back if the decision was wrong.

Both have legitimate use cases. The error I see organisations making is defaulting to human-on-the-loop because it is faster, without working through what that means for their control framework.

Consider what your auditor will ask. For any security action — blocking an IP, modifying an access-control list, quarantining a file, disabling an account — you need to be able to demonstrate that the action was authorised, proportionate, and documented. Human-in-the-loop gives you that evidence naturally — the approval is the record. Human-on-the-loop requires you to design that evidence into the system deliberately, because the action and the authorisation are no longer the same event.
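To make the difference in evidence concrete, here is a minimal sketch of the two modes. All names (`ProposedAction`, `ActionRecord`, `run_in_the_loop`, `run_on_the_loop`) are hypothetical, and the `approve`, `execute`, and `notify` callables stand in for whatever ticketing, EDR, or alerting integration an organisation actually uses.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass(frozen=True)
class ProposedAction:
    name: str      # hypothetical action name, e.g. "isolate_host"
    target: str    # what the action applies to
    evidence: str  # summary the agent assembled for the analyst


@dataclass(frozen=True)
class ActionRecord:
    action: ProposedAction
    mode: str                   # "in-the-loop" or "on-the-loop"
    approved_by: Optional[str]  # analyst identity, or None if nobody approved
    executed: bool
    timestamp: str


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


def run_in_the_loop(
    action: ProposedAction,
    approve: Callable[[ProposedAction], Optional[str]],
    execute: Callable[[ProposedAction], None],
) -> ActionRecord:
    """Gate: nothing executes unless approve() returns an analyst identity."""
    analyst = approve(action)
    if analyst is not None:
        execute(action)
    return ActionRecord(action, "in-the-loop", analyst, analyst is not None, _now())


def run_on_the_loop(
    action: ProposedAction,
    execute: Callable[[ProposedAction], None],
    notify: Callable[[ActionRecord], None],
) -> ActionRecord:
    """Act first, then inform the human; there is no approval to record."""
    execute(action)
    record = ActionRecord(action, "on-the-loop", None, True, _now())
    notify(record)
    return record
```

In the first function the record carries an `approved_by` value, so the approval and the audit evidence are the same event. In the second, `approved_by` is always empty, which is the gap that has to be filled by design, for example with a pre-approved policy reference and a tested rollback path.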

My recommendation for most organisations starting out: begin with human-in-the-loop for anything with write permissions. You can relax that control once you have evidence of decision quality. Going the other direction — tightening controls on an autonomous system after something goes wrong — is a much harder conversation to have with a board or a regulator.

Least privilege is not optional

The principle of least privilege applies to agentic systems with the same force it applies to human users and service accounts — arguably more so, given the autonomous nature of the actions.

An agent that needs to read logs does not need write access to your SIEM. An agent that needs to query your vulnerability scanner does not need credentials to your firewall-management console. An agent that triages alerts does not need the ability to execute remediation scripts.

This sounds obvious. In practice, I have seen agentic deployments where the agent was given broad API access because it was easier to configure, with the intention of tightening it later. Later rarely comes, and broad access on an autonomous system is a significant risk — not just from the agent making a mistake, but from an attacker who finds a way to manipulate the agent’s behaviour.

The blast-radius principle should govern every permission decision. Ask: if this agent were to take the worst plausible action using the access it has, what is the impact? If the answer makes you uncomfortable, reduce the access.
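One way to make this reviewable is a deny-by-default allowlist per agent. The sketch below is illustrative only: `AgentScope`, `authorise`, `non_read_grants`, and the resource names are hypothetical, and in a real deployment the enforcement would sit in the API gateway or IAM layer rather than in the agent's own process.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

Grant = Tuple[str, str]  # (resource, verb), e.g. ("siem_logs", "read")


class PermissionDenied(Exception):
    """Raised when an agent requests something outside its scope."""


@dataclass(frozen=True)
class AgentScope:
    name: str
    allowed: FrozenSet[Grant]


def authorise(scope: AgentScope, resource: str, verb: str) -> None:
    """Deny by default: anything not explicitly granted is refused."""
    if (resource, verb) not in scope.allowed:
        raise PermissionDenied(f"{scope.name} may not {verb} {resource}")


def non_read_grants(scope: AgentScope) -> List[Grant]:
    """List every grant that can change state, as input to a blast-radius review."""
    return sorted(grant for grant in scope.allowed if grant[1] != "read")


# Hypothetical scope for an alert-triage agent.
triage = AgentScope(
    name="alert-triage",
    allowed=frozenset({("siem_logs", "read"), ("tickets", "write")}),
)
```

Calling `non_read_grants(triage)` gives the short list of things the agent could change in the worst plausible case. If that list is longer than the agent's function justifies, the access is too broad.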

Prompt injection: the attack surface most teams are missing

This deserves its own section because it is genuinely underappreciated.

Prompt injection is what happens when an attacker embeds instructions in data that the agent will process, with the intent of manipulating the agent’s behaviour. In a security context, the attack surface is substantial: log files, email subjects, file names, ticket descriptions, CVE summaries, DNS records.

An attacker who knows your environment uses an agentic triage system could craft a log entry designed to instruct the agent to deprioritise a genuine alert, exfiltrate a finding to an external endpoint, or generate a false negative in a compliance report. The agent reads the log as data, and the embedded instruction arrives in that same stream. If the system is not designed to separate instructions from data, the agent may act on the embedded instruction as though it came from its operator.

Defending against this requires explicit design choices: input sanitisation before data reaches the agent, separation between the agent’s instruction context and the data context it is operating on, and anomaly detection on agent behaviour itself. It is not a solved problem, but awareness of it should be a prerequisite for any agentic security deployment.
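The first two design choices can be sketched as follows. This is a minimal illustration under loud assumptions: `build_request`, the field names, and the pattern list are all hypothetical, and pattern matching will not catch a determined attacker. The point is the shape: operator instructions and untrusted data travel in separate fields, and suspicious data is flagged rather than silently passed through.

```python
import re
from typing import Dict, Union

# Illustrative patterns only; a real deployment would need far more than this.
SUSPICIOUS = re.compile(
    r"ignore (all |any )?(previous|prior) instructions"
    r"|disregard (the |your )?instructions"
    r"|you are now",
    re.IGNORECASE,
)


def strip_control_chars(text: str) -> str:
    """Drop non-printable characters, keeping newlines and tabs."""
    return "".join(ch for ch in text if ch.isprintable() or ch in "\n\t")


def build_request(instructions: str, untrusted: str) -> Dict[str, Union[str, bool]]:
    """Keep operator instructions and untrusted data in separate fields."""
    cleaned = strip_control_chars(untrusted)
    return {
        "instructions": instructions,  # written by the operator, never by the data
        "data": cleaned,               # log line, ticket text, file name, and so on
        "data_flagged": bool(SUSPICIOUS.search(cleaned)),
    }
```

A flagged request is a natural trigger for the third control, anomaly detection on agent behaviour: route it to an analyst instead of letting the agent act on it.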

What a good control posture looks like

Drawing this together into something actionable — when I am assessing an agentic AI deployment in a security environment, these are the controls I expect to see:

Scope boundaries. The agent’s operational scope is explicitly defined and technically enforced. It cannot act outside that scope regardless of what it is instructed to do.
Permission minimisation. The agent holds only the permissions required for its defined function. Permissions are reviewed on the same cycle as service accounts.
Tamper-evident audit trail. Every action the agent takes — including the reasoning that led to it — is logged in a way that cannot be modified by the agent itself. This is non-negotiable for compliance.
Authorisation gates for consequential actions. Any action that modifies system state requires human authorisation, at least until decision quality has been demonstrated and documented.
Input validation. Data processed by the agent is treated as untrusted input and handled accordingly, with awareness of prompt injection as a specific threat vector.
Failure-mode design. What happens when the agent encounters something outside its expected parameters? It should fail safe — escalate to a human, stop and log — not fail open.
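Of these controls, the tamper-evident audit trail is the one most easily shown in code. The sketch below uses a simple hash chain, which is one common approach and not the only one. The function names and entry fields are hypothetical. It also assumes the latest hash is stored somewhere the agent cannot write to. Without that, the agent could rewrite the whole chain.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev_hash: str, entry: Dict[str, Any]) -> str:
    """Hash the previous record's hash together with a canonical form of the entry."""
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: List[Dict[str, Any]], entry: Dict[str, Any]) -> None:
    """Append an entry, chaining it to the hash of the record before it."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"entry": entry, "prev": prev, "hash": _digest(prev, entry)})


def verify(log: List[Dict[str, Any]]) -> bool:
    """Recompute every hash; any edited, removed, or reordered record breaks the chain."""
    prev = GENESIS
    for record in log:
        if record["prev"] != prev or record["hash"] != _digest(prev, record["entry"]):
            return False
        prev = record["hash"]
    return True


# Hypothetical entries: the action taken plus the reasoning that led to it.
audit_log: List[Dict[str, Any]] = []
append_entry(audit_log, {"action": "read_logs", "reasoning": "alert on ws-042"})
append_entry(audit_log, {"action": "propose_isolation", "reasoning": "lateral movement"})
```

Changing any earlier entry after the fact makes `verify` return `False`, which is the property an auditor cares about: the record may not be impossible to alter, but alteration is detectable.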

The compliance position

For those operating under formal frameworks — ISO 27001, SOC 2, NIS2, DORA — the good news is that none of these frameworks prohibit agentic AI. The controls they require around automated processing, access management, audit logging, and change authorisation map reasonably well onto what good agentic AI governance looks like.

The risk is not incompatibility. The risk is assuming that because the AI made the decision, the accountability has moved somewhere else. It has not. The organisation that deployed the agent is accountable for what it does. The CISO who signed off on the deployment is accountable for the control posture around it.

That accountability is not a reason to avoid the technology. Agentic AI has genuine potential to improve security outcomes — faster triage, more consistent analysis, better coverage at scale. But that potential is only realised safely if the people accountable for security outcomes are the ones shaping how it is deployed.

Where to start

If you are scoping an agentic AI capability for your security environment, I would suggest starting with a read-only, human-in-the-loop use case: something like vulnerability prioritisation, compliance-gap reporting, or log correlation that produces a recommendation for an analyst to act on.

This gives you real operational experience with how the agent performs, what its failure modes look like, and what the audit trail needs to contain — before you extend any write permissions or move toward autonomous action.

Build the governance first. The capability will follow.

What this guide deliberately does not cover

Vendor selection. Which agentic platform to choose is a separate evaluation, out of scope here.
ROI and cost analysis. The economic case is real but is its own argument.
Non-security agentic AI. The control posture above is built for the security context; many of the same principles apply elsewhere, but the threat model is different.
Reference architecture, threat-modelling details, and audit-trail design. These are covered in the follow-up piece in this series.

The second piece in this series goes deeper: reference architecture, threat-modelling an agentic pipeline, prompt-injection defences, and audit-trail design for compliance.
