An AI agent is software built around a large language model that pursues a goal with little step-by-step direction. Give it a task and it plans the steps, chooses tools like a search API or a database, takes an action, checks the result, and loops until it is done. Anthropic describes agents as LLMs that dynamically direct their own tool usage. That autonomy is the whole point, and it is what separates an agent from a chatbot.
What is an AI agent, exactly?
The vendors building agents agree on the core idea. OpenAI's Practical Guide to Building Agents (April 2025) calls them "systems that independently accomplish tasks on your behalf." IBM defines an agent as a system "capable of autonomously performing tasks on behalf of a user by designing its workflow and utilising available tools."
The common thread is autonomy over a workflow, not a single reply. A chatbot answers the question in front of it. An agent takes a goal, breaks it into steps, and works through them, deciding for itself which tool to reach for at each point.
Three parts make that possible. First, a model: the LLM that does the reasoning. Second, a set of tools: the APIs, databases and functions it can call to gather information or act on the world. Third, memory, so it carries context from one step to the next rather than starting fresh each time.
How does an AI agent actually work?
Under the surface, an agent runs a simple cycle. Anthropic's engineering team puts it plainly: agents "are typically just LLMs using tools based on environmental feedback in a loop." The model observes, decides on a step, acts, then reads the result and decides again.
Say you ask an agent to reconcile last month's supplier invoices. It reads the inbox, pulls each PDF, calls an accounting API to match line items, flags three mismatches, drafts a query email for each, and reports back. No human picks the tools at each stage. The agent does.
That feedback matters. Anthropic notes it is crucial for an agent to get "ground truth" from the environment at each step, such as a tool result or a code-execution output, so it can judge its own progress. Without that check, an agent drifts. With it, the agent can self-correct partway through a job.
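The cycle described above can be sketched in a few lines of Python. This is a toy illustration, not any vendor's implementation: `fake_model` is a hypothetical stand-in for a real LLM call, and the two tools and their invoice data are invented for the example. It shows the model choosing a tool, the loop running it, and the result going into memory as the "ground truth" for the next decision.

```python
from typing import Callable

# Hypothetical tools: stand-ins for real systems such as an inbox or accounting API.
def fetch_invoice(invoice_id: str) -> dict:
    invoices = {"INV-1": {"total": 120.0}, "INV-2": {"total": 75.5}}
    return invoices[invoice_id]

def fetch_ledger_entry(invoice_id: str) -> dict:
    ledger = {"INV-1": {"total": 120.0}, "INV-2": {"total": 70.0}}
    return ledger[invoice_id]

TOOLS: dict[str, Callable[[str], dict]] = {
    "fetch_invoice": fetch_invoice,
    "fetch_ledger_entry": fetch_ledger_entry,
}

def fake_model(goal: str, memory: list[dict]) -> dict:
    """Stand-in for an LLM call (assumption): picks the next step from memory."""
    results = {step["tool"]: step["result"] for step in memory}
    for tool in ("fetch_invoice", "fetch_ledger_entry"):
        if tool not in results:
            return {"action": "call_tool", "tool": tool, "arg": goal}
    # Both tool results are available, so the stand-in model can judge the outcome.
    same = results["fetch_invoice"]["total"] == results["fetch_ledger_entry"]["total"]
    return {"action": "finish", "answer": "match" if same else "mismatch"}

def run_agent(goal: str, max_steps: int = 5) -> str:
    memory: list[dict] = []  # context carried from one step to the next
    for _ in range(max_steps):
        decision = fake_model(goal, memory)                # decide
        if decision["action"] == "finish":
            return decision["answer"]
        result = TOOLS[decision["tool"]](decision["arg"])  # act
        memory.append({"tool": decision["tool"], "result": result})  # observe
    return "gave up"  # step budget exhausted before the model finished

print(run_agent("INV-1"))
print(run_agent("INV-2"))
```

The `max_steps` cap is a design choice for the sketch: it stops a loop that never reaches a decision, which is one simple guard against the drift described above.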
Agents versus workflows
Not everything marketed as an agent is one. Anthropic draws a useful line: in a workflow, "LLMs and tools are orchestrated through predefined code paths." In an agent, the LLM "dynamically directs its own processes and tool usage." A fixed pipeline that always runs the same five steps is a workflow, however clever. An agent decides the steps as it goes.
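One way to see the distinction is to run the same steps two ways. The sketch below is illustrative only: the step functions, the `choose_next_step` rule (a hypothetical stand-in for the LLM's decision) and the ticket text are all assumptions made for the example. The workflow always runs the same path, while the agent-style loop picks each step from the current state.

```python
from typing import Callable, Optional

# Hypothetical steps that each read and update a shared state dictionary.
def clean(state: dict) -> dict:
    state["text"] = state["text"].strip().lower()
    return state

def look_up_order(state: dict) -> dict:
    state["order_found"] = True  # a real step would query an order system
    return state

def draft_reply(state: dict) -> dict:
    state["reply"] = "Draft reply for: " + state["text"]
    return state

STEPS: dict[str, Callable[[dict], dict]] = {
    "clean": clean,
    "look_up_order": look_up_order,
    "draft_reply": draft_reply,
}

def run_workflow(text: str) -> dict:
    """Workflow: a predefined code path, the same steps every time."""
    state = {"text": text, "trace": []}
    for name in ("clean", "look_up_order", "draft_reply"):
        state = STEPS[name](state)
        state["trace"].append(name)
    return state

def choose_next_step(state: dict) -> Optional[str]:
    """Stand-in for the model's decision (assumption): inspect state, pick a step."""
    if "clean" not in state["trace"]:
        return "clean"
    if "order" in state["text"] and "order_found" not in state:
        return "look_up_order"
    if "reply" not in state:
        return "draft_reply"
    return None  # nothing left to do

def run_agent_loop(text: str, max_steps: int = 10) -> dict:
    """Agent-style: the next step is decided at run time from the current state."""
    state = {"text": text, "trace": []}
    for _ in range(max_steps):
        name = choose_next_step(state)
        if name is None:
            break
        state = STEPS[name](state)
        state["trace"].append(name)
    return state

print(run_workflow("What are your opening hours?")["trace"])
print(run_agent_loop("What are your opening hours?")["trace"])
```

For a question that does not mention an order, the workflow still performs the lookup, while the agent-style loop skips it.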
How is an AI agent different from a chatbot or RPA?
This is where most confusion lives, so it helps to line the three up side by side. A chatbot responds. Robotic process automation (RPA) repeats a fixed script. An agent reasons about a goal and adapts. The table below shows where each fits.
| Dimension | Chatbot | RPA (rule-based automation) | AI agent |
|---|---|---|---|
| What it does | Answers a prompt or question | Follows a fixed, pre-scripted sequence | Pursues a goal, planning steps as it goes |
| Handles the unexpected | No, stays within its script | No, breaks when the screen or data changes | Yes, re-plans from environmental feedback |
| Uses external tools | Rarely, and only if wired in | Yes, but only the ones scripted for it | Yes, selects tools dynamically at each step |
| Memory across steps | Limited to the conversation | None, each run is independent | Yes, carries context through the task |
| Best for | FAQs, triage, drafting text | High-volume, stable, repetitive tasks | Multi-step tasks with unstructured inputs |
| Fails when | The question needs action, not an answer | The process or interface changes | The goal is vague or tools are unreliable |
The practical takeaway: RPA is the right tool when a process never changes and volume is high. Agents earn their keep when inputs are messy and the path to the answer is not fixed. Many real deployments now pair the two, with an agent deciding what to do and RPA doing the stable, repeatable clicks.
What can AI agents actually do in a business?
The useful cases in 2026 are narrow and specific, not the "autonomous digital employee" of the headlines. The agents delivering value handle bounded, repeatable work where a human still signs off on anything consequential.
Customer operations
An agent can read an incoming ticket, look up the order in your system, check the returns policy, and draft a resolution for an advisor to approve. It handles the fetching and drafting. The human keeps the judgement call and the final send.
Finance and back office
Invoice matching, expense checking and data reconciliation suit agents well. The inputs vary (every supplier formats a PDF differently) but the goal is fixed. An agent copes with the variation that would break a rigid RPA script.
Research and sales support
Agents can compile a prospect briefing by pulling from Companies House, recent news and your CRM, then produce a one-page summary before a call. It is grunt work that eats a rep's morning, and it is well within scope for a supervised agent.
McKinsey's State of AI, published November 2025, found 62% of organisations are at least experimenting with AI agents and 23% are scaling them somewhere. Yet only about 6% qualify as AI high performers seeing significant value, and most report less than 5% of EBIT attributable to AI.
How many UK businesses actually use AI agents?
Fewer than the hype suggests, and it depends heavily on company size. The government's AI Adoption Research from DSIT, based on a survey of 3,500 UK businesses run between February and May 2025, found only 16% use any AI technology at all. Among those that do, agentic AI was the least adopted, used by just 7% of adopters.
Even the businesses planning to adopt AI are cautious about agents. In the same DSIT study, only 13% of planned adopters intend to use agentic AI, and just a quarter of those expect to implement it within twelve months. So among UK businesses generally, hands-on agent use is still a small minority in 2026.
The picture flips at the enterprise end. Salesforce's 2026 Connectivity Benchmark, a survey conducted with Vanson Bourne across October and November 2025 and reported by Computer Weekly, found 89% of UK and Ireland organisations already deploy AI agents, running about 12 on average. Large organisations are well ahead of the wider business base.
Governance is the weak spot. That same benchmark found roughly half of those agents sit in silos rather than working as a connected system, and only 54% of organisations have a centralised governance framework overseeing them. Deployment is running ahead of control, which is exactly the pattern the failure data warns about.
Do AI agents work, or is it hype?
Both, honestly. The technology is real and improving fast, but the failure rate on ambitious projects is high. Gartner predicts that over 40% of agentic AI projects will be cancelled by the end of 2027, citing escalating costs, unclear business value and inadequate risk controls.
Part of the problem is labelling. Gartner calls it "agent washing": vendors rebadging chatbots, assistants and old RPA as agentic without the underlying autonomy. Of the thousands of firms claiming agentic products, Gartner reckons only around 130 are the genuine article. So a good chunk of the market you are being sold is not agents at all.
The direction of travel is still clear, though. Gartner expects at least 15% of day-to-day work decisions to be made autonomously through agentic AI by 2028, up from effectively zero in 2024. The honest read for 2026: real capability, real value in narrow cases, and a lot of noise to filter first.
How should a UK business start with AI agents?
Start small and pick a task you can measure. The agents that pay back are aimed at one bounded, high-frequency job, not a sprawling "automate the department" brief. Prove value on something narrow before you widen the scope.
Keep a human in the loop on anything with consequences. Money moving, contracts, customer-facing decisions: the agent prepares, a person approves. That is the pattern in almost every deployment that survives contact with reality, and it maps directly to the governance gap the UK data keeps flagging.
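The "agent prepares, a person approves" pattern can be expressed as a simple gate in code. The sketch below is a minimal illustration under stated assumptions: the action names in `CONSEQUENTIAL`, the `ProposedAction` type and the `approve` callback are all hypothetical, and a real system would route the approval to a person through a review queue or similar.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical list of action kinds that must never run without sign-off.
CONSEQUENTIAL = {"send_payment", "sign_contract", "email_customer"}

@dataclass
class ProposedAction:
    kind: str    # what the agent wants to do
    detail: str  # human-readable summary shown to the approver

def execute(action: ProposedAction, approve: Callable[[ProposedAction], bool]) -> str:
    """Run low-risk actions directly; hold consequential ones unless approved."""
    if action.kind in CONSEQUENTIAL and not approve(action):
        return "held for review"
    return "executed"  # a real system would perform the action here

# Stand-in approver that rejects everything, to show the gate holding.
def always_reject(action: ProposedAction) -> bool:
    return False

print(execute(ProposedAction("draft_summary", "Prospect briefing"), always_reject))
print(execute(ProposedAction("send_payment", "Pay supplier invoice"), always_reject))
```

Keeping the list of consequential actions explicit, rather than leaving the judgement to the model, makes the boundary easy to audit.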
The economics are forgiving because the tooling is cheap relative to labour. A Tom & Co analysis of ONS Annual Survey of Hours and Earnings 2024 data found that a 10-person team on a business AI subscription costing roughly £1,900 a year needs each person to save only about 9 minutes a week to break even. A working agent clears that bar easily.
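The break-even arithmetic is easy to check for your own team. The sketch below uses only the three figures quoted above (£1,900 a year, 10 people, 9 minutes a week); the 52 working weeks per year is an assumption, and the hourly figure it derives is simply what those numbers imply, not a wage stated in the analysis.

```python
def implied_hourly_value(
    annual_cost: float,
    team_size: int,
    minutes_saved_per_week: float,
    weeks_per_year: int = 52,  # assumption: no allowance for leave
) -> float:
    """Hourly value of time at which the saved minutes exactly cover the cost."""
    hours_saved = team_size * (minutes_saved_per_week / 60) * weeks_per_year
    return annual_cost / hours_saved

def break_even_minutes_per_week(
    annual_cost: float,
    team_size: int,
    hourly_cost: float,
    weeks_per_year: int = 52,  # same assumption as above
) -> float:
    """Minutes each person must save per week to cover the subscription."""
    cost_per_person_week = annual_cost / team_size / weeks_per_year
    return cost_per_person_week / hourly_cost * 60

# Figures from the text: £1,900 a year, 10 people, 9 minutes a week each.
rate = implied_hourly_value(1900, 10, 9)
print(round(rate, 2))  # about 24.36 (pounds per hour)
print(round(break_even_minutes_per_week(1900, 10, rate), 2))
```

To test a different scenario, pass your own subscription cost, head count and hourly labour cost to `break_even_minutes_per_week`.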
So the practical checklist is short. Choose one narrow task with a clear success measure. Insist on human approval for consequential actions. Buy from a vendor that can show real autonomy, not agent washing. Then measure the time saved against that low break-even bar before you scale.



