Agent Discovery & Design
Framework
A structured approach to designing AI agents that work in production. The right questions, asked in the right order, from business case to live operations.
5 stages
32 sections
Evidence-based gates
AI agents operate differently than traditional software
AI agents bring powerful new capabilities to software systems. They can autonomously operate complex business processes, analyse and recommend, use IT systems, produce sophisticated deliverables, and more. But AI agents also have features, behaviours, complexities and vulnerabilities that need to be understood and catered for during design. Some key examples are:
The agent sounds right when it's wrong. A fabricated answer sounds exactly like a correct one. There's no error message, no crash, no red flag. The first time you find out is when a real situation goes wrong.
It works until it doesn't. An agent that performs well today may degrade quietly over three months. The organisation changes, the data shifts, the context moves. Nothing visibly breaks.
Scale amplifies the gaps you already have. A person makes a bad call, it affects one case. An agent with the same gap repeats it across every case it handles, and there's no natural circuit breaker the way there is with a human team. And errors compound: an agent that's 95% accurate at each step is only 77% accurate across a five-step chain.
The most important failures are the ones you can't see. When an agent gets it obviously wrong (saying something absurd, crashing, being rude) someone notices and you fix it. The harder failures are the subtle ones. A clause misquoted slightly. Legal advice disguised as customer service. Each decision individually defensible, but systematically skewed. Standard monitoring won't catch these. They need a different approach.
Building it is the smaller problem. Most of the effort in successful AI adoption goes into people and processes, not the technology. Yet most agent projects invest almost entirely in the build phase. Who trains the team to work with probabilistic output? Who owns the agent when the build team moves on? Who decides when it's ready for more autonomy? We design for these questions from the start, not after launch.
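The compounding point above can be checked directly: if each step succeeds independently with 95% probability, the chance that all five steps succeed is 0.95 to the fifth power. A minimal sketch:

```python
# Per-step accuracy compounds across a chain: the agent only succeeds
# end-to-end when every step succeeds.
per_step_accuracy = 0.95
steps = 5

chain_accuracy = per_step_accuracy ** steps
print(f"End-to-end accuracy over {steps} steps: {chain_accuracy:.0%}")  # 77%
```

The same arithmetic is why a ten-step chain at the same per-step accuracy would drop to roughly 60%.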
19%
Only 19% of organisations have scaled AI agents beyond pilots. The rest stall before production.
Source: Databricks State of AI Agents, 2026
Serpin's Agent Discovery & Design Framework addresses all of these and more. The framework extends Agile development best practices with more than 30 areas of AI agent-specific consideration, covering design, architecture, testing, security, change management, governance and more.
AI has to earn its place
Code
Structured input, clear logic
"Is this invoice overdue?"
Code + AI
AI judges, code verifies
"Classify then validate"
AI agent
Unstructured input, clear decision logic
"Understand and respond"
Human
Unstructured input, unclear logic
"Is this genuinely unusual?"
Code
If the input is structured and the logic is clear, code is cheaper and more reliable. No AI needed. This is where teams most often over-invest in AI when a simpler solution would do.
AI agent
If the input is messy but the decision logic is clear, that's where AI earns its place. In practice, most production agents combine AI interpretation with code verification.
Human
If both the input and the logic are unclear, you need a person. The first thing we do is work out which parts of your process sit where on this spectrum.
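The "AI judges, code verifies" pattern from the middle of this spectrum can be sketched in a few lines. Everything here is illustrative: `classify_with_llm` is a hypothetical stand-in for a real model call, and the allowed-label set is invented. The point is the shape of the pattern, where the model's free-text output never reaches downstream code without a deterministic check.

```python
# Sketch of the "classify then validate" pattern: the AI interprets
# messy input, and plain code enforces the contract on its output.
ALLOWED_CATEGORIES = {"invoice", "complaint", "enquiry"}

def classify_with_llm(text: str) -> str:
    """Hypothetical stand-in for a model call returning a free-text label."""
    # In production this would be a real LLM request.
    return "invoice" if "amount due" in text.lower() else "enquiry"

def classify_and_validate(text: str) -> str:
    label = classify_with_llm(text)      # AI judges unstructured input
    if label not in ALLOWED_CATEGORIES:  # code verifies the result
        raise ValueError(f"Unexpected label: {label!r}")
    return label

print(classify_and_validate("Reminder: amount due by 30 June"))  # invoice
```

The validation layer is cheap and deterministic, which is why most production agents pair the two rather than trusting the model's output on its own.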
A structured framework in five stages
Serpin's Agent Discovery & Design Framework is a sequence of five stages, each building on the last. Each stage answers a different set of key questions and produces clear evidence and deliverables to inform the next. Decision gates between stages let stakeholders confirm the evidence supports moving forward. Each gate builds confidence that you're designing the right solution in the right way to deliver the intended benefits and goals.
Stage 1
Is this the right use case?
This stage validates whether the use case is the right one to invest in. We look at what the problem actually costs today, what alternatives exist, where the business case holds, and whether you have examples of what good looks like. Without those, you have no way to know if the agent is working. Produces: a validated business case and a go/no-go decision.
3 sections
Stage 2
What does the process look like?
We map the process in detail. Not just what steps exist, but how each one works: who does it, what data flows in and out, whether it's a lookup or a judgement call. We surface unwritten rules and tacit knowledge, flag process improvements worth making before automation, and assess the risk and consequence of failure at each step. Produces: a detailed process map with risk assessment and AI fit decisions per step.
10 sections
Stage 3
How does the agent behave?
Every agent behaviour gets a testable specification. We design what the agent does, what it must not do, how much freedom it has with each tool, where humans stay in the loop, and how you prove it's working. Produces: behaviour specs, guardrail designs, human oversight levels, and a complete evaluation strategy.
9 sections
Stage 4
What exactly gets built?
We translate everything into build contracts. System prompts, tool definitions, data schemas, guardrail enforcement. Precise enough that an engineer or coding agent can build without guessing at intent. Produces: build-ready contracts including system prompts, schemas, tool definitions, and guardrail rules.
6 sections
Stage 5
How does it become operational?
Most agent projects stall between "it works in testing" and "it works in the organisation." We define the phased deployment (shadow, pilot, production) with clear criteria for advancing between phases. We plan how people learn to trust probabilistic output, how roles change, and who owns the agent once the build team moves on. Produces: a deployment roadmap with phase gates, a stakeholder adoption plan, and a live operations model.
4 sections
Who Agent Discovery & Design is for
Anyone designing an AI agent where getting it wrong has real consequences.
This is for you if…
✓
You're building an agent that needs to work reliably, not a demo
✓
You've tried building with AI and it worked in testing but failed in practice
✓
You've built an agent but haven't been able to embed it in your organisation
✓
You need a clear path from pilot to production, not just a working prototype
✓
Your agent makes decisions that affect people, money, or reputation
✓
You want a structured approach, not trial and error
This isn't for you if…
✗
You need a chatbot added to a website by Friday
✗
You're looking for a no-code automation tool
✗
The consequences of getting it wrong are low enough that a lighter approach would work
✗
You're experimenting to learn, not building for production
Ready to design your agent?
Whether you're building an agent or upskilling your team to design them, we'll show you where to start and what to focus on first. Thirty minutes.