Agent Discovery & Design Framework
A structured approach to designing AI agents that work in production. The right questions, asked in the right order — from business case to live operations.
5 stages
31 sections
Evidence-based gates
AI agents operate differently from traditional software
AI agents bring powerful new capabilities to software systems. They can autonomously operate complex business processes, analyse and recommend, use IT systems, produce sophisticated deliverables and more. But AI agents also have features, behaviours, complexities and vulnerabilities that need to be understood and catered for during design. Some key examples are:
The agent sounds right when it's wrong. A fabricated answer sounds exactly like a correct one. There's no error message, no crash, no red flag. The first time you find out is when a real situation goes wrong.
It works until it doesn't. An agent that performs well today may degrade quietly over three months. The organisation changes, the data shifts, the context moves. Nothing visibly breaks.
Scale amplifies the gaps you already have. When a person makes a bad call, it affects one case. An agent with the same gap repeats it across every case it handles, and there's no natural circuit breaker the way there is with a human team. And errors compound: an agent that's 95% accurate at each step is only about 77% accurate across a five-step chain, assuming the steps fail independently.
The most important failures are the ones you can't see. When an agent gets it obviously wrong — saying something absurd, crashing, being rude — someone notices and you fix it. The harder failures are the subtle ones. A clause misquoted slightly. Legal advice disguised as customer service. Each decision individually defensible, but systematically skewed. Standard monitoring won't catch these. They need a different approach.
Building it is the smaller problem. Most of the effort in successful AI adoption goes into people and processes, not the technology. Yet most agent projects invest almost entirely in the build phase. Who trains the team to work with probabilistic output? Who owns the agent when the build team moves on? Who decides when it's ready for more autonomy? We design for these questions from the start, not after launch.
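The compounding figure quoted above (95% per step, roughly 77% over five steps) can be checked with a few lines of Python. This is a minimal sketch that assumes each step succeeds or fails independently of the others. The function name `chain_accuracy` is illustrative and not part of the framework.

```python
def chain_accuracy(step_accuracy: float, steps: int) -> float:
    """Probability that every step in a chain is correct.

    Assumption: steps are independent, so the per-step
    accuracies simply multiply together.
    """
    if not 0.0 <= step_accuracy <= 1.0:
        raise ValueError("step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return step_accuracy ** steps


if __name__ == "__main__":
    # Show how end-to-end accuracy falls as the chain gets longer.
    for n in (1, 3, 5, 10):
        print(f"{n:>2} steps: {chain_accuracy(0.95, n):.1%}")
```

In real chains errors are often correlated, so this should be treated as a rough guide rather than a prediction.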
19%
Only 19% of organisations have scaled AI agents beyond pilots. The rest stall before production.
Source: Databricks State of AI Agents, 2026
Serpin's Agent Discovery & Design Framework addresses all of these and more. The framework extends Agile development best practices with more than 30 areas of AI agent-specific considerations covering design, architecture, testing, security, change management, governance and more.
AI has to earn its place
Code
Structured input, clear logic
"Is this invoice overdue?"
Code + AI
AI judges, code verifies
"Classify then validate"
AI agent
Unstructured input, clear decision logic
"Understand and respond"
Human
Unstructured input, unclear logic
"Is this genuinely unusual?"
Code
If the input is structured and the logic is clear, code is cheaper and more reliable. No AI needed. This is where teams most often over-invest in AI when a simpler solution would do.
AI agent
If the input is messy but the decision logic is clear, that's where AI earns its place. In practice, most production agents combine AI judgement with code verification.
Human
If both the input and the logic are unclear, you need a person. The first thing we do is work out which parts of your process sit where on this spectrum.
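The spectrum above can be sketched as a small routing function, together with a "classify then validate" helper for the Code + AI case. This is an illustrative sketch only. All names (`Handler`, `route_step`, `classify_then_validate`, `ALLOWED_LABELS`) are hypothetical. The page does not say where structured input with unclear logic belongs, so routing that case to a human is an assumption made here.

```python
from enum import Enum
from typing import Callable, Optional


class Handler(Enum):
    CODE = "code"
    AI_AGENT = "ai_agent"
    HUMAN = "human"


def route_step(input_is_structured: bool, logic_is_clear: bool) -> Handler:
    """Place one process step on the code / AI agent / human spectrum."""
    if not logic_is_clear:
        # Unclear logic needs a person. Treating structured input with
        # unclear logic the same way is an assumption of this sketch.
        return Handler.HUMAN
    if input_is_structured:
        return Handler.CODE
    return Handler.AI_AGENT


# Hypothetical label set used only for this example.
ALLOWED_LABELS = {"billing", "complaint", "general"}


def classify_then_validate(
    text: str, classify: Callable[[str], str]
) -> Optional[str]:
    """AI judges, code verifies.

    `classify` stands in for a model call. The result is accepted only
    if it is one of the allowed labels; otherwise None is returned so
    the caller can escalate instead of trusting the output.
    """
    label = classify(text).strip().lower()
    return label if label in ALLOWED_LABELS else None
```

The verification step is deliberately simple: code cannot tell whether the model chose the right label, only that the answer is one the rest of the system is designed to handle.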
A structured framework in five stages
Serpin's Agent Discovery & Design Framework is a sequence of five stages, each building on the last. Each stage answers a different set of key questions and produces clear evidence and deliverables to inform the next stage. Decision gates between stages let stakeholders confirm that the evidence supports moving forward. Each gate builds confidence that you're designing the right solution in the right way to deliver the intended benefits and goals.
Stage 1
Is this the right use case?
The basis of any development, this stage validates whether the use case is the right one to invest in. We look at what the problem actually costs today, what alternatives exist, and where the business case holds. Produces: a validated business case and a go/no-go decision.
3 sections
Stage 2
What does the process look like?
We map the process as it works today, including the unwritten rules and the tacit knowledge. We identify which steps genuinely need AI, which are better handled by code, and which need a human. Produces: a complete process map with AI fit decisions for every step.
10 sections
Stage 3
How does the agent behave?
We design the architecture, guardrails, evaluation criteria, and human oversight per action. Seven categories of failure are mapped — from the obvious (the agent says something untrue) to the invisible (irrelevant context silently shifts every decision). Every behaviour gets a testable specification. Produces: behaviour specs, guardrail designs, and evaluation criteria.
8 sections
Stage 4
What exactly gets built?
We translate everything into build contracts. System prompts, tool definitions, data schemas, guardrail enforcement. Precise enough that an engineer or coding agent can build without guessing at intent. Produces: build-ready contracts — system prompts, schemas, tool definitions, guardrail rules.
6 sections
Stage 5
How does it become operational?
Most agent projects stall between "it works in testing" and "it works in the organisation." We define the phased deployment — shadow, pilot, production — with clear criteria for advancing between phases. We plan how people learn to trust probabilistic output, how roles change, and who owns the agent once the build team moves on. Produces: a deployment roadmap with phase gates, a stakeholder adoption plan, and a live operations model.
4 sections
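The phased deployment described in Stage 5 (shadow, pilot, production, with criteria for advancing) could be expressed as a simple gate check. The sketch below is hypothetical: the criteria fields and the threshold values are invented for illustration and are not taken from the framework, which would define its own evidence for each gate.

```python
from dataclasses import dataclass

PHASES = ("shadow", "pilot", "production")


@dataclass(frozen=True)
class GateCriteria:
    min_cases_reviewed: int    # evidence volume needed before advancing
    min_agreement_rate: float  # share of agent outputs reviewers accepted


# Hypothetical thresholds; real gates would come from the design work.
GATES = {
    "shadow": GateCriteria(min_cases_reviewed=200, min_agreement_rate=0.90),
    "pilot": GateCriteria(min_cases_reviewed=1000, min_agreement_rate=0.95),
}


def next_phase(current: str, cases_reviewed: int, agreement_rate: float) -> str:
    """Return the phase the agent should be in after a gate review."""
    if current not in PHASES:
        raise ValueError(f"unknown phase: {current!r}")
    gate = GATES.get(current)
    if gate is None:
        # Production is the final phase; there is nothing to advance to.
        return current
    passed = (
        cases_reviewed >= gate.min_cases_reviewed
        and agreement_rate >= gate.min_agreement_rate
    )
    return PHASES[PHASES.index(current) + 1] if passed else current
```

For example, `next_phase("shadow", 250, 0.93)` advances to `"pilot"` under these assumed thresholds, while `next_phase("shadow", 150, 0.99)` stays in `"shadow"` because too few cases have been reviewed.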
Who Agent Discovery & Design is for
Anyone designing an AI agent where getting it wrong has real consequences.
This is for you if…
✓ You're building an agent that needs to work reliably, not a demo
✓ You've tried building with AI and it worked in testing but failed in practice
✓ You've built an agent but haven't been able to embed it in your organisation
✓ You need a clear path from pilot to production, not just a working prototype
✓ Your agent makes decisions that affect people, money, or reputation
✓ You want a structured approach, not trial and error
This isn't for you if…
✗ You need a chatbot added to a website by Friday
✗ You're looking for a no-code automation tool
✗ The consequences of getting it wrong are low enough that a lighter approach would work
✗ You're experimenting to learn, not building for production
Ready to design your agent?
Whether you're building an agent or upskilling your team to design them, we'll show you where to start and what to focus on first. Thirty minutes.