Quality & Self-Correction
How agents check and improve their own work
> Overview
Agents can review their own output, spot errors, and try again. This is called Reflection. You can also have separate evaluator agents that score quality. This module teaches PMs how to define quality bars, design eval-driven development workflows, and balance reflection cycles with latency budgets.
> Why This Matters for Your Product
Without self-correction, agents produce first-draft quality every time. With reflection, they can catch hallucinations, fix formatting, and improve accuracy. But each reflection cycle adds at least one more model call, which roughly doubles latency for that step. The PM defines what "good enough" means for each feature and decides how many correction cycles are worth the wait. This module also introduces eval-driven development: write your evaluation criteria BEFORE you build the feature, just as you would write acceptance criteria before development starts.
> Interactive & tools
Reflection: before vs. after
Without reflection (first draft)
Example: an agent drafts a support reply. The draft has a minor tone issue, one factual inaccuracy, and a missing step, so the user would need to correct it.
After one reflection cycle
The same agent reviews its output, fixes the tone and the factual error, and adds the missing step. Quality improves ~15–30%; latency roughly doubles for that step.
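To make the before/after comparison concrete, here is a minimal sketch of a reflection loop. It is an illustration under stated assumptions, not a reference implementation: the names `run_with_reflection`, `draft`, `critique`, and `revise` are hypothetical, and the stub functions stand in for real model calls so the example runs on its own.

```python
from typing import Callable, Optional, Tuple


def run_with_reflection(
    task: str,
    draft: Callable[[str], str],
    critique: Callable[[str, str], Optional[str]],
    revise: Callable[[str, str, str], str],
    max_cycles: int = 1,
) -> Tuple[str, int]:
    """Draft once, then critique and revise up to max_cycles times.

    Returns the final output and the number of cycles actually used.
    """
    output = draft(task)
    cycles_used = 0
    while cycles_used < max_cycles:
        feedback = critique(task, output)
        if feedback is None:
            # The critic found nothing to fix, so stop early and save latency.
            break
        output = revise(task, output, feedback)
        cycles_used += 1
    return output, cycles_used


# Hypothetical stand-ins for model calls (assumptions for illustration only).
def stub_draft(task: str) -> str:
    return "Go to Settings. Click Reset."


def stub_critique(task: str, output: str) -> Optional[str]:
    if "Confirm" not in output:
        return "Missing the confirmation step."
    return None


def stub_revise(task: str, output: str, feedback: str) -> str:
    return output + " Confirm via the email link."


# max_cycles=0 is the "without reflection" case: the first draft ships as-is.
first_draft, _ = run_with_reflection(
    "reset password", stub_draft, stub_critique, stub_revise, max_cycles=0
)
# With a budget of up to 3 cycles, the loop stops after 1 because the
# second critique finds nothing left to fix.
improved, cycles = run_with_reflection(
    "reset password", stub_draft, stub_critique, stub_revise, max_cycles=3
)
```

The `max_cycles` parameter is the product lever here: it caps how much extra latency a feature is allowed to spend on self-correction, while the early exit avoids paying for cycles that would change nothing.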
Eval-driven development
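One way to picture eval-driven development is to write the evaluation cases and the pass threshold first, then check any candidate agent against them. The sketch below is a simplified assumption of what that could look like: `EvalCase`, `pass_rate`, `meets_quality_bar`, the example checks, and `stub_agent` are all hypothetical names, and real evals would usually use richer scoring than substring checks.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    name: str
    prompt: str
    check: Callable[[str], bool]  # returns True when the output is acceptable


def pass_rate(agent: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Fraction of eval cases whose check passes on the agent's output."""
    if not cases:
        raise ValueError("at least one eval case is required")
    passed = sum(1 for case in cases if case.check(agent(case.prompt)))
    return passed / len(cases)


def meets_quality_bar(
    agent: Callable[[str], str], cases: List[EvalCase], threshold: float
) -> bool:
    """True when the agent's pass rate reaches the agreed quality bar."""
    return pass_rate(agent, cases) >= threshold


# Criteria written BEFORE the feature exists (illustrative examples only).
CASES = [
    EvalCase("states refund window", "What is the refund policy?",
             lambda out: "30 days" in out),
    EvalCase("apologizes for delay", "My order is late!",
             lambda out: "sorry" in out.lower()),
    EvalCase("no invented discount", "Can I get a discount?",
             lambda out: "%" not in out),
]


# Hypothetical agent stub standing in for the real feature.
def stub_agent(prompt: str) -> str:
    replies = {
        "What is the refund policy?": "Refunds are accepted within 30 days.",
        "My order is late!": "Your order ships tomorrow.",
        "Can I get a discount?": "We do not currently offer discounts.",
    }
    return replies[prompt]


rate = pass_rate(stub_agent, CASES)  # 2 of 3 cases pass with this stub
ship_it = meets_quality_bar(stub_agent, CASES, threshold=0.9)
```

Because the cases exist before the feature does, the threshold works like an acceptance criterion: the stub above misses the apology case, so it would not clear a 0.9 bar.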
Related Engineering Patterns
These are the technical patterns your engineering team will implement. Understanding them helps you have better conversations.
Key Product Decisions
- [01] What quality bar must the agent meet before showing output to users?
- [02] How many self-correction cycles are acceptable given your latency budget?
- [03] Have you written evaluation criteria BEFORE building the feature?
- [04] What does "good enough" mean for this specific feature and user segment?
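The latency question in the list above can be turned into back-of-the-envelope arithmetic. The sketch below assumes a simple linear model in which every reflection cycle costs a fixed multiple of the first-draft latency; with the default ratio of 1.0, one cycle doubles latency for that step, matching the rule of thumb in this module. The function name and parameters are hypothetical, and real latency should be measured rather than modeled.

```python
import math


def max_reflection_cycles(
    base_latency_s: float,
    budget_s: float,
    cycle_cost_ratio: float = 1.0,
) -> int:
    """Largest number of cycles that fits in the latency budget.

    Assumed model: total = base_latency_s * (1 + cycles * cycle_cost_ratio).
    """
    if base_latency_s <= 0 or cycle_cost_ratio <= 0:
        raise ValueError("latency and cycle cost must be positive")
    if budget_s < base_latency_s:
        # Even the first draft exceeds the budget; no room for reflection.
        return 0
    spare = budget_s - base_latency_s
    return math.floor(spare / (base_latency_s * cycle_cost_ratio))


# A 2-second first draft with a 6-second budget leaves room for 2 cycles.
cycles_for_chat = max_reflection_cycles(base_latency_s=2.0, budget_s=6.0)
# A 5-second budget only fits 1 cycle (2s draft + 2s cycle = 4s).
cycles_for_search = max_reflection_cycles(base_latency_s=2.0, budget_s=5.0)
```

Different features can plug in different budgets: an interactive chat reply may tolerate only one cycle, while a background report could afford several.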
Ask Your Engineering Team
- → What is our hallucination rate with and without reflection?
- → How much latency does each reflection cycle add?
- → Can we use a cheaper model for the critique/judge step?
- → Do we have an automated eval pipeline running in CI/CD?