Pattern [04]

Reflection

Unit Testing / Code Review / Test-Driven Development (TDD) Loop

> Agentic Definition

Reflection (or Self-Correction) enables an agent to critique its own output (or the output of another agent) to identify errors, hallucinations, or areas for improvement, and then iteratively refine the result. It acts as an internal feedback loop, significantly increasing the quality and robustness of the final output.

> Description

Enables an agent to critique its own output (or the output of another agent) to identify errors, hallucinations, or areas for improvement, and then iteratively refine the result. It acts as an internal feedback loop.

Frequently Asked Questions

When should I use the Reflection pattern?

Reflection (or Self-Correction) enables an agent to critique its own output (or the output of another agent) to identify errors, hallucinations, or areas for improvement, and then iteratively refine the result. It acts as an internal feedback loop, significantly increasing the quality and robustness of the final output.

How does Reflection relate to Unit Testing / Code Review / Test-Driven Development (TDD) Loop?

Both exist to catch errors before final delivery. The "Red-Green-Refactor" loop in TDD is structurally analogous to the "Draft-Critique-Revise" loop in Reflection. It is a quality assurance gate embedded in the development/execution process. However, there is a key divergence: Unit tests check against known, deterministic assertions (e.g., assert x == 5). Reflection checks against qualitative, ambiguous criteria (e.g., "Is this tone professional?", "Is this code secure?", "Does this argument make sense?") using an LLM as the judge. The "test" itself is probabilistic.

What are the production trade-offs of Reflection?

Reflection is expensive in terms of time. It can double or triple latency depending on the number of reflection cycles permitted. It is a trade-off: higher latency for higher accuracy. Highly effective at reducing hallucinations. However, there is a risk of "degeneracy," where the agent critiques itself into a worse state or gets stuck in a loop of minor, inconsequential edits. Increases token usage significantly. Use cheaper models for the drafting phase and stronger, more reasoning-capable models for the critique phase to optimize costs.