Pattern [01]

Prompt Chaining

Pipe and Filter Architecture / Chain of Responsibility Pattern

> Agentic Definition

Prompt Chaining is the foundational design pattern where a complex task is decomposed into a linear sequence of smaller, discrete LLM calls. The output of one step becomes the input (context) for the next step. By breaking a monolithic prompt into a chain, the system reduces the cognitive load on the model for any single inference, significantly improving accuracy, adherence to instructions, and reliability. It allows for intermediate "gates" where deterministic code can validate outputs before passing them to the next link in the chain.
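
A minimal sketch of such a chain in Python. `call_llm` is a hypothetical stub standing in for a real model client, and its canned replies are illustrative assumptions; the point is the shape: two discrete calls with a deterministic validation gate between them.

```python
def call_llm(prompt: str) -> str:
    # Hypothetical stub for a real model client call.
    # Canned replies keep the sketch self-contained and runnable.
    if prompt.startswith("Extract"):
        return "Ada Lovelace, Alan Turing"
    return f"Summary of: {prompt.split(':', 1)[1].strip()}"

def validate_names(output: str) -> str:
    # Deterministic gate: fail fast instead of passing garbage downstream.
    if not output.strip():
        raise ValueError("Step 1 returned no names; aborting chain.")
    return output

def chain(document: str) -> str:
    # Step 1: narrow, single-responsibility extraction call.
    step1 = call_llm(f"Extract the people mentioned: {document}")
    # Gate: deterministic code validates before the next link runs.
    names = validate_names(step1)
    # Step 2: consumes only the validated intermediate output.
    return call_llm(f"Write one sentence about: {names}")

print(chain("Ada Lovelace worked with Charles Babbage; Alan Turing followed."))
```

Each step's prompt stays small and focused, which is what reduces the per-inference cognitive load described above.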

> Description

The foundational design pattern where a complex task is decomposed into a linear sequence of smaller, discrete LLM calls. The output of one step becomes the input (context) for the next step.

Frequently Asked Questions

When should I use the Prompt Chaining pattern?

Use Prompt Chaining when a single monolithic prompt is too complex for the model to handle reliably: when accuracy drops, the model drifts from instructions, or you need deterministic checkpoints mid-task. Decomposing the task into a linear sequence of smaller, discrete LLM calls reduces the cognitive load on the model for any single inference, improving accuracy, adherence to instructions, and reliability. It also creates intermediate "gates" where deterministic code can validate each step's output before passing it to the next link in the chain.

How does Prompt Chaining relate to Pipe and Filter Architecture / Chain of Responsibility Pattern?

Both patterns rely on passing data sequentially through processing nodes. Each node acts independently, having a single responsibility (SRP), and the system is composed of modular components that can be tested and optimized in isolation.

However, there is a key divergence. In classical SWE, the transformation is deterministic (bytes in, bytes out) and the interface is rigid (data types). In Prompt Chaining, the transformation is probabilistic (text in, text out), and the "interface" between nodes is natural language, which is fuzzy and unstructured. This necessitates a new type of "Type Safety" (often implemented via intermediate parsing or "Guardrail" agents) to ensure the next node receives intelligible input.
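
One way to implement that "Type Safety" between links is a deterministic parsing gate. A sketch, assuming the upstream step was prompted to emit JSON with `title` and `sentiment` keys (both key names are illustrative):

```python
import json

# Required keys are an illustrative schema for this sketch.
REQUIRED_KEYS = {"title", "sentiment"}

def parse_gate(raw_output: str) -> dict:
    # Turn the fuzzy natural-language interface into a typed one:
    # reject anything that is not valid JSON or is missing keys,
    # so the next node never receives malformed input.
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Upstream step emitted invalid JSON: {exc}") from None
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"Upstream step omitted keys: {sorted(missing)}")
    return data

record = parse_gate('{"title": "Q3 Report", "sentiment": "positive"}')
print(record["sentiment"])  # now safe to hand to the next link
```

On failure, the chain can retry the upstream step or route to a fallback rather than propagating garbage.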

What are the production trade-offs of Prompt Chaining?

Latency is additive: Total Latency = Sum(Step_1...Step_N). Chains can become slow if they grow too long, so engineers must optimize prompt length (input tokens) and generation length (output tokens) at each step to maintain responsiveness.

Error propagation is a significant risk. If Step 1 hallucinates or fails to extract the correct context, Step 2 processes garbage ("Garbage In, Garbage Out"). Validation gates between steps are not optional; they are critical reliability engineering.

While multiple calls increase request count, breaking a task down can actually reduce total token usage compared to a massive, multi-turn conversation that requires re-reading a huge context window for every minor correction.
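
A back-of-envelope sketch of these trade-offs. The per-step latencies, token counts, and context-window size below are illustrative assumptions, not measurements:

```python
# Illustrative per-step costs for a three-link chain (assumed numbers).
steps = [
    {"name": "extract",  "latency_s": 0.8, "in_tokens": 500, "out_tokens": 120},
    {"name": "validate", "latency_s": 0.0, "in_tokens": 0,   "out_tokens": 0},  # deterministic gate, no LLM call
    {"name": "draft",    "latency_s": 1.5, "in_tokens": 300, "out_tokens": 400},
]

# Latency is additive across the chain.
total_latency = sum(s["latency_s"] for s in steps)
total_tokens = sum(s["in_tokens"] + s["out_tokens"] for s in steps)

# A monolithic multi-turn session re-reads the whole context each turn;
# three turns over an assumed 2000-token window already exceeds the chain
# on input tokens alone.
monolithic_in_tokens = 3 * 2000

print(f"chain latency: {total_latency:.1f}s, chain tokens: {total_tokens}")
print(f"monolithic input tokens alone: {monolithic_in_tokens}")
```

This is why per-step token budgets matter: each link adds its latency to the total, but the chain avoids repeatedly re-reading one huge context.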