The compounding advantage: why starting your AI stack now matters

Ryan Walker · 6 min read · Updated June 11, 2026

An AI stack that has been running for 12 months is not just 12 months ahead of one that just started. Each improvement builds on the previous ones, so the lead is larger than the elapsed time suggests. The gap is not in the model. It is in what you build around it, above all the evaluation layer and the prompt library. Those compound. Models do not.

What compounds in an AI stack

Four things accumulate value over time in a running AI stack. None of them are the model.

The prompt library. Each prompt gets refined with every use. A prompt that has run 100 times has been corrected, tightened, and edge-case-tested in ways that a new prompt has not. The difference between a day-one prompt and a 100-use prompt is not marginal — it is structural.

The evaluation criteria. Each evaluation teaches you more about what good looks like for your specific outputs. Early criteria are generic. After months of use, they are precise, calibrated, and specific to your context. That specificity is what makes the evaluation layer valuable.

The workflow documentation. Each documented workflow is a process that runs the same way every time, without you having to rethink it. The documentation is not just a record; it is the operating logic of the stack. It accumulates, and each new workflow can build on the ones already written down.

The output data. Every output is a data point. Over time, the pattern of outputs — what worked, what did not, what needed correction — informs every subsequent refinement. A stack with 12 months of output data is operating on a fundamentally different information base than one that just started.
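The prompt library and the output data can be pictured as plain records. The sketch below is a minimal illustration under stated assumptions, not a description of how any particular stack stores things: `PromptEntry`, `PromptVersion`, and `OutputRecord` are hypothetical names, and a single pass/fail flag stands in for the evaluation criteria.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date


@dataclass
class PromptVersion:
    text: str
    note: str  # why this revision was made


@dataclass
class OutputRecord:
    day: date
    prompt_version: int   # index into PromptEntry.versions
    output: str
    passed: bool          # result of the evaluation criterion
    correction: str = ""  # what had to be fixed, if anything


@dataclass
class PromptEntry:
    name: str
    versions: list[PromptVersion] = field(default_factory=list)
    records: list[OutputRecord] = field(default_factory=list)

    def revise(self, text: str, note: str) -> int:
        """Store a new revision and return its version index."""
        self.versions.append(PromptVersion(text, note))
        return len(self.versions) - 1

    def log(self, output: str, passed: bool, correction: str = "",
            day: date | None = None) -> None:
        """Record one output against the latest prompt version."""
        self.records.append(
            OutputRecord(day or date.today(), len(self.versions) - 1,
                         output, passed, correction)
        )

    def pass_rate(self, version: int) -> float:
        """Share of outputs from one version that passed evaluation."""
        runs = [r for r in self.records if r.prompt_version == version]
        return sum(r.passed for r in runs) / len(runs) if runs else 0.0


# Hypothetical usage: one failed run, one revision, one passing run.
entry = PromptEntry("weekly-summary")
entry.revise("Summarize the week's tickets.", "first draft")
entry.log("(long output)", passed=False, correction="too long")
entry.revise("Summarize the week's tickets in five bullets.", "added length limit")
entry.log("(five bullets)", passed=True)
print(entry.pass_rate(0), entry.pass_rate(1))
```

Keeping the revision note and the correction next to each output is the design choice that matters here: it is what lets a later refinement draw on the history instead of starting from memory.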

The 90-day inflection point

After 90 days of consistent operation, the stack is measurably better than on day one. The prompts are refined. The evaluation criteria are specific. The workflows are documented. The output quality is higher.

This is the inflection point where the stack starts to feel like a system rather than an experiment. Before 90 days, you are building. After 90 days, the system is building itself.

The 90-day mark is not a finish line. It is the point where compounding becomes visible.

The 12-month structural gap

After 12 months, the gap between a running AI stack and a new one is structural. The running stack has 12 months of prompt refinements, evaluation data, and workflow documentation. A new stack starts from zero.

That gap cannot be closed quickly. It requires time and output volume — two things you cannot buy or shortcut. You can hire more people, spend more on models, and run more experiments. None of that substitutes for 12 months of compounding.

The structural gap is also self-reinforcing. A more refined stack produces better outputs. Better outputs generate better evaluation data. Better evaluation data produces more refined prompts. The gap widens every month.

The model is not what compounds

Models are commodities. They get replaced every six months or so. GPT-4 replaced GPT-3. Newer Claude releases replaced earlier ones. The next model will replace the current one. Betting on a specific model is not a strategy; it is a dependency.

The compounding is in the evaluation layer, which is model-agnostic. It is in the prompt library, which transfers to new models. It is in the workflow documentation, which is independent of the model entirely.

When you switch models — and you will — you keep the compounding. The evaluation criteria still apply. The prompt library still works, with minor adaptation. The workflow documentation is unchanged. The model is a component. The stack is the asset.
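A small sketch of what "model-agnostic" can mean in practice. It assumes a model can be treated as any function from prompt text to output text, and a criterion as any function from output text to pass or fail. `old_model`, `new_model`, and both criteria are hypothetical stand-ins, not real APIs.

```python
from typing import Callable

# A "model" is any function from prompt text to output text.
Model = Callable[[str], str]
# A criterion is any function from output text to pass/fail.
Criterion = Callable[[str], bool]


def evaluate(model: Model, prompt: str,
             criteria: dict[str, Criterion]) -> dict[str, bool]:
    """Run one prompt through a model and score the output on every criterion."""
    output = model(prompt)
    return {name: check(output) for name, check in criteria.items()}


# Criteria inspect only the output, so they carry over when the model changes.
criteria: dict[str, Criterion] = {
    "under_50_words": lambda out: len(out.split()) <= 50,
    "mentions_next_step": lambda out: "next step" in out.lower(),
}


def old_model(prompt: str) -> str:
    # Hypothetical stand-in for the model being replaced.
    return "Summary: the release shipped."


def new_model(prompt: str) -> str:
    # Hypothetical stand-in for the replacement model.
    return "Summary: the release shipped. Next step: monitor error rates."


prompt = "Summarize the release and state the next step."
for model in (old_model, new_model):
    print(model.__name__, evaluate(model, prompt, criteria))
```

Nothing in `criteria` or `prompt` changes between the two runs. Only the callable passed to `evaluate` does.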

The cost of waiting

Every month you wait is a month of compounding you do not get. That is not a metaphor — it is arithmetic.

If your competitor started six months ago, they have six months of prompt refinements, evaluation data, and workflow documentation that you do not. That gap is real. It is not recoverable by starting faster or spending more. It requires time.

The cost of waiting is not the cost of the tools you are not using. It is the compounding you are not accumulating. Those are different numbers, and the second one is larger.
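To make the arithmetic concrete, here is a toy model. It assumes output quality improves by a fixed percentage each month, and the 5% figure is invented purely for illustration; nothing in this article measures the real rate. `quality` and `HEAD_START` are hypothetical names.

```python
def quality(months_running: int, monthly_gain: float = 0.05) -> float:
    """Relative output quality after compounding a fixed monthly gain.

    Day one is 1.0. The 5% default is an assumed, illustrative rate.
    """
    return (1 + monthly_gain) ** months_running


HEAD_START = 6  # months the competitor started before you

for month in (0, 6, 12):
    you = quality(month)
    competitor = quality(month + HEAD_START)
    print(f"month {month:>2}: you {you:.2f}, "
          f"competitor {competitor:.2f}, gap {competitor - you:.2f}")
```

Under these assumed numbers the competitor stays about 34% ahead in relative terms, while the absolute gap grows from roughly 0.34 on your first day to roughly 0.61 a year later. A different rate changes the figures, not the shape.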

What to start with

The smallest possible working loop. One prompt, one evaluation criterion, one output per day.

Run it for 30 days. Refine the prompt based on what the evaluation reveals. Document what you learn. Expand to a second prompt on day 31.
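The loop described above can be sketched in a few lines. This is one possible shape, not a prescribed implementation: `generate` is a hypothetical wrapper around whichever model you use, `criterion` is your single evaluation check, and the JSON Lines log file is an assumed storage choice.

```python
import json
from datetime import date
from pathlib import Path
from typing import Callable


def run_daily_loop(
    prompt: str,
    generate: Callable[[str], str],    # hypothetical: wraps your model call
    criterion: Callable[[str], bool],  # the single evaluation criterion
    log_path: Path,
    today: date,
) -> bool:
    """Produce one output, evaluate it, and append the result to the log."""
    output = generate(prompt)
    passed = criterion(output)
    record = {
        "day": today.isoformat(),
        "prompt": prompt,
        "output": output,
        "passed": passed,
    }
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return passed


def pass_rate(log_path: Path) -> float:
    """Share of logged outputs that passed; check it before refining the prompt."""
    lines = log_path.read_text(encoding="utf-8").splitlines()
    records = [json.loads(line) for line in lines if line]
    return sum(r["passed"] for r in records) / len(records) if records else 0.0
```

Run `run_daily_loop` once a day with the current prompt. When you refine the prompt, the log keeps the old wording next to its results, which is the start of the output data the stack accumulates.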

The compounding starts on day one, not when the system is perfect. A perfect system that starts in six months will be behind an imperfect system that started today. Start with something that works well enough to generate output and evaluation data. Refinement is the process, not the prerequisite.

We send a monthly compounding report — what the Avakata stack learned this month and how it improved — to Field Notes subscribers. Get it at avakata.agency/contact.html.

Where Avakata is after 18 months

Eighteen months ago, the Avakata stack was one agent on one function. Today it is 160+ specialist agents across engineering, design, data, marketing, sales, and support. The prompt library has hundreds of refined prompts. Every output type has evaluation criteria. Every function has documented workflows.

None of that was built in a sprint. It accumulated. The stack started small, ran consistently, and the compounding did the rest.

If you want to understand what your stack could look like in 12 months — and what it would take to get there — book a discovery call. We will show you where to start and what to expect at each inflection point.

Frequently asked questions

Why does starting an AI stack early matter?
Because AI stacks compound. Every output is a data point, every evaluation is a refinement, and every refinement improves the next output. After 12 months of operation, the gap between a running stack and a new one is structural: it cannot be closed quickly. Every month you wait is a month of compounding you do not get.

What compounds in an AI stack?
Four things: the prompt library (each prompt gets refined with every use), the evaluation criteria (each evaluation teaches you more about what good looks like), the workflow documentation (each documented workflow runs without you thinking about it), and the output data (every output informs the next refinement). The model itself does not compound. Models are commodities.

How long does it take for an AI stack to start compounding?
The compounding starts on day one, but the inflection point is around 90 days. After 90 days of consistent operation, the prompts are refined, the evaluation criteria are specific, and the workflows are documented. The output quality is measurably higher than on day one. After 12 months, the stack is structurally better than a new one.
