Mario Alexandre  ·  March 26, 2026  ·  token-savings prompt-engineering llm-costs

The Real Cost of Unstructured Prompts (It's Not What You Think)

Most people think bad prompts waste money because the model gives the wrong answer and you have to try again. That is true. But it is not where most of the waste comes from.

The real cost is the clarification loop. When you send an unstructured prompt, the model has to figure out what you want. When it is not sure, it asks. You answer. It asks again. You answer again. By the time you get a useful reply, you have used 4 exchanges when you only needed 1.

I measured this exactly. Over 7 days, across 21,194 prompts, I tracked every response. My exchange rate before fixing anything was 4.2 assistant responses per user prompt. That extra 3.2 responses per prompt is pure waste. It is the model asking questions instead of doing work.

Where the Money Actually Goes

Here is where tokens go in an unstructured workflow. When you send something like "make the login faster", the model has to figure out:

Who am I talking to?
What is their stack?
What does "faster" mean: 100ms? 1 second?
What are the constraints: can I change the DB schema? Can I add a cache?
What format should I use: a code diff, an explanation, a plan?
What is actually being asked?

That is 6 questions the model must answer before it can help you. If it gets any one wrong, you start over. Most unstructured prompts leave all 6 open.
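
One way to make the six open questions concrete is to treat them as fields of a record and list which ones a prompt leaves unanswered. This is only an illustrative sketch: `PromptBands` and `open_bands` are hypothetical names, not part of any tool described here, and the field names borrow the band labels used in the next section.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class PromptBands:
    """Hypothetical container: one optional field per question the model must answer."""
    persona: Optional[str] = None      # who the model should be
    context: Optional[str] = None      # stack and situation
    data: Optional[str] = None         # what "faster" means in numbers
    constraints: Optional[str] = None  # what may not be changed
    format: Optional[str] = None       # diff, explanation, or plan
    task: Optional[str] = None         # the actual ask


def open_bands(bands: PromptBands) -> list[str]:
    """Return the names of empty bands, i.e. the questions the model has to guess."""
    return [f.name for f in fields(bands) if not getattr(bands, f.name)]


# "make the login faster" states a task at best; everything else is left open.
bare = PromptBands(task="make the login faster")
print(open_bands(bare))
```

Even when the task itself is read generously, five of the six fields come back empty for this prompt.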

sinc-LLM — 6-band signal decomposition
x(t) = Σ x(nT) · sinc((t - nT) / T)

The 6 Bands and How Much Each Costs You

I treat prompts as signals with 6 frequency bands. Each band carries different information. Here is what I measured across 275 observations:

Band         What it carries              Quality weight
PERSONA      Who the model should be      7.0%
CONTEXT      What situation you're in     6.3%
DATA         Relevant facts and numbers   3.8%
CONSTRAINTS  What the model cannot do     42.7%
FORMAT       How output should look       26.3%
TASK         The actual ask               2.8%

Look at that CONSTRAINTS number: 42.7%. Nearly half of what makes your output good or bad is whether the model knows what it cannot do. Almost nobody writes constraints. They write the task (2.8% of quality) and stop there.

FORMAT is second at 26.3%. If the model does not know you want a code diff instead of an explanation, it gives you an explanation. You ask for the diff. The exchange rate goes up.
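
To see how lopsided these weights are, the sketch below adds up the measured weights of whichever bands a prompt fills in. It assumes the weights combine additively, which the measurements above do not establish. `QUALITY_WEIGHTS` copies the table, and `covered_weight` is a hypothetical helper.

```python
# Quality weights copied from the table above, in percent.
QUALITY_WEIGHTS = {
    "PERSONA": 7.0,
    "CONTEXT": 6.3,
    "DATA": 3.8,
    "CONSTRAINTS": 42.7,
    "FORMAT": 26.3,
    "TASK": 2.8,
}


def covered_weight(filled_bands: set[str]) -> float:
    """Sum the weights of the bands a prompt specifies (assumes additivity)."""
    return round(sum(QUALITY_WEIGHTS[band] for band in filled_bands), 1)


print(covered_weight({"TASK"}))                           # task only
print(covered_weight({"TASK", "CONSTRAINTS", "FORMAT"}))  # task plus the two heaviest bands
```

Under that assumption, a task-only prompt covers 2.8 points of weight, while adding constraints and format raises the total to 71.8.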

What This Looks Like in Real Money

Here are the numbers from my 7-day measurement. At 4.2 exchanges per prompt, my spend would have been $2,597.96. After I brought it down to 1.6 exchanges per prompt, my actual spend was $967.01. That is a gap of $1,630.95.

The fix cost me $42.39 in Haiku API calls. One cheap model call per prompt does the band decomposition before the expensive model sees your message. Net of that cost, I saved $1,588.56, a return of roughly 38x on the $42.39.
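
The arithmetic behind these figures can be checked in a few lines. The sketch below assumes the $967.01 does not already include the Haiku calls; that reading is the one under which the $1,588.56 figure falls out exactly. The variable names are mine.

```python
from decimal import Decimal

projected_spend = Decimal("2597.96")  # at 4.2 responses per prompt
actual_spend = Decimal("967.01")      # at 1.6 responses per prompt
hook_cost = Decimal("42.39")          # Haiku calls for band decomposition

gross_saving = projected_spend - actual_spend    # before paying for the fix
net_saving = gross_saving - hook_cost            # after paying for the fix
return_multiple = gross_saving / hook_cost       # saving per dollar spent on the fix
net_reduction = net_saving / projected_spend     # share of the projected bill avoided

print(f"gross saving:  ${gross_saving}")
print(f"net saving:    ${net_saving}")
print(f"return on fix: {return_multiple:.1f}x")
print(f"net reduction: {net_reduction:.1%}")
```

The gross saving comes to $1,630.95 and the net saving to $1,588.56, which is about 61% of the projected spend.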

If you spend $500 per month on LLM API calls and your exchange rate is near 4.2, you are probably burning $300 of that on clarification loops. That is the real cost of unstructured prompts. Not wrong answers. Just slow, expensive back-and-forth with a model that needed more information than you gave it.
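
For a rough estimate on your own bill, the sketch below scales spend by the exchange rate. It assumes cost grows linearly with responses per prompt, which is a simplification rather than something I measured, and `clarification_spend` is a hypothetical helper.

```python
def clarification_spend(monthly_spend: float, current_rate: float,
                        target_rate: float = 1.6) -> float:
    """Estimate the part of a monthly bill spent on extra exchanges.

    Assumes cost scales linearly with responses per prompt, which is a
    simplification; target_rate defaults to the 1.6 measured after the fix.
    """
    if current_rate <= target_rate:
        return 0.0  # already at or below the target: nothing to recover
    wasted_fraction = 1 - target_rate / current_rate
    return round(monthly_spend * wasted_fraction, 2)


print(clarification_spend(500, 4.2))
```

For $500 a month at an exchange rate of 4.2, this lands near $310, in line with the rough $300 figure above.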

The Fix Is Structural, Not Behavioral

You might think: "I will just write better prompts." You can. But you will not. I have tried. When I am deep in a coding session, I type "fix the auth bug" and press enter. Nobody writes a 200-word structured prompt every time they want a quick thing done. That is not realistic.

That is why I built the auto-scatter hook. It structures your prompt for you, every single time, before your message reaches the model. You type whatever you want. The hook figures out all 6 bands. The model gets a fully structured picture. The exchange rate drops. The bill drops.
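
The general shape of such a pre-processing step can be sketched as follows. This is not the actual auto-scatter hook: `structure_prompt`, `fake_decompose`, and the `unspecified` marker are hypothetical, and the cheap-model call is replaced by a stub so the example runs on its own.

```python
from typing import Callable

BANDS = ("PERSONA", "CONTEXT", "DATA", "CONSTRAINTS", "FORMAT", "TASK")


def structure_prompt(raw: str, decompose: Callable[[str], dict[str, str]]) -> str:
    """Expand a raw message into six labelled bands before the main model sees it.

    `decompose` stands in for the cheap-model call; it is a hypothetical
    callable that maps the raw text to a {band: content} dict.
    """
    filled = decompose(raw)
    lines = []
    for band in BANDS:
        # Fall back to an explicit marker so no band is silently missing.
        lines.append(f"{band}: {filled.get(band, 'unspecified')}")
    return "\n".join(lines)


def fake_decompose(raw: str) -> dict[str, str]:
    """Stub for illustration only; a real hook would call a small model here."""
    return {"TASK": raw, "FORMAT": "code diff"}


print(structure_prompt("fix the auth bug", fake_decompose))
```

Passing the decomposer in as a callable keeps the structuring logic separate from whichever cheap model does the work, so the stub can be swapped for a real API call without touching the rest.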

The SNR (signal-to-noise ratio) of my prompts went from 0.003 to 0.855 after I built this. That is not a 10% improvement. That is a 285x improvement in signal quality. That is what drove the 61% cost reduction.
