Mario Alexandre  ·  March 26, 2026  ·  token-savings prompt-engineering llm-costs

The Real Cost of Unstructured Prompts (It's Not What You Think)

Most people think bad prompts waste money because the model gives you a wrong answer and you have to retry. That's true, but it's not where most of the waste comes from.

The real cost is the clarification loop. When you send an unstructured prompt, the model has to figure out what you actually want. And when it's not sure — which is most of the time — it asks. You answer. It asks again. You answer again. By the time you get a useful response, you've burned 4 exchanges when you needed 1.

I measured this directly. Over 7 days and 21,194 prompts, I tracked every response. My exchange rate before fixing this was 4.2 assistant responses per user prompt. Those extra 3.2 responses per prompt are pure waste: the model asking questions instead of doing work.

Where the Money Actually Goes

Let me break down where tokens go in an unstructured workflow. When you send something like "make the login faster", the model has to infer:

Who am I talking to? What's their stack? What does "faster" mean — 100ms? 1 second? What are the constraints — can I change the DB schema? Can I add a cache? What format should I respond in — code diff, explanation, plan? And finally, what is actually being asked?

That's 6 questions the model has to answer before it can help you. If it answers any of them wrong, you're back to square one. Most unstructured prompts leave all 6 open.
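Those six open questions can be written down as a checklist. A minimal sketch of what "fully specified" looks like for the "make the login faster" example; every concrete value below (the Django stack, the latency numbers) is an illustrative assumption, not from my measurements:

```python
# Hypothetical fully-specified version of "make the login faster".
# Field names are the six bands; values are invented for illustration.
structured_prompt = {
    "PERSONA":     "senior backend engineer on a Django monolith",
    "CONTEXT":     "login p95 latency regressed after the last deploy",
    "DATA":        "p95 went from 180ms to 1.4s; DB CPU is flat",
    "CONSTRAINTS": "no schema changes; Redis is available; keep sessions",
    "FORMAT":      "unified code diff, no prose explanation",
    "TASK":        "cut login p95 back under 300ms",
}

def open_bands(prompt: dict) -> list[str]:
    # Bands the model would have to guess at (missing or empty).
    return [band for band, value in prompt.items() if not value]
```

An unstructured prompt is the degenerate case where only TASK is filled in and `open_bands` returns the other five.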

sinc-LLM — 6-band signal decomposition
x(t) = Σₙ x(nT) · sinc((t - nT) / T)
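For the signal-processing readers: the equation above is the standard Whittaker–Shannon interpolation identity that the name sinc-LLM borrows. As a sanity check of the formula itself (nothing here is from the sinc-LLM implementation), a minimal numeric sketch showing that sinc interpolation recovers a band-limited signal between its samples:

```python
import numpy as np

def sinc_reconstruct(t, n, samples, T):
    # Whittaker-Shannon: x(t) = sum_n x(nT) * sinc((t - nT) / T).
    # np.sinc is the normalized sinc, sin(pi*x)/(pi*x).
    return float(np.sum(samples * np.sinc((t - n * T) / T)))

T = 0.1                                    # sample period (Nyquist limit: 5 Hz)
n = np.arange(-2000, 2001)                 # finite window stands in for the infinite sum
samples = np.sin(2 * np.pi * 1.0 * n * T)  # a 1 Hz tone, well below Nyquist
t = 0.123                                  # a point between samples
approx = sinc_reconstruct(t, n, samples, T)
exact = float(np.sin(2 * np.pi * 1.0 * t))
```

With the sum truncated to ±2000 samples, `approx` matches `exact` up to a small truncation error from the finite window.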

The 6 Bands and How Much Each Costs You

I treat prompts as signals with 6 frequency bands. Each band carries different information. Here's the quality weight I measured for each band, across 275 observations:

Band         What it carries              Quality weight
PERSONA      Who the model should be       7.0%
CONTEXT      What situation you're in      6.3%
DATA         Relevant facts and numbers    3.8%
CONSTRAINTS  What the model cannot do     42.7%
FORMAT       How output should look       26.3%
TASK         The actual ask                2.8%

Look at that CONSTRAINTS number. 42.7%. Nearly half of what determines whether your output is good or garbage is whether the model knows what it cannot do. And yet almost nobody writes constraints. They write the task (2.8% of quality!) and leave it at that.

FORMAT is the second biggest at 26.3%. If the model doesn't know you want a code diff and not an explanation, you get an explanation and ask for the diff, and the exchange rate climbs.

What This Looks Like in Real Money

Here's the extrapolation from my 7-day measurement. At 4.2 exchanges/prompt, my spend would have been $2,597.96. After I fixed it to 1.6 exchanges/prompt, my actual spend was $967.01. That's $1,630.95 in gross savings.

The fix cost me $42.39 in Haiku API calls: one cheap model call per prompt to do the band decomposition before the expensive model sees it. Net of that, I kept $1,588.56, roughly a 38x return on the spend.
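Spelling out the arithmetic (the $1,588.56 figure is the savings net of the $42.39 fix cost):

```python
# Cost arithmetic from the 7-day measurement above.
spend_before = 2597.96   # projected spend at 4.2 exchanges/prompt
spend_after  = 967.01    # actual spend at 1.6 exchanges/prompt
fix_cost     = 42.39     # Haiku calls doing the band decomposition

gross_savings = spend_before - spend_after   # 1630.95
net_savings   = gross_savings - fix_cost     # 1588.56
roi           = gross_savings / fix_cost     # ~38x
```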

If you're spending $500/month on LLM API calls and your exchange rate is anywhere near 4.2, you're probably burning $300 of that on clarification loops. That's the real cost of unstructured prompts: not wrong answers, just slow, expensive, back-and-forth conversations with a machine that needs more information than you gave it.

The Fix Is Structural, Not Behavioral

You might be thinking: "okay, I'll just write better prompts." And you can — but you won't. I've tried. When I'm in flow, deep in a coding session, I type "fix the auth bug" and hit enter. Nobody writes a 200-word structured prompt every time they want a quick thing done. It's not realistic.

That's why I built the auto-scatter hook. It does the structuring for you, automatically, every single time, before your prompt reaches the model. You type whatever you want. The hook infers all 6 bands. The model gets a fully structured picture. Exchange rate drops. Bill drops.
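A sketch of the hook's shape, not the actual implementation: `cheap_model` is a stand-in for whatever inexpensive completion call you have (I use Haiku), and the instruction wording here is my own invention:

```python
# Hypothetical auto-scatter hook: expand a terse prompt into the six
# bands with a cheap model before the expensive model sees it.
BANDS = ["PERSONA", "CONTEXT", "DATA", "CONSTRAINTS", "FORMAT", "TASK"]

def scatter(raw_prompt: str, cheap_model) -> str:
    """Return a six-band structured rewrite of raw_prompt."""
    instruction = (
        "Rewrite the user's request as six labeled sections "
        f"({', '.join(BANDS)}), inferring anything not stated:\n\n"
        + raw_prompt
    )
    # cheap_model is any callable str -> str wrapping an inexpensive API.
    return cheap_model(instruction)
```

The expensive model then receives `scatter(user_input, cheap_model)` instead of the raw prompt; the user types exactly what they would have typed anyway.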

The SNR (signal-to-noise ratio) of my prompts went from 0.003 to 0.855 after implementing this. That's not a 10% improvement. That's a 285x improvement in signal quality. And that's what drove the 61% cost reduction.

Try sinc-LLM free — sincllm.com

The spec is open source. No signup required.