Hey, let me explain this from scratch because I've been calling it an "auto-scatter hook" for weeks and I realize I've never actually explained what that means to someone who hasn't built it themselves.
An auto-scatter hook is a piece of code that sits between you and your LLM. Every time you type a prompt, the hook intercepts it, rewrites it into a structured JSON format, and injects that structured version as context before the model sees the original. You never see this happening — it's invisible. But the model's behavior changes dramatically because of it.
The name comes from signal processing. I model prompts as signals, and I "scatter" or decompose them into their component frequencies. The mathematical basis is the Nyquist-Shannon sampling theorem — treating a prompt as a sampled signal with 6 distinct frequency bands that carry different types of information.
The 6 bands are: PERSONA (who the model is), CONTEXT (what situation you're in), DATA (relevant facts), CONSTRAINTS (what the model cannot do), FORMAT (how output should look), TASK (what you're actually asking). Each band carries a different weight of quality impact. CONSTRAINTS alone carries 42.7% of output quality — which is wild.
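To make that concrete, here's a rough sketch of what a scattered prompt might look like. The `formula`, `T`, and `fragments` field names come from the hook's own pass-through check; everything else here (the band payloads, the example content) is my illustration, not the actual spec.

```python
# Hypothetical sketch of a scattered prompt. Only "formula", "T", and
# "fragments" are known field names; the rest is illustrative.
scattered = {
    "formula": "sinc",          # marks this as already-scattered JSON
    "T": 6,                     # number of frequency bands sampled
    "fragments": [
        {"band": "PERSONA",     "content": "You are a senior Python reviewer."},
        {"band": "CONTEXT",     "content": "Reviewing a PR in a Flask codebase."},
        {"band": "DATA",        "content": "The diff touches auth middleware."},
        {"band": "CONSTRAINTS", "content": "Do not suggest new dependencies."},
        {"band": "FORMAT",      "content": "Bullet list, max 5 items."},
        {"band": "TASK",        "content": "Flag security issues in the diff."},
    ],
}

bands = [f["band"] for f in scattered["fragments"]]
print(bands)  # all 6 bands, CONSTRAINTS included
```

The point of the structure is that each band is separately addressable, which is what lets you weight them differently.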
The "auto" part is critical. I tried manual scatter first — writing the sinc JSON myself for every prompt. It worked great for quality. It was absolutely unsustainable in practice. Nobody manually structures their prompts every single time. You're in flow, you type what you want, you hit enter.
The auto-scatter hook makes it invisible. You still type whatever you want. The hook handles the structure. There's zero discipline required. It just runs.
In Claude Code specifically, I implemented it as a UserPromptSubmit hook, which fires on every user message (PreToolUse only fires before tool calls, not on prompts). The hook is a Python server running locally on port 8461. Response time: 2ms of hook overhead, not counting the Haiku API call, which takes 300-800ms. The hook is non-blocking: if the scatter call fails, it passes the prompt through unchanged rather than breaking your workflow.
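The non-blocking fallback is the part worth sketching. Assuming the local server exposes something like a `/scatter` endpoint returning a `scattered` field (both names are my guesses, not the real API), the hook-side logic is roughly:

```python
import json
import urllib.request

# Hypothetical endpoint path and response shape; the port is from the post.
SCATTER_URL = "http://localhost:8461/scatter"

def scatter_or_passthrough(prompt: str, timeout: float = 2.0) -> str:
    """Send the prompt to the local scatter server. On ANY failure
    (server down, timeout, bad response), return the prompt unchanged
    so the hook never blocks the workflow."""
    try:
        req = urllib.request.Request(
            SCATTER_URL,
            data=json.dumps({"prompt": prompt}).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return json.loads(resp.read())["scattered"]
    except Exception:
        # Non-blocking fallback: scatter failed, pass the raw prompt through.
        return prompt
```

With no server listening on 8461, `scatter_or_passthrough("fix the login bug")` just returns `"fix the login bug"`, which is exactly the fail-open behavior described above.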
One important edge case: if the prompt is already valid sinc JSON (has the formula field, T field, and fragments array), the hook passes it straight through without scattering. This matters for two reasons.
First, it prevents double-scattering — if you or another agent already structured the prompt correctly, don't mess with it. Second, it gives me an escape hatch: if I need to communicate directly with the model bypassing the hook's interpretation, I can manually write sinc JSON and it'll reach the model unmodified.
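The pass-through check itself is small. A minimal sketch, using only the three fields named above (whether the real hook validates anything beyond their presence is an assumption I'm not making):

```python
import json

def is_sinc_json(prompt: str) -> bool:
    """Already-scattered prompts (valid JSON with a formula field, a T
    field, and a fragments array) skip re-scattering entirely."""
    try:
        obj = json.loads(prompt)
    except (json.JSONDecodeError, TypeError):
        return False
    return (
        isinstance(obj, dict)
        and "formula" in obj
        and "T" in obj
        and isinstance(obj.get("fragments"), list)
    )
```

A plain prompt like `"fix the login bug"` fails the check and gets scattered; hand-written sinc JSON passes straight through to the model.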
7 days, 21,194 prompts scattered. Exchange rate dropped from 4.2 to 1.6 responses per prompt. Cost savings: $1,588.56. Haiku overhead: $42.39. Net gain: $1,546.17. ROI: roughly 37x.
I also fine-tuned a local Qwen2.5-7B model to do the scatter at zero API cost. 107 seconds of training on an RTX 5090, 290 tok/s inference, 4.7GB GGUF. If you have a local GPU, the Haiku cost goes to zero and the savings are 97% month-over-month.
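If the hook's scatter backend is configurable, pointing it at the local model instead of Haiku could look something like this. Every field name here is my illustration; I'm assuming the GGUF is served behind an OpenAI-compatible endpoint (e.g. llama.cpp's server) on a port of your choosing:

```json
{
  "scatter_backend": "local",
  "local": {
    "endpoint": "http://localhost:8080/v1/chat/completions",
    "model": "qwen2.5-7b-scatter-q4.gguf"
  },
  "fallback": "passthrough"
}
```

The `"fallback": "passthrough"` line mirrors the non-blocking behavior above: if the local model isn't up, prompts go through unscattered rather than failing.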
The code is open source. Leave a comment and I'll drop the GitHub link.
Try sinc-LLM free — sincllm.com
Open source. The spec is free to read and implement.