I measured 275 prompt-response pairs. I scored how much each part of a prompt caused the model to give a good answer on the first try. The result surprised me.
CONSTRAINTS, the band that tells the model what it must not do, outweighs PERSONA, CONTEXT, DATA, and TASK combined. It is twice the weight of FORMAT (26.3%). TASK, the band most people focus on, carries just 2.8%.
Here is why that makes sense, and what you can do about it.
Language models are trained on billions of examples of people asking for things and getting answers. They are very good at figuring out what you want. You say "Fix the bug" and the model knows you want the bug fixed. You say "Write a function that sorts a list" and the model knows what to build. When you state your TASK clearly, the model handles it well.
What the model cannot do is guess your constraints. It does not know you already tried the easy fix. It does not know you cannot touch the database. It does not know you must stay under 100 lines. It does not know the tests must pass. It does not know you are targeting Python 3.9.
Every constraint you leave out is a way the model might go wrong. It will not know it went wrong until you tell it. That costs you another round of back-and-forth.
Here is the way I think about it. Before you give any constraints, the model has almost endless options. Any answer that fits your task is possible. The model picks one. It may not be the one you wanted.
Each constraint cuts down the options. "No DB schema changes" removes a large share of them. "Must pass existing tests" removes more. "Python 3.9 compatible" removes more. "Response under 50 lines" removes more. After 5 or 6 good constraints, the remaining options are so few that the model almost always picks something you will like.
Without constraints, you are hoping the model gets lucky and picks what you want from a huge list of options. With constraints, you guide it to the small set of options that are actually acceptable.
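The narrowing effect can be sketched as a toy filter. This is only an illustration: the five candidate answers and their attributes are invented for the example, and a real model does not enumerate options this way. Each constraint is treated as a predicate that removes the candidates that violate it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical answer the model might produce for the same task."""
    name: str
    changes_schema: bool
    passes_tests: bool
    python39_ok: bool
    lines: int


# Invented candidates, all of which "fit the task" before any constraint applies.
CANDIDATES = [
    Candidate("add migration + refactor", True, True, True, 120),
    Candidate("rewrite with match statement", False, True, False, 35),
    Candidate("quick patch, breaks a test", False, False, True, 10),
    Candidate("large but safe rewrite", False, True, True, 90),
    Candidate("small targeted patch", False, True, True, 18),
]

# Each constraint pairs a label with a predicate that returns True if allowed.
CONSTRAINTS = [
    ("No DB schema changes", lambda c: not c.changes_schema),
    ("Must pass existing tests", lambda c: c.passes_tests),
    ("Python 3.9 compatible", lambda c: c.python39_ok),
    ("Response under 50 lines", lambda c: c.lines < 50),
]


def narrow(candidates, constraints):
    """Apply constraints in order and record how many candidates survive each."""
    remaining = list(candidates)
    history = [len(remaining)]
    for _label, allowed in constraints:
        remaining = [c for c in remaining if allowed(c)]
        history.append(len(remaining))
    return remaining, history


remaining, history = narrow(CANDIDATES, CONSTRAINTS)
print(history)                        # [5, 4, 3, 2, 1]
print([c.name for c in remaining])    # ['small targeted patch']
```

In this toy setup each constraint removes exactly one wrong answer, and only the acceptable one is left at the end.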
// CONSTRAINTS for a backend fix task
"CONSTRAINTS": "Must not change database schema or existing API contracts.
Existing test suite must pass without modification.
Fix must be backward compatible with Python 3.9.
No new external dependencies.
Minimal footprint — touch only files directly related to the bug.
If the fix requires architectural changes, flag this and propose
the minimal patch instead. Total added lines must be under 40."
That is 7 constraints. Each one blocks a wrong answer. The model reading this has very little room to surprise you. That is exactly what you want when fixing a production bug.
Compare that to a prompt with no CONSTRAINTS section at all. The model gets "fix the webhook validation bug." It helpfully refactors 3 files, upgrades a dependency, and adds a new migration. You have to undo all of that and ask it to try again.
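One way to make sure the CONSTRAINTS band is never forgotten is to build the prompt from a list of constraint sentences. The sketch below is a minimal example under my own assumptions: the function name `build_prompt` and the flat JSON layout are hypothetical, chosen to resemble the fragment above, and are not taken from the sinc-LLM specification.

```python
import json


def build_prompt(task, constraints, **other_bands):
    """Return a JSON prompt with the constraint list joined into one band.

    `other_bands` is a hypothetical catch-all for bands such as PERSONA
    or CONTEXT; the band names are passed through unchanged.
    """
    if not constraints:
        raise ValueError("refusing to build a prompt with no CONSTRAINTS")
    bands = dict(other_bands)
    bands["TASK"] = task
    bands["CONSTRAINTS"] = " ".join(c.strip() for c in constraints)
    return json.dumps(bands, indent=2)


constraints = [
    "Must not change database schema or existing API contracts.",
    "Existing test suite must pass without modification.",
    "No new external dependencies.",
]
prompt = build_prompt("Fix the webhook validation bug.", constraints)
print(prompt)
```

Keeping each constraint as its own string makes the list easy to review and reuse across similar tasks, and the empty-list check turns a missing CONSTRAINTS section into an error instead of a silent omission.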
In the sinc-LLM specification I published, there is a hard rule: "n=3 CONSTRAINTS must be the longest band." Not just a little longer. Explicitly the longest. If your CONSTRAINTS section is shorter than your TASK or CONTEXT section, you have probably not thought through your constraints enough.
This rule made me think about constraints differently. They are not just "do not do X." They are everything the model has no way of knowing. Everything you would catch in a code review. Everything you have learned from past tasks like this one. Everything specific to your codebase, your team's standards, or your deployment setup.
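The longest-band rule is simple enough to check before a prompt is sent. The sketch below assumes length is measured in whitespace-separated words, which is my assumption; the rule as quoted does not say how length is measured. The function name `longest_band_violations` is also hypothetical.

```python
def longest_band_violations(bands):
    """Return the names of bands that are at least as long as CONSTRAINTS.

    Length is counted in whitespace-separated words (an assumption).
    An empty list means CONSTRAINTS is strictly the longest band.
    """
    limit = len(bands.get("CONSTRAINTS", "").split())
    return [
        name
        for name, text in bands.items()
        if name != "CONSTRAINTS" and len(text.split()) >= limit
    ]


thin = {
    "TASK": "Fix the webhook validation bug.",
    "CONTEXT": "The handler rejects valid signatures after the last deploy.",
    "CONSTRAINTS": "No schema changes.",
}
print(longest_band_violations(thin))    # ['TASK', 'CONTEXT']

thorough = dict(thin)
thorough["CONSTRAINTS"] = (
    "Must not change database schema or existing API contracts. "
    "Existing test suite must pass without modification. "
    "No new external dependencies."
)
print(longest_band_violations(thorough))    # []
```

A non-empty result is a prompt to go back and write down the constraints that were left implicit, not a reason to shorten the other bands.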
Writing it all down at the start takes more effort than typing a short prompt. But it takes far less effort than going through an average of 4.2 rounds of back-and-forth to fix a bad answer.