By Mario Alexandre · March 27, 2026 · 9 min read
I found something surprising in my prompt engineering experiments. For some tasks, showing the model 2-3 examples works better than writing 500 words of detailed instructions. This guide explains when to use few-shot prompting, how many shots you need, and how to combine examples with sinc-LLM's 6-band structure for best results.
Few-shot prompting means putting a small number of input-output examples in your prompt. You add those examples before you give the model a new task. "Few" usually means 2-5 examples. The model spots the pattern in your examples. Then it applies that pattern to the new input.
This is different from instructions-based prompting. In that approach, you describe the task in words instead of showing examples. Both approaches work. But each one works best on different types of tasks.
I ran 275 experiments. Here are the clear patterns I found for when each approach wins:
| Use Few-Shot (Examples) | Use Instructions |
|---|---|
| Format is hard to describe in words | Format is easily specified |
| Task involves style or tone matching | Task is procedural with clear steps |
| Output has subtle patterns | Output follows explicit rules |
| Model has never seen this task type | Model has seen similar tasks often |
| Classification with nuanced categories | Generation with clear constraints |
More examples do not always mean better results. Here is what I found in my experiments:
Classification tasks with many categories are an exception. If you have 10 categories, you may need 1 example for each category. That means 10 shots to get reliable results.
Few-shot prompting and 6-band decomposition work together. They are not competing techniques. In the sinc-LLM framework, few-shot examples go in the DATA band (n=2):
{
"formula": "x(t) = \u03a3 x(nT) \u00b7 sinc((t - nT) / T)",
"T": "specification-axis",
"fragments": [
{"n": 0, "t": "PERSONA", "x": "Senior content moderator with expertise in toxicity classification"},
{"n": 1, "t": "CONTEXT", "x": "Building an automated content moderation pipeline for a social media platform. Processing 50K comments per hour."},
{"n": 2, "t": "DATA", "x": "Examples: (1) Input: 'This product is absolute garbage, waste of money' -> Output: {label: 'negative_review', toxic: false, action: 'allow'}. (2) Input: 'You are an idiot for buying this' -> Output: {label: 'personal_attack', toxic: true, action: 'flag'}. (3) Input: 'I hate this brand so much' -> Output: {label: 'negative_sentiment', toxic: false, action: 'allow'}"},
{"n": 3, "t": "CONSTRAINTS", "x": "Must distinguish between negative opinions (allowed) and personal attacks (flagged). Sarcasm should be classified by intent, not literal meaning. Must process in under 50ms per comment. Confidence below 0.7 should route to human review. Must handle multilingual content (English, Spanish, Portuguese). Never classify political opinions as toxic."},
{"n": 4, "t": "FORMAT", "x": "JSON: {label: string, toxic: boolean, action: 'allow'|'flag'|'review', confidence: float}"},
{"n": 5, "t": "TASK", "x": "Classify the following comment using the pattern established in the examples."}
]
}
The examples in the DATA band show the pattern. The CONSTRAINTS band sets the rules. The FORMAT band keeps output consistent. Together, these produce 28% better classification accuracy than few-shot examples alone.
Use sinc-LLM to build a structured prompt. Then add your few-shot examples to the DATA band. The tool fills in all 6 bands. You supply the examples that show the exact pattern you want.
// Production AI Engineering
sinc-LLM designs, audits, and stabilises production AI infrastructure: from vendor evaluation and cost accountability to incident controls and MCP architecture.
See what we do →