SQL Query Prompt Template — Structured for Databases

AI SQL generation fails in a specific way: the model writes syntactically plausible SQL against a schema it invented. It makes up column names that sound right, joins tables that don't exist in your database, and uses functions that exist in PostgreSQL but not in BigQuery. The fix is always the same: give the model your exact schema in the DATA band. When it has the real schema, the SQL is almost always correct on the first try.

x(t) = Σ x(nT) · sinc((t − nT) / T)
SQL precision = schema fidelity. Put the real schema in DATA and the model writes against reality, not assumption.

The SQL Sinc Prompt Template

This example targets a business intelligence query — finding revenue by cohort with month-over-month growth — which requires multiple joins, window functions, and precise date handling:

{
  "formula": "x(t) = Σ x(nT) · sinc((t - nT) / T)",
  "T": "specification-axis",
  "fragments": [
    {
      "n": 0,
      "t": "PERSONA",
      "x": "You are a senior data engineer who writes clean, well-aliased SQL. You add inline comments only for non-obvious logic. You optimize for readability first, then performance. You know the difference between SQL dialects and never mix them."
    },
    {
      "n": 1,
      "t": "CONTEXT",
      "x": "I'm building a revenue analytics query for a BI dashboard in Metabase. The data is in PostgreSQL 15. I need to show monthly revenue by signup cohort to identify which cohorts have the best long-term retention."
    },
    {
      "n": 2,
      "t": "DATA",
      "x": "Schema:\nsubscriptions(id UUID, user_id UUID, started_at TIMESTAMPTZ, ended_at TIMESTAMPTZ, mrr_cents INT, plan TEXT)\nusers(id UUID, created_at TIMESTAMPTZ, email TEXT)\n\nNotes: ended_at is NULL for active subscriptions. mrr_cents is the current MRR in cents. Cohort = year+month of user.created_at."
    },
    {
      "n": 3,
      "t": "CONSTRAINTS",
      "x": "PostgreSQL 15 dialect only. No MySQL or BigQuery syntax. Use CTEs, not nested subqueries. Include NULL handling for ended_at. Do not use SELECT * anywhere. Use explicit column aliases. Do not use ILIKE — performance is critical on this table (10M rows). Add a comment explaining any window function used."
    },
    {
      "n": 4,
      "t": "FORMAT",
      "x": "SQL only — no explanation prose before or after. CTEs labeled clearly. Final SELECT produces: cohort_month, months_since_signup, active_subscribers, total_mrr_dollars. Order by cohort_month ASC, months_since_signup ASC."
    },
    {
      "n": 5,
      "t": "TASK",
      "x": "Write the PostgreSQL query that returns monthly revenue by signup cohort, showing how each cohort's MRR evolves over time."
    }
  ]
}

What Goes in Each Band for SQL

DATA band — the schema is mandatory: Include the exact table names, column names, data types, and any behavioral notes (nullable columns, what NULL means, units). Without this, the model invents a plausible schema and writes against it. With it, the model writes against your actual database.

CONSTRAINTS band — the dialect guard: SQL is not one language. PostgreSQL, MySQL, BigQuery, Snowflake, and SQLite have meaningfully different syntax for window functions, date operations, string functions, and CTEs. The model will mix dialects if you don't constrain it. "PostgreSQL 15 dialect only. No MySQL or BigQuery syntax." is not redundant — it actively suppresses cross-dialect leakage.

FORMAT band — SQL only: Without explicit format guidance, models wrap SQL in extensive prose ("Here's the query that will accomplish what you need:"). For SQL, you want the query and nothing else. "SQL only — no explanation prose before or after" eliminates this.

SQL prompt tip: If you have performance constraints (large table sizes, index information), put them in CONSTRAINTS, not DATA. "10M rows on subscriptions" in CONSTRAINTS tells the model to avoid sequential scans. "ILIKE is prohibited" prevents the model from using non-sargable patterns. DATA is schema facts; CONSTRAINTS are engineering rules.

Raw Prompt vs. Sinc-Structured

Write a SQL query that shows monthly revenue by cohort for my PostgreSQL database. The cohort is based on when users signed up. Show how MRR evolves by cohort over time.
DATA: [Exact schema with table names, column types, NULL semantics]
CONSTRAINTS: PostgreSQL 15 only. No BigQuery/MySQL. CTEs not subqueries. No SELECT *.
FORMAT: SQL only. Final SELECT: cohort_month, months_since_signup, active_subs, total_mrr_dollars.

The raw prompt generates plausible SQL against invented column names like subscription_value and user_signup_date that don't exist in your schema. The structured prompt generates a query against your actual columns that you can run without modification.

Try AI Transform — Structure Your SQL Prompt Free