Most build-vs-buy debates are won by whoever shouts loudest, not by whoever has the data. This framework gives you the 10 criteria that decide it on facts, the three paths each criterion implies, and a one-page decision matrix you can take into a Monday morning architecture review and walk out with a defensible call.
I have watched the same fight play out in three different shapes. A CTO commits to building because the team wants the resume bullet; 12 months in, the project ships at 60% of the vendor's quality. A CFO commits to buying because the procurement spreadsheet says it is cheaper; 18 months in, the lock-in has bent the roadmap around the vendor's release cadence.
The pattern is not that build is always wrong or buy is always wrong. The pattern is that the decision gets made on the loudest argument in the room, not on the criteria that actually predict the 18-month outcome. Build vs buy is a 10-variable decision compressed into a one-variable shouting match.
This framework decompresses it. The 10 criteria below are the variables that actually move the outcome at the 18-month mark. Each one resolves to a build path, a buy path, or a hybrid path with specific seam locations. Run the framework in a single architecture review and you walk out with a defensible call, the engineering reasoning behind it, and the seam locations the next 18 months will have to honor. Production AI architecture has a structural shape. Whiteboard arguments do not.
Each criterion has three paths: when to build, when to buy, and when the right answer is a hybrid with a specific seam. The full PDF includes seam-location guidance for each hybrid case; online, you get each criterion and its three-path summary.
When does this AI capability need to be in production with real users? Build is a 6-to-18 month commitment. Buy is a 4-to-12 week commitment. The horizon mismatch is the most common reason teams build something they should have bought.
Build: production deadline > 9 months out
Buy: production deadline < 6 months
Hybrid: ship vendor v1, swap to in-house v2
Is this AI capability the thing customers cite when they choose you over a competitor? If yes, owning the model behavior matters more than ship-speed. If no, owning the model is engineering tax on a non-differentiator.
Build: AI is the product (or the moat)
Buy: AI is internal productivity, not customer-facing
Hybrid: own the orchestration, rent the model
Does this capability need to ingest data your contracts, regulations, or internal policies prohibit from leaving your infrastructure? Vendor solutions have a data-flow shape. If your data cannot fit that shape, the math changes regardless of cost.
Build: data cannot leave your VPC or region
Buy: vendor has the compliance posture you need
Hybrid: on-prem inference, cloud orchestration
Building production AI requires at least two engineers who can ship, monitor, and iterate models in production. Not "interested in AI". Not "took an online course". Engineers who have done it. If you cannot name those two engineers, building is a hiring decision before it is a build decision.
Build: two or more named production-ML engineers on staff
Buy: zero or one production-ML engineer
Hybrid: contract the build, transition to in-house at month 9
Build TCO and Buy TCO have different shapes. Build is engineering salary + infrastructure + opportunity cost. Buy is per-seat or per-call pricing + integration + lock-in cost. The crossover point is volume-dependent and almost always different from what either side estimates on day one.
Build: usage volume > vendor pricing crossover
Buy: usage volume < crossover, with margin
Hybrid: negotiate enterprise tier, cap exposure
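The crossover logic above can be sketched in a few lines. All figures here are illustrative assumptions (placeholder salaries, infra costs, and per-call pricing), not benchmarks from the framework: build cost is roughly fixed per month, buy cost scales with call volume, and the crossover is where they meet.

```python
# Hypothetical TCO crossover sketch. Every number below is an assumption
# for illustration; plug in your own figures.

def monthly_build_cost(eng_salaries=50_000, infra=8_000, opportunity=12_000):
    """Fixed monthly cost of building: salaries + infra + opportunity cost."""
    return eng_salaries + infra + opportunity

def monthly_buy_cost(calls, price_per_call=0.02, integration_amortized=2_000):
    """Volume-dependent cost of buying: per-call pricing + amortized integration."""
    return calls * price_per_call + integration_amortized

def crossover_volume(price_per_call=0.02, integration_amortized=2_000):
    """Call volume per month at which build and buy cost the same."""
    return (monthly_build_cost() - integration_amortized) / price_per_call

if __name__ == "__main__":
    # Below this volume, buy is cheaper; above it, build starts to win.
    print(f"Crossover at ~{crossover_volume():,.0f} calls/month")
```

The point of the sketch is the shape, not the numbers: because build is mostly fixed cost and buy is mostly variable cost, the crossover is acutely sensitive to the per-call price and to volume growth, which is why day-one estimates on both sides are usually wrong.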
Every vendor will eventually deprecate, get acquired, raise prices, or pivot. Build positions you for those events. Buy positions you to react to them. The right call depends on how costly an emergency vendor swap would be in 24 months.
Build: swap cost > 6 months engineering effort
Buy: swap cost < 1 quarter engineering effort
Hybrid: abstraction layer over multi-vendor backends
HIPAA. SOC 2. FedRAMP. GDPR. Each compliance regime has a vendor-eval shape. Some vendors satisfy them. Some make you carry the compliance burden anyway. Build means you own compliance from layer one.
Build: no vendor satisfies your full compliance posture
Buy: vendor passes your auditor and indemnifies
Hybrid: vendor for non-regulated workload, build for regulated
A vendor API call is shallow integration. A vendor model embedded in your write path with custom evaluations and human-in-the-loop is deep integration. Deep integrations cost the same to build either way, but the on-call and ownership question is different.
Build: deep integration, custom evals, on-call owns it
Buy: shallow integration, standard API call
Hybrid: vendor model, custom evaluation harness
How often will the model behavior need to change? Once a quarter is buy territory. Once a sprint is build territory. The wrong answer here is the most common cause of "we built this and now it is too slow to change" or "we bought this and now we cannot get the vendor to ship our fix".
Build: behavior changes weekly or per-customer
Buy: behavior is stable, vendor cadence is enough
Hybrid: vendor base, custom prompts and evals on top
When the system misbehaves in production, do you need to debug at the model level, or is "filed a vendor support ticket" an acceptable resolution path? Closed-weight vendor models give you observability over the input and output. Build or open-weight gives you observability over the activations.
Build: debug at the activation level, fine-tune on incident data
Buy: input/output observability is enough
Hybrid: open-weight model, vendor inference
The conversation shifts from "I think we should build" to "criterion 4 says we cannot build because we do not have the talent, criterion 9 says we should build because the cadence is weekly". That is a different meeting. It ends faster, with consensus, and the decision survives the next exec review.
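The meeting described above reduces to a tally. A minimal sketch of the decision matrix, assuming hypothetical criterion names and verdicts (the real labels and seam guidance live in the PDF):

```python
# Hypothetical decision-matrix tally. Criterion names and the example
# verdicts are illustrative assumptions, not the framework's actual content.
from collections import Counter

# Each of the 10 criteria resolves to one path: "build", "buy", or "hybrid".
matrix = {
    "time_horizon":      "buy",     # deadline < 6 months
    "differentiation":   "build",   # AI is the product
    "data_constraints":  "hybrid",  # on-prem inference, cloud orchestration
    "talent":            "buy",     # one production-ML engineer on staff
    "tco":               "buy",     # volume below vendor crossover
    "vendor_risk":       "hybrid",  # abstraction layer over backends
    "compliance":        "buy",     # vendor passes the auditor
    "integration_depth": "build",   # deep integration, custom evals
    "change_cadence":    "build",   # behavior changes weekly
    "observability":     "hybrid",  # open-weight model, vendor inference
}

tally = Counter(matrix.values())
print(dict(tally))
```

The tally is not the decision by itself; a single hard constraint (say, data that cannot leave your VPC) can veto the majority. But writing the matrix down turns "I think we should build" into ten arguable, auditable rows.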
Whether you build, buy, or hybrid, the framework surfaces 3 specific architectural commitments your future team will live with. Knowing them on day one means you design for them. Discovering them at month 12 means you rebuild around them.
The framework, filled in, is the artifact that explains the call to whoever inherits it. New CTOs do not have to re-litigate the decision. They have a written record of which criteria drove which call and where the seams are. That alone is worth the framework even if your call would not change.
Each misaligned criterion is a specific risk with a specific mitigation. That is not "we made the wrong call" panic. It is a prioritized risk register the engineering team can execute against. Or that you can hand to me, if you want it sized and scoped.
One email. The PDF, the editable decision matrix, and the seam-location decision tree for hybrid cases. No drip sequence, no nurture funnel, no tactics.
Get the framework