Reading a codebase like a balance sheet
Here is the strangest ritual in software M&A, and once you see it you cannot unsee it. In most acquisitions, the codebase is simultaneously the asset being purchased and the least-examined line in the data room. Revenue gets forensic accountants. Contracts get lawyers. The code — the machine that produces the revenue — gets a senior engineer's impression after a repository walkthrough and a few pointed questions. Everyone in the room knows this is thin. Nobody has had a better instrument.
Our view at Think North, having sat on both sides of these calls, is that a codebase should be read the way an accountant reads a balance sheet: as a schedule of assets and liabilities, each line supported by evidence rather than by the seller's narrative.
The one question almost nobody asks precisely
The single most valuable diligence question is rarely asked with any precision: what fraction of this codebase is unique engineering IP, and what fraction is vanilla integration?
Every production system is a blend. Some functions embody proprietary judgment — the pricing model, the matching algorithm, the scheduling heuristics that took three years of iteration to get right. These are the Specialist Engineers of the codebase, and they are what you are actually buying. The rest is glue: CRUD around a database, wrappers around Stripe and Twilio and S3, serialization, notification plumbing. Necessary, professional — and reproducible by any competent team in months, because it encodes no secret.
CodeNSM computes this split mechanically. Because every function is classified into a workplace archetype, the business-versus-technical decomposition falls out of the census: business-logic roles versus infrastructure roles, and within them, code whose structure suggests accumulated domain judgment versus code that shadows a vendor's SDK. Two codebases of identical size can carry wildly different splits — and should command wildly different multiples. A 60,000-line product that is 70% vanilla integration is, bluntly, a well-organised implementation project wearing a product's valuation.
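A minimal sketch of how such a split could be tallied once every function carries a classification. The `FunctionRecord` fields, the role labels and the line-count weighting are assumptions made for illustration; they are not CodeNSM's actual schema or method.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionRecord:
    # Hypothetical census row; field names are illustrative only.
    name: str
    lines: int
    role: str            # "business" or "infrastructure" (assumed labels)
    vendor_shadow: bool  # True when the function mostly mirrors a vendor SDK


def ip_split(functions):
    """Return the share of total lines in each (role, kind) bucket."""
    totals = defaultdict(int)
    for fn in functions:
        kind = "vanilla" if fn.vendor_shadow else "unique"
        totals[(fn.role, kind)] += fn.lines
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {bucket: lines / grand_total for bucket, lines in totals.items()}


def vanilla_share(functions):
    """Line-weighted share of the codebase classed as vanilla integration."""
    split = ip_split(functions)
    return sum(share for (_, kind), share in split.items() if kind == "vanilla")


# Invented sample data, sized so the arithmetic is easy to follow.
census = [
    FunctionRecord("price_quote", 300, "business", False),
    FunctionRecord("stripe_charge", 200, "infrastructure", True),
    FunctionRecord("s3_upload", 250, "infrastructure", True),
    FunctionRecord("order_crud", 250, "business", True),
]
print(f"vanilla share: {vanilla_share(census):.0%}")
```

Weighting by lines is only one choice; weighting by call volume or by revenue-bearing paths would answer a different diligence question and can move the split considerably.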
Now the fun part: the liabilities
Ward Cunningham's original 1992 debt metaphor was an accounting metaphor from the first sentence; he was, after all, describing a portfolio management system. Debt lets you ship sooner; the danger, he wrote, comes when the debt is not repaid, and every minute spent on not-quite-right code counts as interest. Diligence should therefore price the interest, not the principal: not "how much ugly code exists" but "how much fragile code sits in load-bearing positions." Tornhill and Borg's Code Red study put empirical weight behind this: across 39 proprietary codebases, code in their lowest-quality band carried about 15 times more defects than healthy code, took on average more than twice as long to change, and had far less predictable change times. That variance is precisely what an acquirer's integration plan inherits.
Three liability lines we look for in the telemetry:
- Debt that hurts. The intersection of fragility and traffic. A quarantined mess is a footnote; a fragile function on the checkout path is a liability with a coupon.
- Idle inventory. The dormant share: functions shipped, maintained, and never called in production. In CodeNSM's own fleet telemetry it is routine to find a quarter or more of functions dormant at any moment; whether that floor is a general law is one of our pre-registered research hypotheses. Either way, buyers should know they are paying maintenance on it, and Lehman's laws of software evolution (continuing growth, increasing complexity) give reason to expect the share to grow with age unless someone actively prunes it.
- Key-person exposure. Join runtime value to git history and you can see whether the functions carrying the traffic were written and maintained by one person who may not survive the earn-out.
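The three liability lines can be sketched as simple queries over per-function telemetry joined to git authorship. Everything here is an assumption for illustration: the `FunctionTelemetry` fields, the 0-to-1 fragility score, the 0.5 threshold and the sample data are hypothetical, not CodeNSM's real metrics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionTelemetry:
    # Hypothetical joined row: runtime counters plus git authorship.
    name: str
    calls: int        # production calls in the observation window
    fragility: float  # 0.0 (healthy) to 1.0 (fragile); assumed score
    authors: tuple    # distinct committers taken from git history


def debt_that_hurts(fns, min_fragility=0.5):
    """Fragile functions that carry traffic, ranked by calls * fragility."""
    fragile = [f for f in fns if f.fragility >= min_fragility and f.calls > 0]
    return sorted(fragile, key=lambda f: f.calls * f.fragility, reverse=True)


def dormant_share(fns):
    """Fraction of functions never called during the window."""
    if not fns:
        return 0.0
    return sum(1 for f in fns if f.calls == 0) / len(fns)


def key_person_exposure(fns):
    """Share of total call load carried by single-author functions, per author."""
    total = sum(f.calls for f in fns)
    exposure = {}
    if total == 0:
        return exposure
    for f in fns:
        if len(f.authors) == 1:
            author = f.authors[0]
            exposure[author] = exposure.get(author, 0.0) + f.calls / total
    return exposure


# Invented sample fleet.
fleet = [
    FunctionTelemetry("checkout_total", 9000, 0.8, ("dana",)),
    FunctionTelemetry("apply_coupon", 3000, 0.6, ("sam",)),
    FunctionTelemetry("send_receipt", 1000, 0.2, ("dana", "lee")),
    FunctionTelemetry("legacy_export", 0, 0.9, ("lee",)),
    FunctionTelemetry("old_report", 0, 0.1, ("lee",)),
]

print([f.name for f in debt_that_hurts(fleet)])
print(f"dormant share: {dormant_share(fleet):.0%}")
print(key_person_exposure(fleet))
```

Note that `legacy_export` is the most fragile function in the sample yet never reaches the register, because it carries no traffic: it is idle inventory, not debt that hurts.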
The narrative in the data room says "modern, well-architected platform." The runtime says which functions actually showed up to work this quarter. Diligence the second one.
Why reading the code is necessary and nowhere near enough
None of this replaces reading the code: architecture review, security review and license scans remain table stakes. But static inspection answers "is this code well made?", not "is this code doing the work?" Only production telemetry shows the concentration of value (in our fleet data, call load is heavily Pareto-concentrated, which is another registered hypothesis), the true error economics at the system's boundaries, and whether the crown-jewel algorithm the pitch deck celebrates is actually on the hot path or quietly bypassed by a workaround from 2023. Stripe's 2018 Developer Coefficient report estimated that developers lose more than 17 hours of a roughly 41-hour week, above 40%, to maintenance and bad code; an acquirer is buying that ratio, whatever it is, and deserves to see it measured rather than asserted.
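To make the concentration and hot-path questions concrete, here is a small sketch that ranks functions by call count, measures how much of the load the busiest fifth carries, and checks whether a named function is among them. The function names, the 20% cut-off and the counts are invented for illustration.

```python
import math


def hot_path(calls_by_name, top_fraction=0.2):
    """Names of the busiest `top_fraction` of functions, busiest first."""
    ranked = sorted(calls_by_name, key=calls_by_name.get, reverse=True)
    k = max(1, math.ceil(len(ranked) * top_fraction))
    return ranked[:k]


def top_share(calls_by_name, top_fraction=0.2):
    """Share of all calls carried by the functions on the hot path."""
    total = sum(calls_by_name.values())
    if total == 0:
        return 0.0
    hot = hot_path(calls_by_name, top_fraction)
    return sum(calls_by_name[name] for name in hot) / total


# Hypothetical call counts for one observation window.
calls = {
    "match_v1_workaround": 5000,
    "render_page": 3000,
    "auth_check": 800,
    "load_profile": 400,
    "send_email": 300,
    "audit_log": 200,
    "export_csv": 150,
    "resize_image": 100,
    "rotate_keys": 50,
    "match_engine_v2": 0,  # the algorithm the pitch deck celebrates
}

print(f"top 20% of functions carry {top_share(calls):.0%} of calls")
print("crown jewel on hot path:", "match_engine_v2" in hot_path(calls))
```

In this invented sample the celebrated `match_engine_v2` receives no calls at all while the workaround carries half the load, which is exactly the kind of finding a repository walkthrough would miss.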
What lands on the table when someone does this properly
Concretely: an archetype census with the business/technical and unique-IP/vanilla splits; a health North-Star with trend; the debt-that-hurts register ranked by traffic; the dormancy schedule; boundary error economics per external dependency; and the developer-to-function value map. Every line reproducible from telemetry, none of it dependent on the seller's adjectives. For sellers, the incentive is symmetrical — a founder who can hand over that pack six months before a process starts is negotiating from evidence instead of defending against suspicion.
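As one example of a line in that pack, boundary error rates per external dependency could be tallied roughly as follows. The log format (a dependency name paired with a success flag) and the sample figures are assumptions; a real schedule would presumably also attach a cost to each failure, which this sketch leaves out.

```python
from collections import defaultdict


def boundary_error_rates(call_log):
    """Summarise outbound calls per dependency.

    call_log: iterable of (dependency, succeeded) pairs (hypothetical format).
    Returns {dependency: (calls, errors, error_rate)}.
    """
    calls = defaultdict(int)
    errors = defaultdict(int)
    for dependency, succeeded in call_log:
        calls[dependency] += 1
        if not succeeded:
            errors[dependency] += 1
    return {
        dep: (calls[dep], errors[dep], errors[dep] / calls[dep])
        for dep in calls
    }


# Invented sample log: 100 Stripe calls with 2 failures, 50 Twilio calls with 5.
log = (
    [("stripe", True)] * 98 + [("stripe", False)] * 2
    + [("twilio", True)] * 45 + [("twilio", False)] * 5
)

# Print dependencies from worst error rate to best.
rates = boundary_error_rates(log)
for dep, (n, failed, rate) in sorted(rates.items(), key=lambda kv: kv[1][2], reverse=True):
    print(f"{dep}: {failed}/{n} failed ({rate:.1%})")
```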
One more line item the LLM era added: provenance risk. When a meaningful share of the codebase was machine-drafted, the classic diligence question "who understands this code?" can have the answer "no one ever did." Runtime health per function is the only instrument that prices that risk without depending on anyone's memory.
The balance sheet metaphor ends where all metaphors do. But the discipline transfers: assets itemised, liabilities priced, evidence over narrative. Code deserves the same accounting rigour as the revenue it generates, especially at the moment someone is writing a cheque for it.
References
- Cunningham, W. (1992). The WyCash Portfolio Management System. OOPSLA '92 experience report — origin of the technical-debt metaphor.
- Tornhill, A. & Borg, M. (2022). Code Red: The Business Impact of Code Quality.
- Lehman, M.M. (1980). Programs, Life Cycles, and Laws of Software Evolution. Proceedings of the IEEE.
- Stripe (2018). The Developer Coefficient.
- Sculley, D. et al. (2015). Hidden Technical Debt in Machine Learning Systems. NeurIPS.