The literature you cite but never read

Open the last paper you submitted and count the references. Forty? Sixty? Now be honest with yourself about a smaller number: how many of those did you read in full — methods, limitations, the supplementary tables — rather than skim to the abstract, recognise from a talk, or lift wholesale from the related-work section of a paper that was itself doing the same thing?

This is not an accusation. It is a description of the incentive structure you work inside. A citation looks like a scholarly act, but mechanically it is a copy operation, and copy operations propagate whatever they copy — including the mistakes.

The misprint that gave the game away

The cleanest evidence that citations are copied rather than read came from an unlikely source: typos. Mikhail Simkin and Vwani Roychowdhury noticed that when a paper's bibliographic details are mistyped — a wrong page number, a transposed volume — the same error reappears across dozens of later papers. If everyone citing a work had actually retrieved it, each would reproduce the correct details independently and errors would not cluster. Instead the errors travel in identical form, which only happens if citers are copying each other's reference lists. From the statistics of how these misprints propagate, Simkin and Roychowdhury estimated that only about 20% of citers had actually read the paper they cited.
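The logic of that estimate can be sketched in a few lines. This is a simplified first-order illustration, not the authors' published model. It assumes every distinct misprint was introduced once by someone typing the reference fresh, and every repeat of it was copied from another reference list. The function name and the sample data are hypothetical.

```python
from collections import Counter


def estimate_reader_fraction(misprinted_citations):
    """Return distinct misprints divided by total misprinted citations.

    Assumption (ours, for illustration): each distinct misprint marks one
    citer who typed the reference independently, and each repeat marks a
    citer who copied it from someone else's reference list.
    """
    counts = Counter(misprinted_citations)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one misprinted citation")
    distinct = len(counts)
    return distinct / total


# Hypothetical data: the wrong page number as it appears in ten citing papers.
sample = ["p. 1127"] * 6 + ["p. 1172"] * 3 + ["p. 127"]

fraction = estimate_reader_fraction(sample)
print(f"Independent typists among misprinted citations: {fraction:.0%}")
```

The fewer distinct errors there are relative to the total, the more copying is needed to explain the pattern. Clustering is the signal: ten citers who each retrieved the paper would be unlikely to make the same slip six times.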

Sit with that ratio for a second. It means the modal citation in your field is a claim about a paper made by someone who did not read the paper, on the authority of someone else who also did not read it.

The half that nobody reads at all

The problem starts one step earlier, at the paper itself. In a widely quoted review of citation analysis, Lokman Meho reported the uncomfortable folk statistics of the trade: something on the order of half of published papers are read by essentially no one beyond their authors, referees, and journal editors, and a large majority are never cited even once. Bibliometricians argue about the exact figures, but the shape is not in dispute. The literature is enormously wider than the slice of it that is genuinely engaged with.

A field does not know most of what it has published. It knows the handful of papers everyone cites, and it trusts that the handful summarised the rest faithfully. That trust is rarely tested and frequently misplaced.

Do the reading-time arithmetic on your own PhD

Here is the exercise that makes it visceral. Suppose your corner of the field has published a few thousand relevant papers, and more arrive every week. The STM report has documented for years that the global literature grows by roughly 5% annually, which is a doubling time of about fourteen years. Suppose, generously, that you can read and truly absorb one paper an hour and that you can protect ten such hours a week from teaching, admin, and the lab. That is about 500 papers a year, against a backlog of thousands that keeps growing while you read.
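The arithmetic is easy to run for your own numbers. The sketch below uses the figures from the paragraph above: 5% annual growth, one paper an hour, ten hours a week. Two inputs are assumptions added for illustration: a starting field of 5,000 relevant papers and 50 working weeks a year. The function names are hypothetical.

```python
import math


def doubling_time_years(annual_growth):
    """Years for a quantity growing at a fixed annual rate to double."""
    return math.log(2) / math.log(1 + annual_growth)


def papers_read_per_year(papers_per_hour, hours_per_week, weeks_per_year):
    """Reading capacity under a fixed weekly budget."""
    return papers_per_hour * hours_per_week * weeks_per_year


def unread_backlog(field_size, annual_growth, read_per_year, years):
    """Return the unread count at the end of each year.

    Assumes you start having read nothing and that new papers arrive in
    proportion to everything published so far.
    """
    backlog = []
    published = float(field_size)
    unread = float(field_size)
    for _ in range(years):
        new_papers = published * annual_growth
        published += new_papers
        unread = max(0.0, unread + new_papers - read_per_year)
        backlog.append(unread)
    return backlog


growth = 0.05              # from the text: roughly 5% a year
capacity = papers_read_per_year(1, 10, 50)   # 50 weeks is an assumption
field = 5000               # assumed size of "a few thousand" papers

print(f"Doubling time: {doubling_time_years(growth):.1f} years")
print(f"Reading capacity: {capacity} papers a year")
for year, unread in enumerate(unread_backlog(field, growth, capacity, 5), 1):
    print(f"End of year {year}: about {unread:.0f} papers unread")
```

Under these assumptions the backlog shrinks slowly at first but is still in the thousands at the end of a five-year PhD. Yearly arrivals overtake yearly reading once the field passes capacity divided by growth, here 500 / 0.05 = 10,000 papers. Change the starting size or the weekly hours and the picture shifts, which is the reason to run it on your own field.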

No individual can hold a field in their head by reading. So we don't. We triage by proxy: we read what is cited, cite what is read, and quietly trust that the citation graph did the reading for us. The trouble is that the citation graph, as Simkin and Roychowdhury showed, was mostly built by people making the same bet you are.

Why this is a quality problem, not a productivity one

You might reasonably say: so what? Nobody can read everything, triage is rational, and the important papers rise to the top. But the mechanism that lets you skip reading is the same mechanism that lets an error survive. When a claim is copied forward without anyone returning to the source, three things stop happening: nobody notices the source was retracted, nobody notices the source actually said the opposite under a condition you care about, and nobody notices that the source cited its source the same lazy way. The chain of trust has no one checking it at any link.

This is the quiet crisis under the loud ones. Before we argue about whether findings replicate or whether peer review works, there is a more basic fact: the connective tissue of science — the citations that turn isolated papers into a body of knowledge — is largely unverified. We have built a cathedral of claims and inspected almost none of the joints.

Where this series is going

The instinct is to feel guilty and resolve to read more. That doesn't scale, and guilt is a bad research strategy. The better move is to stop treating "I cited it" as if it meant "it holds," and to start asking what would have to be true for a citation to actually carry weight. That is what the next four parts are about: what a citation is supposed to promise, why the promise so often fails silently, what citation counts actually measure instead, and what it would take to read a literature for its arguments rather than its popularity. We begin, in the next part, with the most common and most costly confusion in the whole enterprise — mistaking being cited for being supported.

REFERENCES