Think North Learning
thinknorth.consulting
GENERATIVE IMAGES Mystery 6 min

The Forger and the Detective

01 · THE SETUP

In 2019 a website appeared with a self-explanatory name: thispersondoesnotexist.com. Refresh: a photorealistic face. Refresh: another. Pores, stray hairs, catchlights in the eyes, a slightly awkward smile. None of these people had ever existed.

No artist drew them. No photos were blended. And the strangest part: no one had ever told the system what “realistic” means. So where did the faces come from?

Design it yourself for a moment. You want a machine to get better and better at producing realistic fakes — but no human can sit there rating millions of attempts. What could supply the pressure to improve?

02 · YOUR CALL

How do you train a forger with no teacher?

If you pick A

The supervised instinct — and it stalls immediately: almost every photo is 'realistic', so the labels teach nothing about what makes a generated image fail. You'd need labels for mistakes the generator hasn't made yet. The pressure has to come from somewhere that adapts.

If you pick B — the mechanism

That's the trick — a generative adversarial network. The forger generates; the detective gets a mix of real photos and forgeries and must tell them apart. Every time the detective wins, the forger learns; every time the forger wins, the detective learns. 'Realistic' is never defined — it's whatever survives an ever-better detective.

If you pick C

Reasonable guess — but averaging faces produces exactly what you'd fear: a blurry, uncanny nobody. The site's faces were sharp and specific, with individual quirks. Whatever made them wasn't blending existing people; it was producing new ones.

If you pick D

A sensible theory (and a common accusation levelled at generative AI). But collage leaves seams — lighting that doesn't match, geometry that doesn't cohere. These faces were globally consistent: one light source, one skull, one skin. They were synthesised whole, not assembled.

03 · NOT SO FAST

The natural assumption is that someone, somewhere, defined realism — a checklist of skin textures and eye reflections that the machine works through.

It makes sense; that's how graphics engines work. But no checklist survives contact with photography's infinite detail. The actual solution contains a beautiful inversion: nobody defined realism at all. They defined a game whose only winning move was realism.

04 · THE MECHANISM
[Figure: two-panel diagram. 2014–2020, the duel (GAN): the forger (generator) and the detective (discriminator), with real photos mixed in and caught fakes feeding back as learning. 2022 onward, diffusion: pure noise becomes an image step by step, steered by the text prompt at every step (Midjourney, DALL·E, Stable Diffusion, Imagen; video: Sora, Veo). The dark corollary: every published fake-detector becomes training pressure for the next forger, so detection loses the long game; trust provenance (C2PA), not pixels.]
Caption: The GAN duel that made faces, and the diffusion process that replaced it.

The architecture is the generative adversarial network (GAN), introduced in 2014: a generator and a discriminator locked in an arms race, each one's progress becoming the other's curriculum. It gave the world its first shockingly real synthetic faces, and it helped power the deepfakes that followed.
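
To make the arms race concrete, here is a minimal sketch of the duel on single numbers instead of photos. Everything in it is an illustrative assumption rather than the setup behind any real face generator: "real photos" are numbers drawn around 4.0, the forger has one parameter (theta, the centre of its output), and the detective is a two-parameter logistic classifier (w, c). The names detective_step, forger_step and train are hypothetical.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def detective(x, w, c):
    """Detective's estimated probability that sample x is real."""
    return sigmoid(w * x + c)


def detective_step(reals, fakes, w, c, lr):
    """One gradient step that improves the detective on this batch."""
    gw = gc = 0.0
    for x in reals:  # push D(real) toward 1
        err = 1.0 - detective(x, w, c)
        gw += err * x
        gc += err
    for x in fakes:  # push D(fake) toward 0
        err = detective(x, w, c)
        gw -= err * x
        gc -= err
    n = len(reals) + len(fakes)
    return w + lr * gw / n, c + lr * gc / n


def forger_step(fakes, theta, w, c, lr):
    """One gradient step that shifts the forger toward what the detective accepts."""
    # Gradient of log D(fake) with respect to theta, where fake = theta + noise.
    g = sum((1.0 - detective(x, w, c)) * w for x in fakes) / len(fakes)
    return theta + lr * g


def train(steps=2000, real_mean=4.0, batch=32, lr=0.05, seed=0):
    """Alternate detective and forger updates; returns (theta, w, c)."""
    rng = random.Random(seed)
    theta, w, c = 0.0, 0.0, 0.0
    for _ in range(steps):
        reals = [rng.gauss(real_mean, 1.0) for _ in range(batch)]
        fakes = [theta + rng.gauss(0.0, 1.0) for _ in range(batch)]
        w, c = detective_step(reals, fakes, w, c, lr)
        fakes = [theta + rng.gauss(0.0, 1.0) for _ in range(batch)]
        theta = forger_step(fakes, theta, w, c, lr)
    return theta, w, c


if __name__ == "__main__":
    print(train())
```

Note what is missing: no line of this code says what a real sample looks like. The forger's only signal is the detective's current opinion, so theta moves in whichever direction the detective currently rewards. In a toy like this the pair may overshoot and oscillate rather than settle, which is a small-scale version of the instability that made full-size GANs hard to train.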

But here's the update your 2023-era mental model needs: for mainstream image generation, GANs lost. The image generators you actually use (Midjourney, DALL·E, Stable Diffusion, Imagen, and video models like Sora and Veo) are built largely on diffusion models, a different idea entirely. Take a real image and drown it in noise step by step; train a network to run each step backwards by estimating the noise that was added. To generate, hand it pure noise and let it 'de-noise' its way to an image, guided by your text prompt, which steers every step. Diffusion won because the duel was unstable to train and prone to collapsing onto a few safe outputs (mode collapse), while step-by-step denoising is stable, controllable, and pairs naturally with language.
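
The noising half of that recipe can be sketched in a few lines. This follows the closed-form noising rule and linear schedule described in Ho et al. (2020), applied to a three-number stand-in for an image. The function names are hypothetical, and the "denoiser" here is an oracle that is simply handed the true noise; in a real system a trained network has to estimate it.

```python
import math
import random


def alpha_bars(steps=1000, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal left after each step (linear beta schedule)."""
    out, prod = [], 1.0
    for t in range(steps):
        beta = beta_start + (beta_end - beta_start) * t / (steps - 1)
        prod *= 1.0 - beta
        out.append(prod)
    return out


def add_noise(x0, alpha_bar, rng):
    """Jump straight to the noised version of x0 at a given step; returns (x_t, eps)."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    keep, noise = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    x_t = [keep * x + noise * e for x, e in zip(x0, eps)]
    return x_t, eps


def recover(x_t, eps_hat, alpha_bar):
    """Undo the noising given an estimate eps_hat of the noise that was added."""
    keep, noise = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(x - noise * e) / keep for x, e in zip(x_t, eps_hat)]


if __name__ == "__main__":
    rng = random.Random(0)
    schedule = alpha_bars()
    image = [0.2, -0.5, 1.0]  # stand-in for pixel values
    noisy, true_noise = add_noise(image, schedule[500], rng)
    print(recover(noisy, true_noise, schedule[500]))
```

With the true noise in hand, recover returns the original values up to floating-point error, which is why training reduces to "guess the noise". A real diffusion model replaces the oracle with a network that takes the noisy image, the step number and the text prompt, and generation walks back over many small steps rather than inverting in one jump.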

The duel leaves a dark corollary, and it matters in practice. In 2026, deepfake detection tools exist and help at the margins, but the duel's logic runs against them: every published detector becomes training pressure for the next generator. The durable answer flips the question from “is this fake?” to “where did this come from?”: cryptographic provenance standards like C2PA Content Credentials, now shipping in cameras, phones and editing tools, attach a signed, tamper-evident history to authentic media, starting at the moment of capture.
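
The provenance idea can be illustrated without any image analysis at all. The sketch below is not the C2PA format: C2PA relies on certificate-based digital signatures and its own manifest structure, whereas this example substitutes a shared-secret HMAC from the Python standard library purely to show tamper evidence. The field names and the functions make_credential and verify are hypothetical.

```python
import hashlib
import hmac
import json


def make_credential(content: bytes, claims: dict, key: bytes) -> dict:
    """Bind claims about the media to a hash of its bytes, then sign the result."""
    manifest = dict(claims, sha256=hashlib.sha256(content).hexdigest())
    payload = json.dumps(manifest, sort_keys=True).encode()
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"manifest": manifest, "signature": signature}


def verify(content: bytes, credential: dict, key: bytes) -> bool:
    """True only if the manifest is unaltered and still matches the content bytes."""
    manifest = credential["manifest"]
    payload = json.dumps(manifest, sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, credential["signature"]):
        return False  # the claims were edited after signing
    actual = hashlib.sha256(content).hexdigest()
    return hmac.compare_digest(manifest["sha256"], actual)


if __name__ == "__main__":
    key = b"demo-key"  # stand-in for a device's signing key
    photo = b"raw sensor bytes"
    cred = make_credential(photo, {"device": "hypothetical-camera"}, key)
    print(verify(photo, cred, key))            # original file
    print(verify(photo + b"edit", cred, key))  # altered file
```

The check never asks whether the pixels look real. It asks whether the bytes and the claims are the ones that were signed, which is a question a better forger cannot train against.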

05 · BACK TO THE OPENING

So the mystery site wasn't a gimmick — it was the public debut of machines that can imagine: produce specific, coherent things that never existed. And the answer to “who drew them?” — nobody; a duel drew them — is also the answer to a question you didn't know you were asking: why the fakes keep beating the fake-detectors. The same game that created the faces guarantees it.

06 · TAKE THIS WITH YOU

Your rule: for any image or video that matters, don't ask “does it look real?” — your eyes lost that contest in 2019 — and don't rely on detectors alone. Ask “what's its provenance?” Where was it first published, by whom, and does it carry content credentials? Trust chains of custody, not pixels.

REFERENCES
  1. Goodfellow et al. (2014) — Generative Adversarial Networks
  2. Ho et al. (2020) — Denoising Diffusion Probabilistic Models
  3. C2PA — Content Credentials: an open standard for media provenance