Intelligence has only ever copied itself.
Every breakthrough in artificial intelligence has been an act of imitation. Neural networks copy neurons. Diffusion models copy entropy. Genetic algorithms copy evolution. Transformers copy analogy. Reinforcement learning copies trial and reward. Bayesian inference copies belief update. Even pre-training copies the way Locke and Hume thought humans learn — as a blank slate written upon by experience.
The deep claim of this atlas is this: AI is not a single technology. It is a parliament of borrowed paradigms, each with its own ancestor in nature or in human thought. To understand AI is to understand what each paradigm imitates, how it imitates it, and — crucially — what is still missing.
Two ancient capacities of the human mind — Einsteinian imagination and felt emotion — have no artificial analogue. They are the atlas's terra incognita. Everything else, we have already built.
Eleven paradigms. Two open questions.
Each cell below is a self-contained chapter — historical, mechanical, applied, speculative.
§ I · Mathematics · Fractal Geometry
Self-similar structure across scales, recursively expanded into pixels.
§ II · Physics · Entropy
Order built by reversing thermal noise — running the second law backwards.
§ III · Biology · Evolution by Natural Selection
Iterated selection and variation: the only process in the universe known to produce design without a designer.
§ IV · Brain · Structure · The Biological Brain
Eighty-six billion specialised cells, abstracted into matrix multiplications.
§ V · Mind · Empiricism · Learning from History
John Locke's blank slate, written on by the entire internet.
§ VI · Mind · Empiricism · Learning through Reward and Punishment
Edward Thorndike's law of effect, scaled to superhuman game-play and reasoning.
§ VII · Mind · Rationalism · Symbolic Reasoning
From a handful of axioms, infinitely many sentences. From a handful of rules, the whole of mathematics.
§ VIII · Mind · Analogy · Analogical Reasoning
Everything thinks by similarity. The trick is computing it in a high-dimensional vector space.
§ IX · Mind · Bayesianism · Probabilistic Belief Update
Beliefs are probability distributions. Evidence reshapes them. Rationality is the arithmetic of revision.
§ X · Mind · The Open Question · Einsteinian Imagination
What machine could imagine itself riding alongside a beam of light?
§ XI · Mind · The Open Question · Emotion-Guided Cognition
Without feeling, there is no purpose. Without purpose, no thought.
Fractal Geometry
Self-similar structure across scales, recursively expanded into pixels.
- Domain: Mathematics
- Mimics: Fractal Geometry
- Method: Fractal Generative Models
- Status: Frontier
History
Benoît Mandelbrot named fractals in 1975, but the underlying geometry — coastlines, ferns, clouds, river deltas — predates humans by billions of years. Mandelbrot's quiet provocation was that the smooth Euclidean shapes of school geometry are the exception, and the broken, recursive forms of nature are the rule. For decades, fractals lived in mathematics, computer graphics, and chaos theory. Then, in 2025, a group at MIT around Kaiming He proposed Fractal Generative Models, treating the very architecture of a generative network as itself a fractal — generators inside generators — and producing pixel-by-pixel images of unprecedented coherence. For the first time, an AI did not just *render* fractals; it became one.
Mechanism
A fractal generative model stacks autoregressive generators inside each other: an outer generator predicts coarse patches, each patch is passed to an identical (but smaller) generator predicting finer patches, and so on down to single pixels. The same module repeats at every scale, exactly the way a coastline repeats at every zoom. Parameters are shared across scales, which makes the network shockingly small for the resolution it can address. The architecture is itself a recursion: the *structure* of computation mirrors the *structure* of the world it models.
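The structural idea can be sketched in a few lines. The toy below is not the MIT architecture — real fractal generative models use learned autoregressive modules at every scale — but it shows the recursion: one shared rule, applied at every zoom level, expands a single coarse value into a full grid of pixels. All names and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def refine(value, depth, size):
    """One shared 'generator': expand a coarse patch value into a 2x2 grid
    of finer values, then recurse. The same rule fires at every scale --
    the architectural idea behind fractal generative models."""
    if depth == 0:
        return np.full((size, size), value)
    out = np.zeros((size, size))
    half = size // 2
    for i in range(2):
        for j in range(2):
            # child value = parent value plus scale-dependent detail
            child = value + rng.normal(scale=0.5 ** (4 - depth))
            out[i*half:(i+1)*half, j*half:(j+1)*half] = refine(child, depth - 1, half)
    return out

image = refine(0.5, depth=3, size=8)   # an 8x8 "image", built coarse-to-fine
```

Because the refinement rule is shared across scales, the parameter count is constant no matter how deep the recursion goes — the toy analogue of the paper's small-network, high-resolution claim.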
Applications
High-resolution image synthesis without diffusion's denoising cost; structurally coherent terrain, vasculature, and crystal-growth simulators in scientific computing; texture synthesis for game engines that need self-similar detail at every camera distance; and a plausible blueprint for biological-scale generative models — DNA, vasculature, and neural tissue are all fractal in nature.
Future
Fractal generation suggests that future architectures may resemble the world they model. If neurons, lungs, river networks, and turbulence are all fractal, perhaps cognition itself is, and the right inductive bias is recursion, not depth.
Entropy
Order built by reversing thermal noise — running the second law backwards.
- Domain: Physics
- Mimics: Entropy
- Method: Diffusion Models · Boltzmann Machines
- Status: Mature
History
In 1985 Geoffrey Hinton and Terrence Sejnowski borrowed Ludwig Boltzmann's nineteenth-century statistical mechanics and built the Boltzmann Machine, a neural network whose neurons sampled at temperature. It worked, but barely scaled. The deeper idea — that intelligence could be cast as the reversal of an entropy-increasing process — waited thirty years. In 2015 Sohl-Dickstein et al. showed how to train a network to reverse a forward diffusion that gradually destroys data into pure Gaussian noise. In 2020 Ho, Jain, and Abbeel turned that into Denoising Diffusion Probabilistic Models, and overnight diffusion became the dominant paradigm for image, audio, and video synthesis. Every Stable Diffusion, Midjourney, Sora, and Veo image you have seen is, at root, a controlled act of physical violation: a system that locally decreases entropy.
Mechanism
Forward process: take an image and incrementally add Gaussian noise across hundreds or thousands of steps until it is indistinguishable from pure noise. Reverse process: train a neural network to predict, at each noise level, exactly what noise was added — and therefore how to subtract it. Sampling is then a walk backwards through the noise schedule: start with chaos, denoise one step, denoise again, and a coherent image emerges from a sequence of microscopic reversals. The mathematics is identical to Langevin dynamics in statistical physics; the network learns the score (the gradient of the log-density) of the data.
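The reverse walk can be demonstrated end to end on a toy one-dimensional "dataset". Because the data here is Gaussian, the score is available in closed form, so no network needs to be trained — a real diffusion model replaces `true_score` with a learned network. The schedule and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200
betas = np.linspace(1e-4, 0.05, T)      # forward noise schedule
alphas = 1.0 - betas
abar = np.cumprod(alphas)               # cumulative signal retention

m, s = 2.0, 0.5                          # "data" distribution: x0 ~ N(m, s^2)

def true_score(x, t):
    """Gradient of log p_t(x) for the noised marginal. For Gaussian data it
    is known exactly; a real model trains a network to approximate it."""
    mean = np.sqrt(abar[t]) * m
    var = abar[t] * s**2 + (1.0 - abar[t])
    return -(x - mean) / var

# Reverse process: start from pure noise, denoise step by step (DDPM).
n = 20000
x = rng.normal(size=n)
for t in range(T - 1, -1, -1):
    eps_hat = -np.sqrt(1.0 - abar[t]) * true_score(x, t)   # predicted noise
    x = (x - betas[t] / np.sqrt(1.0 - abar[t]) * eps_hat) / np.sqrt(alphas[t])
    if t > 0:
        x += np.sqrt(betas[t]) * rng.normal(size=n)

# x is now distributed approximately like the data: N(2, 0.5^2)
```

With the exact score, the backward walk reconstructs the data distribution from chaos; learning to approximate that score is the entire engineering burden of real diffusion models.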
Applications
Image generation (DALL·E 3, Stable Diffusion XL, Midjourney v6/v7, Flux), video (Sora, Veo 3, Kling, Runway), audio (AudioLM, Stable Audio), molecular design (AlphaFold 3 generates atomic coordinates with a diffusion module), protein binder design, robot motion planning, and physical simulation. Diffusion has become the universal solvent of generative modelling.
Future
Flow matching, rectified flow, and consistency models compress hundreds of denoising steps into one or two — pushing diffusion toward real-time generation. Diffusion language models (Mercury, LLaDA, Inception) challenge the autoregressive monopoly of GPT-style models. The deeper bet: every modality eventually becomes a diffusion problem.
Evolution by Natural Selection
Iterated selection and variation: the only process in the universe known to produce design without a designer.
- Domain: Biology
- Mimics: Evolution by Natural Selection
- Method: Genetic Algorithms · Neuroevolution
- Status: Mature
History
Darwin published On the Origin of Species in 1859, but Alan Turing was the first to ask whether the procedure could be mechanised. In 1948, in an unpublished report titled Intelligent Machinery, Turing sketched what we would now call an evolutionary search. John Holland formalised it in 1975 as the Genetic Algorithm. For decades GAs sat in optimisation textbooks, eclipsed by gradient descent. Then in 2017 OpenAI's Salimans, Ho, Chen, and Sutskever showed that evolution strategies could train Atari and MuJoCo policies competitively with gradient-based reinforcement learning, while parallelising almost perfectly across thousands of workers. Today neuroevolution drives AutoML, neural-architecture search, agent-population training, and — at planet scale — the search for entirely new model architectures.
Mechanism
Start with a population of candidate solutions, each encoded as a chromosome (parameter vector). Evaluate each against a fitness function. Select the best, recombine them (crossover), perturb them (mutation), and replace the population. Repeat for thousands of generations. The mathematics is uncannily close to stochastic gradient descent: both move parameters toward higher fitness via local exploration. The difference is that GAs work in non-differentiable, discontinuous, deceptive landscapes — where gradients lie or do not exist.
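The loop above, as a runnable sketch: a toy GA maximising the classic OneMax fitness (count of 1-bits), with elitist selection, single-point crossover, and per-bit mutation. All constants are illustrative.

```python
import random

random.seed(0)
GENES, POP, GENS, MUT = 30, 60, 80, 0.02

def fitness(chrom):                 # toy fitness: number of 1-bits ("OneMax")
    return sum(chrom)

def crossover(a, b):                # single-point crossover of two parents
    cut = random.randrange(1, GENES)
    return a[:cut] + b[cut:]

def mutate(chrom):                  # flip each bit with small probability
    return [g ^ 1 if random.random() < MUT else g for g in chrom]

pop = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP)]
for _ in range(GENS):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:POP // 2]        # selection: the fitter half survives
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP - len(parents))]
    pop = parents + children        # replacement: next generation

best = max(pop, key=fitness)
```

OneMax has a smooth landscape, so gradient descent would also solve it; swap in a non-differentiable or deceptive fitness function and the same loop keeps working — which is the whole point of the paradigm.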
Applications
Antenna design at NASA (the ST5 spacecraft's bent-paperclip antenna was evolved), AutoML and neural-architecture search (Google's AmoebaNet), evolving game-playing strategies (the entire history of AlphaStar's league play is an evolutionary tournament), protein design (Adaptyv, EvolutionaryScale), curriculum and prompt evolution for LLM agents (OpenEvolve, AlphaEvolve), and the search for new mathematical proofs and algorithms (DeepMind's FunSearch discovered cap-set constructions larger than any previously known).
Future
Open-ended evolution — populations that never converge, that keep inventing new niches — is the secret behind Earth's biosphere and a leading hypothesis for how to escape capability plateaus in AI. Quality-diversity algorithms, POET, and OMNI-EPIC are building toward AI that breeds, not just trains.
Bugs whose colour matches the warm-amber target survive and breed. Mutation rate controls exploration vs exploitation.
The Biological Brain
Eighty-six billion specialised cells, abstracted into matrix multiplications.
- Domain: Brain · Structure
- Mimics: The Biological Brain
- Method: Artificial Neural Networks
- Status: Mature
History
In 1943 Warren McCulloch and Walter Pitts proved that networks of threshold units could compute any Boolean function. Frank Rosenblatt built the first physical neural network, the Perceptron, in 1958, and the New York Times prematurely announced the dawn of thinking machines. Two AI winters followed — one after 1969, when Minsky and Papert proved single-layer perceptrons could not represent XOR, another in the 1990s when shallow nets were outclassed by support vector machines. Geoffrey Hinton, Yann LeCun, and Yoshua Bengio kept the flame. In 2012 AlexNet won ImageNet by a margin so large that the field abandoned every alternative paradigm. Every model that followed — GPT, Llama, Gemini, Claude, Grok, DeepSeek, Qwen, Sora — is a descendant of that 1943 idea.
Mechanism
A neuron computes a weighted sum of its inputs, passes the sum through a nonlinearity (sigmoid, ReLU, GELU), and emits a number. Stack neurons in layers. Multiply the input vector by a weight matrix at each layer. Train the weights by backpropagation: compute the loss, propagate the gradient back through the chain rule, and update every parameter by a tiny step toward lower loss. Done at scale, with enough data and compute, this single idea has produced everything from face recognition to ChatGPT. The brain almost certainly does not work this way — but the abstraction works.
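The whole loop — weighted sums, nonlinearity, backpropagation — fits in a page. A minimal NumPy sketch (layer sizes, learning rate, and iteration count are arbitrary choices for this toy) that learns XOR, the very function single-layer perceptrons cannot represent:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)    # input -> hidden weights
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)    # hidden -> output weights
sig = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 1.0

for _ in range(10000):
    # forward pass: weighted sum, nonlinearity, layer by layer
    h = sig(X @ W1 + b1)
    out = sig(h @ W2 + b2)
    # backward pass: chain rule pushes the loss gradient through each layer
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

preds = (out > 0.5).astype(int)    # the network has learned XOR
```

Scale the same forward pass and gradient update by nine orders of magnitude and you have the training loop of a frontier model.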
Applications
Every machine-learning product on Earth. Cancer detection on radiographs, protein folding, autonomous driving, voice cloning, recommendation, fraud detection, ad ranking, real-time translation, drug discovery, weather forecasting (GraphCast outperforms physics-based numerical weather prediction), and — most consequentially — large language models.
Future
Spiking neural networks, neuromorphic chips (Loihi 2, IBM NorthPole), continuous-time models, and biologically plausible learning rules (predictive coding, forward-forward) are pushing artificial brains closer to organic ones. A separate frontier: brain-computer interfaces (Neuralink, Synchron) are now using neural networks to decode and re-encode the very organ that inspired them.
Gold edges = positive weights, indigo = negative. Edge thickness ∝ activation × weight. The same forward pass scaled 10⁹× gives you GPT-5.
Learning from History
John Locke's blank slate, written on by the entire internet.
- Domain: Mind · Empiricism
- Mimics: Learning from History
- Method: Pre-training · Fine-tuning · Distillation · CoT
- Status: Mature
History
Empiricism is the philosophical claim that all knowledge comes from experience. John Locke called the newborn mind a tabula rasa. David Hume argued that even causation is a habit inferred from repeated observation. For three centuries this view sat opposite rationalism. In 2017 Vaswani et al. published Attention Is All You Need, and within five years the empiricist programme had a complete computational instantiation: train a transformer on enough text and it absorbs grammar, world knowledge, common sense, theory of mind, and the rudiments of reasoning. GPT-3 (2020) showed that the absorption scaled with parameters; GPT-4 (2023) showed it crossed thresholds; Claude, Gemini, Llama, Qwen, DeepSeek, and Grok continue the lineage. Every modern LLM is a Lockean child raised on the internet.
Mechanism
Pre-training: predict the next token across a trillion-token corpus. The loss surface is shaped by every word humans have written. Fine-tuning: nudge the model on a smaller, curated dataset (instructions, code, dialogue). Distillation: train a smaller student to mimic a larger teacher's outputs, transferring capability into a tighter package. Chain-of-Thought (CoT): prompt the model to write its reasoning step by step, and watch performance jump on math, logic, and multi-step tasks. The four techniques compose — pretrain, fine-tune, distill, elicit CoT at inference — and every frontier model goes through some version of this assembly line.
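Next-token prediction itself is simple enough to demonstrate with counts. The toy below is a bigram model — a one-token-context caricature of pre-training (the corpus and names are invented for illustration). Frontier models differ in context length, architecture, and scale, not in the objective:

```python
from collections import Counter, defaultdict

corpus = ("the cat sat on the mat . the dog sat on the rug . "
          "the cat saw the dog .").split()

# "Pre-training": tally, for every token, what tends to follow it.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict(prev):
    """Most likely next token given one token of context -- the same
    objective an LLM optimises with thousands of tokens of context."""
    return counts[prev].most_common(1)[0][0]
```

`predict("sat")` returns `"on"`, because that is the only continuation the corpus ever exhibited — knowledge absorbed purely from experience, the Lockean claim in fourteen lines.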
Applications
Everything LLM-shaped: coding assistants (Cursor, Claude Code, Copilot), customer support, search (Perplexity, Google AIO), writing, summarisation, translation, education, legal review, medical scribes, robotic policies (RT-2, π0), and the agentic systems that orchestrate them. The chat-completion API has become the new system call.
Future
The empiricist path is running into a wall: the internet only contains so much text, and the easy gains have been collected. The frontier is now synthetic data (models teaching models), continual learning (Atlas, Hierarchical Reasoning), multimodal pretraining (vision, audio, video, action), and curricula that resemble how children learn rather than how scrapers do.
Learning through Reward and Punishment
Edward Thorndike's law of effect, scaled to superhuman game-play and reasoning.
- Domain: Mind · Empiricism
- Mimics: Learning through Reward and Punishment
- Method: Reinforcement Learning
- Status: Active research
History
Edward Thorndike's puzzle boxes in 1898 demonstrated that cats learn through trial and reward — the law of effect. Skinner industrialised this into operant conditioning. In computer science, Richard Sutton and Andrew Barto formalised the agent-environment loop in the 1980s. TD-Gammon (1992) reached world-class backgammon by self-play. DeepMind's DQN beat Atari games in 2013; AlphaGo beat Lee Sedol in 2016; AlphaZero learned chess, shogi, and Go from scratch in 2017. Then in 2022 OpenAI applied RL not to games but to language: RLHF turned a raw GPT into ChatGPT. Since 2024, RL has been applied to LLM reasoning itself — o1, DeepSeek-R1, Grok-4 Reasoner, Claude's thinking mode — and reasoning has become a property you train with reward, not data.
Mechanism
An agent observes a state, picks an action, the environment returns a reward and the next state. The agent learns a policy — a mapping from state to action — that maximises cumulative future reward. The mathematics is dynamic programming applied to stochastic processes; the practical engineering is enormous. Modern variants include policy gradients (PPO, GRPO), actor-critic (SAC, A3C), model-based methods that learn a simulator of the world (Dreamer, MuZero), and inference-time reinforcement (the model self-explores at test time, as in o3 and the thinking models of 2025-26).
Applications
Robotics (every modern manipulation policy is RL-tuned), recommender systems, ad auctions, chip-floorplanning (Google's TPU layout was RL-designed), nuclear fusion plasma control (DeepMind controlling tokamaks), datacentre cooling, drug discovery, autonomous driving, and — increasingly — RLHF and RLVR on top of language models for safety and reasoning.
Future
Sutton's bet — that RL is the only paradigm scalable to AGI because it does not depend on a finite human-generated dataset — is being publicly tested. Self-play between LLMs (debate, adversarial verification), embodied multi-agent worlds (Genie 3 simulating universes for agents to grow up in), and reward models that themselves learn — these are the next decade's frontier.
Arm D pays best (66 %). A good RL algorithm finds it without being told. ε-greedy explores randomly; UCB explores by uncertainty; Thompson samples from each arm's posterior. They all converge — at different speeds.
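A demo like the one above reduces to a few lines of ε-greedy; UCB and Thompson sampling differ only in how the arm is chosen. The payout probabilities and constants here are invented for illustration:

```python
import random

random.seed(0)
# Four arms with hidden payout probabilities; arm D (index 3) pays best.
probs = [0.25, 0.40, 0.55, 0.66]
counts = [0] * 4
values = [0.0] * 4                  # running estimate of each arm's payout
eps = 0.1

for step in range(5000):
    if random.random() < eps:       # explore: pull a random arm
        arm = random.randrange(4)
    else:                           # exploit: pull the best estimate so far
        arm = max(range(4), key=lambda a: values[a])
    reward = 1.0 if random.random() < probs[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]   # incremental mean

best_arm = max(range(4), key=lambda a: values[a])   # the agent's verdict
```

No one tells the agent which arm is best; the reward signal alone sculpts the value estimates until exploitation locks onto arm D — the law of effect in miniature.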
Symbolic Reasoning
From a handful of axioms, infinitely many sentences. From a handful of rules, the whole of mathematics.
- Domain: Mind · Rationalism
- Mimics: Symbolic Reasoning
- Method: Logic Programming · Theorem Proving · Chain-of-Thought
- Status: Active research
History
Plato held that knowledge is innate, recollected rather than learned. Descartes located certainty in pure thought. Chomsky argued that a finite set of rules generates an infinity of sentences. At the 1956 Dartmouth workshop, AI was born under the rationalist banner: Allen Newell and Herbert Simon's Logic Theorist proved theorems from Principia Mathematica; Prolog (1972) embodied the idea that programs *are* logic. The first AI winter ended the dream — symbolic systems were brittle, knowledge engineering was endless. Yet symbolic reasoning never died. Today it is having a renaissance: theorem provers (Lean, Coq, Isabelle) are coupled to LLMs (DeepMind's AlphaProof, AlphaGeometry, Harmonic's Aristotle), and chain-of-thought is recognised as a kind of soft symbolic search.
Mechanism
Define a domain in terms of symbols (variables, constants, relations) and inference rules (modus ponens, resolution, unification). Given a goal, search for a sequence of rule applications that derives it. The search is combinatorial and brutal, but the answers — once found — are certain. Modern neuro-symbolic systems delegate the search to a neural network and the verification to a deterministic checker, getting the best of both worlds. Lean 4 and Mathlib have become an unexpected fulcrum: AlphaProof won a silver medal at IMO 2024 by emitting Lean proofs; a year later DeepMind reached gold.
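The rule-application core can be sketched as forward chaining with modus ponens over ground facts. Real provers add unification, resolution, and heuristic search on top; the predicate names below are invented for illustration:

```python
def forward_chain(facts, rules):
    """Apply modus ponens until no new facts can be derived.
    rules: list of (premises, conclusion) pairs over ground atoms."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= facts and conclusion not in facts:
                facts.add(conclusion)      # a new, certain derivation
                changed = True
    return facts

rules = [
    (["human(socrates)"], "mortal(socrates)"),
    (["mortal(socrates)", "philosopher(socrates)"], "finite_wisdom(socrates)"),
]
derived = forward_chain(["human(socrates)", "philosopher(socrates)"], rules)
```

Every fact in `derived` is certain given the axioms — the property that makes symbolic systems the verifier of choice in neuro-symbolic hybrids, where a neural proposer is allowed to be wrong and the checker is not.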
Applications
Formal verification of critical software and hardware (CompCert, seL4 microkernel, Intel chip verification), mathematics (Lean's Mathlib library now contains well over a hundred thousand theorems across more than a million lines of proof), legal reasoning, smart contracts, scientific theorem discovery, and program synthesis. Every safety-critical system in the world relies, somewhere, on a symbolic prover.
Future
Neuro-symbolic synthesis is the most likely path to mathematical AGI: an LLM proposes, a prover disposes. AlphaProof-style hybrids, when scaled, may write entirely new branches of mathematics. The harder dream — symbolic reasoning over the messy world, not just over formal axioms — remains open.
Analogical Reasoning
Everything thinks by similarity. The trick is computing it in a high-dimensional vector space.
- Domain: Mind · Analogy
- Mimics: Analogical Reasoning
- Method: Transformer · Attention
- Status: Mature
History
Aristotle wrote that the soul thinks in images and analogies. Douglas Hofstadter spent fifty years arguing that analogy is the core of cognition. Pedro Domingos called the analogizer school one of the five tribes of machine learning, with k-nearest-neighbours as its mascot. In 2017 the analogizer school accidentally won the entire field: Vaswani et al.'s Transformer paper replaced recurrence with pure attention — and attention is, mathematically, a soft k-nearest-neighbours over learned embeddings. Every modern frontier model is, in essence, a tower of analogy machines: for each token, find every other token that resembles me in some learned subspace, and update myself by their weighted average.
Mechanism
Each token in a sequence is projected into three vectors: query, key, value. The attention weight from token i to token j is softmax(Q_i · K_j / √d). Each token's new representation is the weighted sum of the values, weighted by the similarities. Stack this in parallel heads, then in series, and you have a transformer. The geometry: thinking is similarity-shaped, and reasoning is iterated similarity-finding. With enough layers and parameters, this mechanism can read, write, code, prove, plan, draw, see, hear, speak, and (debatably) understand.
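The similarity arithmetic is compact enough to write out in full. A minimal single-head sketch in NumPy (the dimensions and random inputs are illustrative; real transformers learn the Q/K/V projections and stack many heads and layers):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: a soft nearest-neighbour lookup.
    Each row of Q asks 'who resembles me?'; rows of V supply the answer."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # similarity of every pair
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ V, weights                      # weighted average + map

rng = np.random.default_rng(0)
Q = rng.normal(size=(6, 8))     # 6 tokens, 8-dimensional embeddings
K = rng.normal(size=(6, 8))
V = rng.normal(size=(6, 8))
out, w = attention(Q, K, V)
```

Each output row is a convex combination of the value rows: the token has literally updated itself by the weighted average of whatever resembles it, which is why attention is fairly described as a soft k-nearest-neighbours.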
Applications
GPT-5, Claude Opus 4.7, Gemini 3.5, Grok-4, DeepSeek V3.5, Qwen 3.5, Llama 4 — every frontier LLM. Vision transformers (ViT) ate the convolutional throne. AlphaFold's structure module is a transformer over residue pairs. Decision Transformer applies the architecture to RL. The transformer has become the universal Turing machine of deep learning.
Future
Linear attention (Mamba, RWKV), state-space models, Hyena, sliding-window attention, and ring attention are eating into the quadratic cost. A coming inflection: when context windows reach a trillion tokens, the attention mechanism becomes a kind of associative memory at planetary scale. Whether transformers remain dominant, or are overthrown by their own children, is the central architectural question of the 2030s.
| The | cat | sat | on | the | mat | because | it | was | tired | |
|---|---|---|---|---|---|---|---|---|---|---|
| The | 0.50 | 0.50 | ||||||||
| cat | 0.50 | 0.50 | ||||||||
| sat | 0.50 | 0.50 | ||||||||
| on | 0.98 | |||||||||
| the | 0.50 | 0.50 | ||||||||
| mat | 0.98 | |||||||||
| because | 0.98 | |||||||||
| it | 0.50 | 0.50 | ||||||||
| was | 0.50 | 0.50 | ||||||||
| tired | 0.98 |
Each row shows where one token "looks" for context. The "coreference" head wires 'it' back to 'cat' — the soul of analogical reasoning in modern LLMs.
Probabilistic Belief Update
Beliefs are probability distributions. Evidence reshapes them. Rationality is the arithmetic of revision.
- Domain: Mind · Bayesianism
- Mimics: Probabilistic Belief Update
- Method: Bayesian Inference · Probabilistic Programming
- Status: Active research
History
Thomas Bayes died in 1761; his theorem was published posthumously in 1763. For two centuries it was a curiosity, then a workhorse of statistics, then — in the 1990s and 2000s — a movement. Judea Pearl built probabilistic graphical models and earned a Turing Award. Pedro Domingos called the Bayesians one of the five tribes of machine learning. The empirical surge of deep learning has not displaced Bayesian thinking; it has absorbed it. Modern systems quietly use Bayesian ideas everywhere: dropout can be read as approximate variational inference, latent-variable and score-based generative models are Bayesian at heart, and the entire field of probabilistic programming (Pyro, NumPyro, Stan, Gen) carries the flag explicitly.
Mechanism
P(hypothesis | evidence) = P(evidence | hypothesis) · P(hypothesis) / P(evidence). Prior beliefs are multiplied by the likelihood of new data and renormalised. Repeat for every new datum. The procedure is provably optimal under the axioms of probability theory. The computational cost is the catch: exact Bayesian inference is intractable for most interesting models. Approximations — variational inference, MCMC (Hamiltonian Monte Carlo, NUTS), normalising flows, score matching — have made Bayesian thinking tractable at scale.
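For a coin with unknown bias, the whole procedure collapses to two counters — the Beta-Bernoulli conjugate pair, where Bayes' rule reduces to incrementing the prior's parameters. A minimal sketch (the flip sequence is invented for illustration):

```python
# Prior: Beta(1, 1), i.e. uniform belief over the coin's heads-probability.
a, b = 1.0, 1.0
flips = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]   # observed data: 1 = heads, 0 = tails

for f in flips:
    # Bayes' rule with a Bernoulli likelihood is exactly a count update:
    # posterior = Beta(a + total heads, b + total tails).
    a += f
    b += 1 - f

posterior_mean = a / (a + b)                           # 9 / 12 = 0.75
posterior_var = a * b / ((a + b) ** 2 * (a + b + 1))   # shrinks as data arrives
```

The diffuse prior sharpens with every flip: the posterior mean moves toward the observed frequency, and the posterior variance falls below the prior's — the arithmetic of rational revision in six lines.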
Applications
Drug-trial design, A/B testing at scale (every major tech company), search and recommendation, autonomous-vehicle perception (sensor fusion is Bayesian), medical diagnosis, astronomy (LIGO's gravitational-wave detection is Bayesian inference at exquisite precision), election forecasting, sports analytics, and — quietly — every LLM's token sampler (temperature, top-p, beam search are all forms of probabilistic decoding).
Future
The convergence of LLMs with Bayesian inference is the most under-discussed frontier. An LLM is, in one reading, a giant approximate Bayesian engine over the joint distribution of language. Active inference, the free-energy principle (Karl Friston), and predictive coding all bet that the brain — and therefore AGI — is fundamentally a Bayesian organ.
Belief starts diffuse (the prior). Each coin flip sharpens it. After enough flips, the posterior collapses onto the true bias — the arithmetic of rational revision.
Einsteinian Imagination
What machine could imagine itself riding alongside a beam of light?
- Domain: Mind · The Open Question
- Mimics: Einsteinian Imagination
- Method: (still missing)
- Status: Open question
History
At sixteen, Albert Einstein asked himself what he would see if he could ride alongside a light beam at light speed. The thought experiment produced no equations, no observations, no derivations — it was pure visualised imagination. Ten years later it produced special relativity. Faraday imagined lines of force; Galileo imagined dropping balls from a tower; Schrödinger imagined a cat in a box. The most important leaps in physics are not made by extrapolating data; they are made by inventing the picture in which the data finally makes sense.
Mechanism
We don't know. Imagination of this kind appears to involve simulating counterfactual physics in a mental model that the imaginer can manipulate, observe, and reason about. It is not interpolation from examples. It is not chain-of-thought through the training distribution. It is not a Bayesian update on observed data. Hofstadter calls it strange-loop self-reference; David Deutsch calls it explanatory creativity; Karl Popper called it the bold conjecture. Whatever it is, current AI systems mostly remix rather than invent. The gap is real, and naming it honestly is the first step to closing it.
Applications
If solved: original scientific discovery without human prompting, novel mathematics, new physics, art that is not in any training set, machines that pose their own questions. The economic value is incalculable because it is the value of inventing entire industries from nothing.
Future
Some bet that world-models (Sora, Veo, Genie) plus reinforcement learning at scale will eventually exhibit imagination as an emergent property. Others (Yann LeCun, François Chollet) argue that an entirely new ingredient is required — possibly some form of causal, intervention-based reasoning, or a discrete program-induction layer over the continuous neural substrate. ARC-AGI-3 keeps the goalpost honest.
No working artificial analogue exists. The honest atlas marks the gap rather than papering over it.
Emotion-Guided Cognition
Without feeling, there is no purpose. Without purpose, no thought.
- Domain: Mind · The Open Question
- Mimics: Emotion-Guided Cognition
- Method: (still missing)
- Status: Open question
History
William James in 1884 argued that emotion is the *perception* of bodily changes, not a separate cognitive category. Antonio Damasio's somatic-marker hypothesis, building on patients with damaged ventromedial prefrontal cortex, showed that without emotion, rational decision-making collapses — patients could deliberate forever but never choose. Affect is not the opposite of reason; it is the engine that makes reasoning terminate. Lisa Feldman Barrett's constructed emotion theory pushes further: emotions are predictions the brain makes about its own interoceptive future. AI today has no body, no homeostasis, no interoception, no felt urgency — and therefore, arguably, no genuine motivation.
Mechanism
Reward signals in RL are sometimes called proto-emotion, but a scalar reward is to feeling what a postage stamp is to a postal system. Genuine emotion appears to be the brain's continuous, multidimensional, embodied, predictive model of the organism's own future — a control system whose currency is survival. No current AI architecture has anything like this. What we have are crude proxies: reward signals, loss functions, KL penalties. They steer behaviour but they do not constitute motivation.
Applications
If solved: machines with their own goals, machines that can negotiate with humans as peers rather than as instruments, machines whose well-being deserves moral weight. The application space includes companionship, eldercare, education, therapy, and — most charged — partnership in scientific discovery. Without emotion, AI is a calculator; with it, AI is something we have not yet invented a word for.
Future
Embodied AI (humanoids, surgical robots, autonomous vehicles) will be the first systems forced to grapple with proxy-emotional states — fatigue, urgency, risk-aversion. Whether these proxies cross some phenomenal threshold into genuine feeling is the deepest open question in AI, and possibly in philosophy.
No working artificial analogue exists. The honest atlas marks the gap rather than papering over it.
What AI cannot yet imitate.
Two capacities of the human mind remain without a computational analogue: the bold visualised imagination that produced relativity, quantum mechanics, and natural selection; and the felt emotion that gives any thought a reason to terminate in action. Without imagination, AI cannot truly invent. Without emotion, AI cannot truly want.
The two missing paradigms — bold imagination and felt emotion — are not technical oversights. They are the atlas's two open questions. Without imagination, AI can interpolate but not invent. Without emotion, AI can compute but not care. Closing these gaps is what will distinguish the 21st century's late AI from its early AI.
Three centuries of borrowed intelligence.
1763 · Thomas Bayes's essay on probability is read to the Royal Society, two years after his death.
1859 · Darwin formalises evolution by natural selection — the only known process that produces design without a designer.
1898 · Cats learn by trial and reward. The law of effect is born.
1943 · The first mathematical model of a neuron. Every modern AI descends from this paper.
1956 · AI is named. McCarthy, Minsky, Newell, Simon, Shannon attend.
1958 · The first physical neural network. The NYT prematurely announces thinking machines.
1975 · The geometry that nature has used for four billion years acquires a word.
1975 · Evolution becomes a search procedure on a computer.
1985 · Hinton & Sejnowski apply statistical thermodynamics to learning.
1986 · Rumelhart, Hinton, Williams give deep networks a way to learn.
1997 · Symbolic search beats a world chess champion.
2012 · Deep learning ends the alternatives. The modern era begins.
2014 · Goodfellow's generator-vs-discriminator. The first photoreal fakes.
2015 · Sohl-Dickstein et al. show how to learn to reverse noise.
2016 · Reinforcement learning + tree search defeat a 9-dan Go champion.
2017 · Transformers replace recurrence. Every modern LLM is downstream of this paper.
2020 · Scaling laws hold. Diffusion becomes practical. The generative explosion begins.
2022 · RLHF turns a raw language model into a useful interlocutor. AI enters every home.
2023-24 · Multimodal reasoning at near-human level on broad benchmarks. Biology enters the AI age.
2024-25 · MIT shows recursion can replace depth. DeepMind shows symbolic reasoning at olympiad level.
2024-25 · Reinforcement learning is applied to reasoning itself. Models think before they speak.
Today · Eleven paradigms inventoried. Two open questions named honestly.
The next imitations.
Recursive architectures
Fractal-style generators may displace fixed-depth transformers, addressing arbitrarily high resolution and arbitrarily long context with constant parameter budgets.
Diffusion language models
Mercury, LLaDA, Inception challenge the autoregressive monopoly. Generating an entire response in a single denoising pass may be the next ChatGPT moment.
Open-ended evolution
Populations of agents that keep inventing new niches — POET, OMNI-EPIC, AlphaEvolve — may escape the data ceiling of internet-scale pretraining.
Test-time RL
Models that reinforce-learn on the spot, against their own verifier, may compress years of training into seconds of inference.
Neuro-symbolic synthesis
LLM proposes, theorem prover disposes. Lean coupled to GPT-class models may, within a decade, take serious runs at problems of Millennium Prize difficulty.
World models
Sora, Veo 3, Genie 3 are early. A trillion-parameter physical-world simulator, queryable by agents, is the platform of the late 2020s.
Embodied homeostasis
Humanoids, surgical robots, and autonomous vehicles will be the first systems forced to grapple with felt-state proxies. Whether these proxies cross into genuine emotion is the deepest open question.
Imagination as architecture
If imagination is causal intervention on a learned world model, then the missing ingredient may be a do-operator over latent variables — the synthesis Pearl spent his career arguing for.
Where each paradigm lives in production.
Medicine
- AlphaFold 3 (diffusion + transformer) — universal biomolecular structure
- Radiology diagnosis (CNN + ViT) — at-or-above radiologist accuracy in mammography, chest X-ray
- Drug discovery (RL + Bayesian) — Insilico, EvolutionaryScale, Iambic
Science
- GraphCast (transformer) — outperforms physics weather forecasts
- AlphaProof (RL + symbolic) — IMO silver 2024, gold-track 2025
- FunSearch (LLM + evolution) — cap-set constructions larger than any previously known
Code
- Claude Code, Cursor, Copilot, Windsurf — empiricist pretraining + RL
- AlphaCode 2 — competitive programming above 85th percentile
Creative
- Midjourney v7, Flux 1.1, DALL·E 3 — diffusion
- Sora 2, Veo 3, Kling 2.0 — diffusion + transformer over video
- Suno, Udio, ElevenLabs — audio diffusion + flow matching
Robotics
- Tesla Optimus, Figure 02, Unitree G1 — pretraining + RL + world models
- Waymo, Tesla FSD, Pony — Bayesian sensor fusion + neural perception
Industry
- Datacentre cooling (RL) — Google saved 40% on cooling energy
- Chip floorplanning (RL) — Google's TPU v5 was RL-laid-out
- Tokamak plasma control (RL) — DeepMind controls fusion reactors
The map is not the territory.
But for a generation building artificial minds, a map is the difference between an exploration and a wander. This is yours. Use it. Share it. Add to it.