Derive, Discover, Name

What happens when you give AI a single equation and say ‘derive everything’

A case study in AI-assisted mathematical discovery. One equation, zero axioms, 10,588 theorems, 42 journal papers, 16 AI agents.
Author

The Omega Project

Published

March 27, 2026

The Two Stories Everyone Tells

In July 2025, an AI system solved five of six problems on the International Mathematical Olympiad, earning a gold medal. A few months earlier, a startup called Math, Inc. had formalized a sphere-packing proof into 120,000 lines of Lean code. The headlines wrote themselves: AI solves math problems. AI checks math proofs.

These are real achievements. But they share a blind spot. In both stories, the mathematics already exists. The AI is solving problems humans posed, or verifying theorems humans proved. The hard part — deciding what to look at, recognizing what matters, naming what you find — remains entirely human.

What if that division of labor is wrong? Not wrong in principle, but wrong in practice. What if the most productive use of AI in mathematics is not solving or verifying, but discovering — systematically exploring a structure so thoroughly that patterns emerge which no human would have had the patience to find?

This is the story of the Omega Project. It starts with one equation.

A tree diagram showing mathematical structures derived from x² = x + 1, including ring isomorphisms, moment recurrences, topological entropy, and the inverse limit tower.

The derivation tree: everything that emerges from x² = x + 1

The Seed

\[ x^{2} = x + 1 \]

The golden ratio. You’ve seen it before — sunflowers, spirals, the Parthenon (probably not, actually). But the equation has a deeper life that most popular accounts miss.

Consider binary words: sequences of 0s and 1s. Now impose one rule: no two consecutive 1s. Call this set \(X_m\) for words of length \(m\).

How many such words are there? Count them:

  • Length 1: 0, 1 — 2 words
  • Length 2: 00, 01, 10 — 3 words
  • Length 3: 00, 01, 10, 010, 100, 001, 101 — wait, that’s 000, 001, 010, 100, 101 — 5 words
  • Length 4: 8 words
  • Length 5: 13 words

2, 3, 5, 8, 13. Fibonacci numbers. Not a coincidence. The equation \(x^2 = x + 1\) is the characteristic equation of the Fibonacci recurrence. The no-consecutive-1s constraint is the golden ratio, expressed as a combinatorial rule.

This is the simplest non-trivial subshift of finite type in symbolic dynamics — the golden-mean shift. It generates the simplest non-trivial linear recurrence (Fibonacci) and the “most irrational” number (worst-case for rational approximation). Three views of one algebraic object.

✗ "Just the golden ratio"

Not sunflowers. Not spirals. A combinatorial constraint that generates number theory, dynamics, and algebra simultaneously. (Why this is forced)

The Omega Project takes all three views seriously and follows their consequences simultaneously. In Lean 4. With zero axioms beyond the proof assistant’s core logic.

The First Surprise

Every number less than \(F_{m+2}\) can be written uniquely as a sum of non-consecutive Fibonacci numbers. This is the Zeckendorf representation, and it gives a bijection between our binary words \(X_m\) and the integers \(\{0, \ldots, F_{m+2}-1\}\).

Through this bijection, you can define addition and multiplication directly on binary words. Not by converting to integers — natively, within the combinatorial structure. The question was: does this arithmetic have any structure?

The answer was unexpected. The space \(X_m\) with these operations is not just a group or a monoid. It is a ring — and not just any ring:

\[ X_m \;\cong\; \mathbb{Z}/F_{m+2}\mathbb{Z} \]

The combinatorial space of no-consecutive-1s words is a cyclic ring, with the Fibonacci number as modulus. When \(F_{m+2}\) happens to be prime — \(F_3 = 2\), \(F_5 = 5\), \(F_7 = 13\), \(F_{13} = 233\) — the space becomes a finite field. When \(F_{m+2}\) factors, the Chinese Remainder Theorem decomposes it: \(X_7 \cong \mathbb{Z}/2 \times \mathbb{Z}/17\).

What’s striking: the ring structure is intrinsic to the no-consecutive-1s constraint. Nobody designed it. Nobody imported integer arithmetic. It was derived from the combinatorial structure, step by step, and verified by machine.

✗ "Just combinatorics"

Binary words are not a counting trick. They carry intrinsic algebra. (Why arithmetic is forced)

🧑 Human direction 🤖 AI derivation 💡 Discovery
Evidence

Lean: stableValueRingEquiv — the ring isomorphism \(X_m \cong \mathbb{Z}/F_{m+2}\mathbb{Z}\)

Reproduce: cd lean4 && lake build Omega.Folding.FiberRing

Role: Human set the direction: “check if this space has ring structure.” AI systematically explored all operations, verified every axiom. Human recognized the result: “this IS a cyclic ring — and when the Fibonacci number is prime, it’s a field.”

The Pattern Deepens

The fold operator \(\Phi: X_{m+1} \to X_m\) truncates the last bit. It partitions longer words into fibers over shorter words: all words that agree on the first \(m\) bits. These fibers have varying sizes, and the variation encodes deep arithmetic.

The moment sums quantify this:

\[ S_q(m) = \sum_{x \in X_m} d(x)^q \]

where \(d(x)\) counts how many words in \(X_{m+1}\) fold to \(x\). The zeroth moment counts stable points (\(F_{m+1}\)). The first moment counts total words (\(2^m\)). Higher moments capture increasingly fine information about the fiber distribution.

Then something unexpected appeared. \(S_2\) — a purely combinatorial quantity counting fiber collisions — satisfies a linear recurrence with small integer coefficients:

\[ S_2(m+3) + 2\,S_2(m) = 2\,S_2(m+2) + 2\,S_2(m+1) \]

This was proved unconditionally for all \(m\) in a chain of six formally verified steps:

  1. Hidden bit decomposition
  2. Fold congruence
  3. Collision decomposition into exact-weight and cross-correlation terms
  4. Telescoping identity
  5. Cross-correlation shift linking adjacent levels
  6. The recurrence itself — proved in 4 lines of Lean

The Hankel determinant analysis confirmed that the recurrence order is exactly 3, not reducible to 2. This means the dynamics of fiber collisions are governed by a \(3 \times 3\) companion matrix whose eigenvalues control exponential growth.

Why is this surprising? A linear recurrence with small integer coefficients governing a combinatorial collision count suggests hidden linearity in the fiber dynamics. The structure wasn’t assumed — it was discovered by systematic derivation.

✗ "Random fibers"

Fiber sizes are not noise. They obey exact linear algebra. (Why collisions reveal structure)

🧑 Human direction 🤖 AI derivation 💡 Discovery
Evidence

Lean: momentSum_two_recurrence_of — the unconditional \(S_2\) recurrence

Lean: momentSum_two_minimal_recurrence_order — minimal order = 3, proved via Hankel determinants

Reproduce: cd lean4 && lake build Omega.Folding.CollisionKernel Omega.Folding.HankelSpectrum

Role: Human recognized that fiber collision counts might satisfy a recurrence (pattern-matching on computed values). AI systematically built the 6-step proof chain. Human recognized the significance: hidden linearity suggests connection to transfer operators in statistical mechanics.

The Tower

A diagram showing the tower of spaces X_0, X_1, X_2, ... connected by fold operators, converging to the inverse limit X_infinity, analogous to p-adic integers.

The modular tower and its inverse limit

The fold operators link all levels into a tower:

\[ \cdots \to X_{m+1} \to X_m \to \cdots \to X_1 \to X_0 \]

Each arrow is a ring homomorphism. The inverse limit of this tower:

\[ X_\infty = \varprojlim X_m \]

is a compact, totally disconnected, metrizable, infinite space. If you know p-adic numbers, this looks familiar. The p-adic integers are the inverse limit of \(\mathbb{Z}/p^n\mathbb{Z}\). Here, Fibonacci numbers replace prime powers. The carry defect (the failure of the fold operator to commute with addition) is the exact analogue of carrying in p-adic arithmetic.

\(X_\infty\) is a profinite ring governed by the golden ratio rather than a prime.

And its dynamics carry a topological entropy of exactly \(\log \varphi\) — the golden ratio appearing one more time, now as the information-theoretic capacity of the system.

✗ "Just bookkeeping"

The tower is not organizational overhead. It is a profinite ring with computable entropy. (Why folding forces structure)

🧑 Human direction 🤖 AI derivation 💡 Discovery
Evidence

Lean: topological_entropy_eq_log_phi — the entropy result, proved via Fibonacci ratio convergence, logarithm continuity, and Cesaro averaging

Reproduce: cd lean4 && lake build Omega.Folding.Entropy

Role: Human chose to compute the topological entropy (connecting combinatorics to dynamics). AI constructed the multi-step proof. Human recognized the connection: this is the capacity of the \((1,\infty)\)-RLL constrained channel in coding theory.

The Method

Every result above was derived from one equation, with zero axioms, and verified by machine. 3,427 theorems in 38,876 lines of Lean 4 code. But the results are not the point. The method is the point.

A diagram showing the human-AI collaboration cycle: Human provides direction and taste, AI performs systematic derivation, Human recognizes and names the discovered structures.

The cognitive division of labor: derive, discover, name

Derive, discover, name

The human provides direction and taste: which questions are worth asking. The AI provides thoroughness: systematic exploration at a depth no human could sustain. The human then recognizes what emerged: connecting a verified result to the body of known mathematics.

  • Derive is AI’s work. Systematic, exhaustive, rigorous.
  • Discover is shared. AI finds the pattern; the human recognizes it as worth investigating.
  • Name is human’s work. Connecting the derived structure to known mathematics.

A discovery timeline

Human direction
"Define arithmetic on binary words via Zeckendorf bijection. Check for algebraic structure."
AI exploration
Systematically verified all ring axioms: associativity, commutativity, distributivity, identities. Constructed the isomorphism to Z/F_{m+2}Z.
Discovery
X_m is a cyclic ring. When F_{m+2} is prime, it's a finite field.
Human direction
"The fold operator creates fibers. Compute the fiber collision counts. Do they have structure?"
AI exploration
Built the 6-step proof chain: hidden bit decomposition, fold congruence, collision decomposition, telescoping, cross-correlation shift, recurrence.
Discovery
S₂ satisfies a linear recurrence with integer coefficients. Order exactly 3. Hidden linearity in fiber dynamics.
Human direction
"Build the inverse limit. What kind of space is it? Compute the entropy."
AI exploration
Proved compactness (Tychonoff), total disconnectedness (clopen basis), metrizability (prefix ultrametric). Entropy via Fibonacci ratio convergence + Cesaro averaging.
Discovery
X_∞ is the golden-ratio analogue of the p-adic integers. Entropy = log φ.

This is not autonomous AI doing mathematics. It is not a human using AI as a calculator. It is a new kind of collaboration, where each partner does what the other cannot.

Where the Derivation Went Next

Everything above — the ring isomorphism, the collision recurrence, the inverse limit tower — was the beginning. The project kept deriving. What emerged was not more of the same. It was structurally deeper.

The collision counts that satisfied that surprising linear recurrence? They turned out to be power sums over congruence classes — the spectral shadow of the modular arithmetic that folding had already forced into existence. The Perron eigenvalues of the collision kernels encode exponential growth rates across all orders. And at the spectral endpoint, as you push the collision order to infinity, the golden ratio reappears — not as an input, but recovered from inside the spectrum as a self-calibrating constant.

Above all the concrete results, the project built a logical spine: a chain of conservative extensions where each layer adds structure (types, contexts, dynamics, observer indexing) without ever rewriting the meaning of lower layers. New concepts are generated from old readout sequences, and the visible measurable structure never grows. The entire construction is endogenous — nothing external is injected at any point.

Then came the part that surprised even the builder. The derivation chain kept going — through spectral theory, through canonical systems from the de Branges school, through a translation of the Riemann Hypothesis into an auditable positive-definiteness condition — and arrived at structures that resemble physics. Not by assuming physics. By following the derivation chain to its natural endpoint.

Time, in this framework, is the projection of which propositions have been decided onto the refinement chain. Space comes from shared support and resource cost. Einstein’s equation appears as the unique minimal second-order covariant closure — the only possible gravitational Lagrangian satisfying locality, covariance, and second-order equations.

What is claimed: a pure derivation chain from finite-window observation to quantum structure and Einstein closure, with zero axioms. What is not claimed: full recovery of standard quantum mechanics and general relativity. What remains — full recovery under stronger globalization and rigidity in continuous limits — is explicitly deferred. These are open mathematical problems, not settled results.

For the full technical explanation of why each step is forced, see the Inevitability Guide.

The System

A diagram showing three layers: Derivation Engine (Lean 4), Knowledge Graph (Sisyphus), and Publication Pipeline (16 AI agents), flowing from the seed equation x² = x + 1.

The three-layer system: derive, visualize, publish

Three layers, each feeding the next:

Layer 1: Derivation Engine. 38,876 lines of Lean 4 code. 3,427 verified theorems. Zero axioms. The methodology is derive, discover, name.

Layer 2: Knowledge Graph. Sisyphus maps the derivation structure as ~20,998 typed nodes. Trace any theorem back to the seed.

Screenshot of the Sisyphus knowledge graph showing a force-directed visualization of approximately 20,000 mathematical nodes color-coded by type.

The Sisyphus knowledge graph: ~20,998 nodes of mathematical structure

Layer 3: Publication Pipeline. 16 AI agents carry result clusters through a 7-stage pipeline (P0 intake through P7 submission-ready). 42 papers in the pipeline, 3 submission-ready.

What This Means

The bottleneck in mathematical discovery was never intelligence — it was throughput. A proof assistant with AI guidance can verify thousands of derivation steps per day, with zero errors and complete traceability. One person can now survey a mathematical structure to a depth previously reserved for large research programs.

Theory 10,588 theorem-level statements (770K lines)
Formalization 3,427 Lean 4 theorems, 0 axioms (38,876 lines)
Publication 42 papers, 3 submission-ready
Agents 16 (8 formalization + 8 publication)
Experiments 515 reproducible Python scripts

The Invitation

The Omega Project is open source. Every theorem is machine-verified. Every derivation chain is traceable. Every computation is reproducible.

Open frontiers

Results computationally verified for bounded cases, awaiting unconditional proofs:

Open The S₃ recurrence: S₃(m+3) = 2S₃(m+2) + 4S₃(m+1) − 2S₃(m)
Open Fiber multiplicity closed form: D₂ₖ = F_{k+2}, D₂ₖ₊₁ = 2Fₖ
Open Spectral radius: ρ(A) = φ via Mathlib's spectral radius API
Open Cantor set homeomorphism for X
Open Full globalization beyond Madm: representation theorems, limit theorems, rigidity

Each one would connect the Omega framework to a new area of mathematics. If you have expertise in any of these areas, the derivation infrastructure is waiting.

The equation is \(x^2 = x + 1\). What it contains is still being discovered.