01

The Core Inversion

Classical numerics treats numbers as values — static objects stored in registers, manipulated by arithmetic circuits, approximated by floating-point hardware. This essay proposes a different substrate: every number is a deterministic, forkable virtual machine that emits an infinite digit stream on demand.

This is not a metaphor. It is a literal computational ontology with precise consequences for implementation, memory complexity, and the structure of mathematics itself. The framing has deep roots — Cauchy sequences, computable reals (Turing 1936), coalgebraic stream arithmetic, p-adic expansions, exact real arithmetic systems like iRRAM, and the Type-2 Theory of Effectivity all instantiate parts of this idea. What is new here is not the underlying mathematics but the unified protocol, the codec/base separation, and the explicit ABI that makes all of it composable at native speed.

The primitive is simple. A number is any deterministic program that implements this interface. Everything else — vectors, matrices, tensors, p-adics, transcendentals, metrics, arithmetic — is a combinator over this primitive.

Core Primitive Interface
step : State → (digit, State)
02

The Binary Multiplexer as Foundation

A vector is a nesting of binary multiplexer encoders. A matrix or tensor is a nesting of vector construction operators. This is not an analogy to existing data structures — it is the correct ontological description of what these objects are.

The Core Combinator
MUX(selector, left, right)

Where selector is itself a generator, and left and right are generators recursively.

Unbounded Precision

Digits are computed on demand; you never run out of bits.

p-Adic Structure

Digits are naturally hierarchical and base-parametrized.

Lazy Evaluation

Only expand the tree as far as the consumer requires.

Structural Sharing

Subtrees are reused, giving canonicalization for free.

Vectors & Tensors

Nested MUX trees over number generators, uniformly.

Arithmetic becomes generator composition. Addition is a carry-propagating combinator over two digit generators. Multiplication is a convolution combinator. Division, negation, and subtraction follow the same pattern — pure generator composition.

03

The Carry Problem and Signed-Digit Representation

Standard positional arithmetic has a notorious flaw as a stream operation: addition is not local. The first digit of 0.4999... + 0.5000... cannot be determined from any finite prefix — a carry from arbitrarily deep in the expansion could propagate up and flip every digit above it.

Any naive digit-by-digit addition combinator will fail to terminate on inputs near a digit boundary. This is not an edge case; it is a structural property of the standard representation.

  • Addition is digit-local — no unbounded carry propagation
  • Each number has multiple valid encodings (the redundancy)
  • The MUX/codec layer projects to standard non-redundant form only when emitting an externally visible digit
04

Unbounded Precision as a Native Property

Because digits are computed on demand and the tree is lazily evaluated, unbounded precision is not a feature to be added — it is the default behavior of the substrate. Structural sharing keeps memory bounded to what is actually needed.

You compute exactly as many digits as the consumer requests, and no more. This is the same mechanism used by lazy streams in functional languages and exact real arithmetic libraries (iRRAM, Haskell's Data.Number.CReal, MPFR with rigorous error tracking), but grounded in a more explicit state-machine semantics that LLVM can compile directly to native code, and unified under a single ABI rather than scattered across domain-specific libraries.

05

p-Adics Come for Free

The p-adic numbers are particularly natural in this model, and notably easier than the reals in a precise technical sense: digit commitment is local in p-adics, where it is non-local in the reals.

A p-adic number is a digit stream indexed from the least significant digit upward, with digits in {0, 1, ..., p-1}. In the generator model, this is simply a MUX tree with a p-way multiplexer instead of a binary one.

This has an immediate consequence: the p-adic metric is also a periodic state machine. The p-adic distance between two numbers x and y is p^{-v_p(x-y)}, where v_p(x-y) is the index of the first digit where they differ. If both are periodic generators, then the metric computation is a product automaton over their state spaces — itself a finite, periodic machine.

AutomatonOverGenerators

A finite state machine whose transition function reads one or more digit generators and produces either a new generator or a scalar (valuation, distance, boolean). p-adics and their metric are the cleanest, most symmetric instance of this pattern.

06

The Odd Primitives: Turning Math into Compute

Supporting most of mathematics requires some nonstandard primitives — operators that compile mathematical definitions into digit streams with correctness guarantees.

Primitive 1

Series Stepper

series_step(state) → (term, new_state)

A finite state machine that emits the next term of an infinite series. This is the computational form of a mathematical definition — a convergent series becomes an executable state machine.

Primitive 2

Tail Bound Oracle

tail_bound(state) → E

Given the current series state at index n, returns an upper bound E such that the remaining tail satisfies |∑_{k>n} a_k| ≤ E. This is the bridge from convergence proofs to executable digit safety.

Primitive 3

Interval Refinement Engine (for reals)

refine_interval(L, U, term, E) → (L', U')
emit_digit_if_safe(L, U, base) → maybe digit

Maintains a shrinking interval [L, U] containing the true value. A digit is emitted when all numbers in the current interval share the same prefix of sufficient length. This is the real-number version of digit extraction — an online, monotone refinement process.

Primitive 4

Valuation Extractor (for p-adics)

valuation(a_n) → v_p(a_n)

For p-adics, locality is algebraic rather than analytic. A term with p-adic valuation v can only affect digits at positions ≤ v. No tail bound oracle required, no interval refinement, no boundary pathologies.

Primitive 5

Carry Propagator

carry(k) → {0, 1, ...}

Addition, subtraction, multiplication, and division all reduce to digitwise combinators plus a carry propagation rule. Under signed-digit representation, this combinator is local. Under standard representation, it is not.

Primitive 6

Composition Engine

compose(f, g) → h

All mathematical operations reduce to composing digit generators. This is the function algebra of the numerics library — the mechanism by which complex mathematical objects are built from simpler ones.

Primitive 7

Memoization and Structural Sharing

memo(generator) → generator'

Repeated digit queries must not recompute. Shared subtrees must remain shared. Memoization is exposed as an explicit wrapper, never an implicit global cache, because implicit caching breaks the value-semantics story for forking.

.streaming()

O(1) space, sequential access only, no cache

.cached(max_digits=N)

Bounded LRU cache, random access within N digits

07

The Forkable Nano-VM: A Two-Tier ABI

The full picture of a number in this model is a forkable, coinductive, deterministic nano-VM.

  1. Stateful Each number carries internal state that evolves as digits are requested.
  2. Forkable The state can be cloned exactly at any point, enabling parallel digit computation, comparison, metric evaluation, and speculative arithmetic.
  3. Deterministic & Pure Forking is exact; there is no hidden global state.
  4. Periodic-Friendly The VM supports periodic state cycles, ultimately periodic orbits, and product automata.
  5. Composable Two VMs combine into a new VM via arithmetic and analytic combinators.
  6. Lazy Digits are produced on demand, not precomputed.
The Critical Bifurcation: Two Structurally Distinct Generator Classes
Tier Examples State Fork Cost
Automaton Rationals, algebraic irrationals, periodic p-adics Fixed-size, inline O(1) — true struct copy
Series Classical transcendentals (π, e, log 2, ζ(3)) Grows with computation depth O(log n) — must deep-copy accumulators
Two-Tier ABI — C Struct Definitions
// Automaton tier: pure value semantics, O(1) fork
typedef struct {
    uint32_t base;
    uint32_t phase;
    uint64_t state[4];     // sufficient for degree-4 algebraic
} AutomatonVM;

// Series tier: explicit deep-copy required for fork
typedef struct {
    uint32_t base;
    uint32_t index;
    const SeriesSpec *spec;     // immutable, safe to alias
    ArbitraryInt *accum;        // mutable, deep-copy on fork
    ArbitraryInt *error_bound;  // mutable, deep-copy on fork
} SeriesVM;
08

LLVM as the Execution Substrate

The nano-VM is not a bytecode interpreter. It does not encode or decode its own state. That is precisely why LLVM is the right implementation target.

Canonical ABI — Automaton Tier (C)
struct NumVMStep {
    uint32_t digit;
    AutomatonVM next;
};

typedef struct NumVMStep (*NumVMFn)(AutomatonVM);
LLVM IR Representation
%AutomatonVM = type { i32, i32, [4 x i64] }
%NumVMStep   = type { i32, %AutomatonVM }

define %NumVMStep @num_step(%AutomatonVM %s) { ... }
Forking the Automaton Tier — Pure Value Copy
static inline AutomatonVM num_vm_fork(AutomatonVM s) {
    return s; // pure value copy — no hidden state
}
09

Base as Codec: The Arithmetic Encoder Interface

The base parameter is not a mathematical constant. It is a projection operator — a codec. This is the correct ontological description.

A number = a generator VM
A base = a decoder for that VM
A digit = a symbol in that decoder's alphabet

This is the same relationship as a probability distribution to an arithmetic coder, a message to a bitstream, a model to a codebook. Base is a codec. The number is the generator. The digits are the encoded tape.

In this model, ℝ is not the ontology. ℝ is one possible projection of the generator. The generator is the real object. The real number is the encoding. Changing base is changing coordinate charts on generator space — the underlying object does not change, only the representation does.

10

BBP Formulas Explained: Automaton Periodicity as Fast-Forward

The Bailey-Borwein-Plouffe formula for π in base 16 allows computing the n-th hexadecimal digit of π without computing the preceding digits. The generator-VM model contributes a first-class library primitive that lifts this phenomenon out of one-off optimizations and into the ABI.

Key Insight

A BBP-type formula exists exactly when the generator automaton has a periodic orbit under the chosen codec.

In base 16, the π-generator VM has a periodic sub-automaton. The codec (base 16) aligns with that periodic structure, exposing a fast-forward opcode. In base 10, the same automaton has no such periodic orbit under that codec. No fast-forward is possible.

Unified by the Same Mechanism

  • BBP for π in base 16
  • BBP for log(2) in base 2
  • BBP for polylogarithms in various bases
  • Certain p-adic expansions
  • Certain Mahler-function expansions
The Skip-Ahead Primitive
skip(n, state) → state'

Available only when the automaton has a periodic orbit under the current codec. This is the computational form of a BBP formula. The skip function cannot be synthesized automatically — it requires per-constant manual derivation — but once written, it becomes a first-class operation that LLVM optimizes alongside the rest of the generator graph.

11

Memory as the Right Complexity Metric

Runtime is elastic — you can trade time for algorithms, parallelize, precompute. Memory (state size) is structural. It is tied to the Kolmogorov complexity of the number and to the order and dimension of the recurrence or automaton.

In the nano-VM model, the memory cost of a number is the size of its VM state — specifically, the number of independent fields and the bit-width required for each.

Generator Complexity Hierarchy
Class Minimal VM State Dimension Memory Growth Examples
Rationals 1 constant 1/7, 3/8
Quadratic irrationals 2 O(log n) √2, φ
Classical transcendentals 3–4 O(log n) π, e
Higher transcendentals 4–6+ O(log n) ζ(3), polylogs
Inaccessible generators unbounded O(n) encrypted/random streams

π and e are not just "more complicated" than √2 in some informal sense. They appear to be the first two members of an ascending sequence of irreducible generator-complexity classes, each requiring genuinely more state to maintain correctness over an unbounded digit stream.

12

Randomness as Encrypted Determinism

In this ontology, randomness does not exist as a fundamental phenomenon. Every number is a deterministic generator VM. What people call "random" is a number whose generator is inaccessible — not absent, not large, but hidden.

Accessible

The generator can be derived from the number. Rationals, algebraic irrationals, classical transcendentals with known recurrences.

Cryptographically Inaccessible

The generator exists and is small, but is hidden by a one-way function. AES-CTR, ChaCha20, hash-based PRNGs.

Computably Inaccessible

The generator exists but is provably uncomputable. Chaitin Ω, Martin-Löf random reals.

13

A Random Program Is a Valid Number

In this ontology, a random program is a perfectly valid number — not metaphorically, but literally. Any program that has a finite state representation, implements step(state) → (digit, next_state), and does not halt, is a number.

Numbers are programs (operationally)

Programs are numbers (operationally)

The difference is only in how you interpret the generator

The space of numbers, as the library exposes it, is the space of all possible infinite computations. This is the most general possible operational number ontology — a strict superset of all existing computational frameworks.

14

On Equality, Identity, and the Limits of the Substrate

There is a Futurama line that captures the situation with uncomfortable precision. Professor Farnsworth, having "translated" an alien message, announces: "Of course we can translate it — but only into Betacrypt-3… a language so complex, there is even less chance of understanding it." The joke lands because the translation technically succeeds — it just maps the problem into a representation where the operations people actually cared about have become harder, not easier. This is, structurally, exactly the move Numbers as Machines makes. We "translate" every number into a generator-VM that emits an infinite digit stream, gaining uniformity, composability, and exactness. The bill for that translation comes due precisely here: the very act of representing a number as a non-halting machine is what makes "is this equal?" and "which is bigger?" non-primitive, partial, or outright undecidable. We did not make the substrate too complex out of carelessness — the translation preserves computability by discarding cheap equality. Betacrypt-3, but the trade is honest and the dialect is at least the same one everywhere.

The natural equivalence relation on generator VMs is extensional: two VMs are equal iff they produce identical digit streams on every query. This relation is not computably decidable — it is the canonical example of an undecidable semantic property of programs (Rice's theorem).

What it ships instead is a family of interval-based and bounded-precision predicates:

Interval-Honest Comparison Predicates
bool definitely_less_than(VM x, VM y);          // true / false / pending
bool agrees_with(VM x, VM y, int digits);       // exact for a finite prefix
Trit compare(VM x, VM y, int max_digits);       // Less | Greater | Indistinguishable
15

The User-Facing API

A substrate is not a library. The ABI is the foundation, but users interact with a stratified API that hides the substrate where appropriate and exposes it where necessary.

User-Facing API (Python)
# Precision context — establishes target accuracy for a region of code
with precision_context(digits=50):
    result = sin(pi / 4) + sqrt(2)

# Explicit comparison — interval-honest, never lies about decidability
x.agrees_with(y, digits=20)        # bool
x.definitely_less_than(y)          # True | False | Unknown

# Explicit memoization policy — no hidden caches
x = sqrt(2).cached(max_digits=1000)
x = pi.streaming()                  # O(1) space, sequential only

# Forking with honest cost annotation
x1, x2 = pi.fork()                  # O(log n) for series-tier VMs

# Skip-ahead when available
digit_billion = pi.in_base(16).skip(10**9).next_digit()
Layer 3: User Layer

Operator overloading, precision contexts, ergonomic constructors. Natural to anyone who has used mpmath, MPFR, or Decimal.

Layer 2: Combinator Layer

Arithmetic, comparison, codec, memoization wrappers. The contract.

Layer 1: Primitive Layer

The raw NumVMFn ABI, used by combinator authors.

16

Implementation Roadmap

The architecture admits a staged build, with each phase delivering value and informing the next.

Phase 1 High Confidence

Automaton Tier

Rationals, quadratic irrationals, p-adic numbers, base-conversion codecs, matrix-exponentiation skip-ahead. The struct-copy fork story is exact. LLVM inlines across the whole graph. This phase ships a working library for the entire algebraic and p-adic universe.

  • Rationals & quadratic irrationals
  • p-Adic numbers
  • Base-conversion codecs
  • Matrix-exponentiation skip-ahead
Phase 2 Research-Grade

Series Tier

Classical transcendentals with built-in tail bound oracles. Signed-digit internal arithmetic for carry locality. Bounded LRU memoization. Interval-based comparison. Requires per-constant convergence proofs and careful engineering of the deep-copy fork path.

  • Classical transcendentals (π, e, log, sin, cos)
  • Signed-digit internal arithmetic
  • Bounded LRU memoization
  • Interval-based comparison predicates
Phase 3 Required for Adoption

User-Facing API

Operator overloading, precision contexts, explicit memoization modes, honest comparison predicates. Documentation that makes the tier distinction visible to users who care and invisible to users who don't.

  • Operator overloading
  • Precision contexts
  • Explicit memoization modes
  • Honest comparison predicates
Phase 4 Future

JIT Compilation Path

Expression-tree compilation for runtime-constructed formulas. Struct-of-arrays layout for vector and matrix operations. Batched digit computation API for SIMD.

  • Expression-tree JIT compilation
  • Struct-of-arrays layout
  • Batched digit computation (SIMD)
17

What This Is

This is not a numerics library in the conventional sense. It is a coinductive computational substrate for numbers.

  • Numbers are forkable, deterministic nano-VMs, split honestly into automaton and series tiers
  • Arithmetic is VM composition over a signed-digit internal alphabet
  • Bases are codecs, not mathematical constants
  • Metrics are product VMs
  • p-Adics are periodic automata with local digit commitment
  • BBP formulas are automaton-codec resonances exposed as a first-class skip primitive
  • Randomness is operationally indistinguishable from inaccessible determinism
  • Memory complexity is generator state dimension, a proposed numerical complexity metric
  • Comparison is interval-honest, not falsely exact
  • The real line is the space of digit-generating computations under a chosen codec

Alpha to omega: from mathematical definitions to compiled digit streams, with the approximations made explicit, the costs made honest, and the limits made visible.