The Core Inversion
Classical numerics treats numbers as values — static objects stored in registers, manipulated by arithmetic circuits, approximated by floating-point hardware. This essay proposes a different substrate: every number is a deterministic, forkable virtual machine that emits an infinite digit stream on demand.
This is not a metaphor. It is a literal computational ontology with precise consequences for implementation, memory complexity, and the structure of mathematics itself. The framing has deep roots — Cauchy sequences, computable reals (Turing 1936), coalgebraic stream arithmetic, p-adic expansions, exact real arithmetic systems like iRRAM, and the Type-2 Theory of Effectivity all instantiate parts of this idea. What is new here is not the underlying mathematics but the unified protocol, the codec/base separation, and the explicit ABI that makes all of it composable at native speed.
The primitive is simple. A number is any deterministic program that implements this interface. Everything else — vectors, matrices, tensors, p-adics, transcendentals, metrics, arithmetic — is a combinator over this primitive.
step : State → (digit, State)
The Binary Multiplexer as Foundation
A vector is a nesting of binary multiplexer encoders. A matrix or tensor is a nesting of vector construction operators. This is not an analogy to existing data structures — it is the correct ontological description of what these objects are.
MUX(selector, left, right)
Where selector is itself a generator, and
left and
right
are generators recursively.
Unbounded Precision
Digits are computed on demand; you never run out of bits.
p-Adic Structure
Digits are naturally hierarchical and base-parametrized.
Lazy Evaluation
Only expand the tree as far as the consumer requires.
Structural Sharing
Subtrees are reused, giving canonicalization for free.
Vectors & Tensors
Nested MUX trees over number generators, uniformly.
Arithmetic becomes generator composition. Addition is a carry-propagating combinator over two digit generators. Multiplication is a convolution combinator. Division, negation, and subtraction follow the same pattern — pure generator composition.
The Carry Problem and Signed-Digit Representation
Standard positional arithmetic has a notorious flaw as a stream operation: addition
is
not local. The first digit of
0.4999... + 0.5000... cannot be determined from any
finite prefix — a carry from arbitrarily deep in the expansion could propagate up
and flip every digit above it.
Any naive digit-by-digit addition combinator will fail to terminate on inputs near a digit boundary. This is not an edge case; it is a structural property of the standard representation.
- Addition is digit-local — no unbounded carry propagation
- Each number has multiple valid encodings (the redundancy)
- The MUX/codec layer projects to standard non-redundant form only when emitting an externally visible digit
Unbounded Precision as a Native Property
Because digits are computed on demand and the tree is lazily evaluated, unbounded precision is not a feature to be added — it is the default behavior of the substrate. Structural sharing keeps memory bounded to what is actually needed.
You compute exactly as many digits as the consumer requests, and no more. This is
the same mechanism used by lazy streams in functional languages and exact real
arithmetic libraries (iRRAM, Haskell's
Data.Number.CReal, MPFR with rigorous error
tracking), but grounded in a more explicit state-machine semantics that LLVM can
compile directly to native code, and unified under a single ABI rather than
scattered across domain-specific libraries.
p-Adics Come for Free
The p-adic numbers are particularly natural in this model, and notably easier than the reals in a precise technical sense: digit commitment is local in p-adics, where it is non-local in the reals.
A p-adic number is a digit stream indexed from the least significant digit upward,
with digits in
{0, 1, ..., p-1}. In the generator model, this is
simply a MUX tree with a p-way multiplexer instead of a binary one.
This has an immediate consequence: the p-adic metric is also a periodic state
machine. The p-adic distance between two numbers
x and y is
p^{-v_p(x-y)}, where
v_p(x-y) is the index of the first digit where they
differ. If both are periodic generators, then the metric computation is a product
automaton over their state spaces — itself a finite, periodic machine.
AutomatonOverGenerators
A finite state machine whose transition function reads one or more digit generators and produces either a new generator or a scalar (valuation, distance, boolean). p-adics and their metric are the cleanest, most symmetric instance of this pattern.
The Odd Primitives: Turning Math into Compute
Supporting most of mathematics requires some nonstandard primitives — operators that compile mathematical definitions into digit streams with correctness guarantees.
Series Stepper
series_step(state) → (term, new_state)
A finite state machine that emits the next term of an infinite series. This is the computational form of a mathematical definition — a convergent series becomes an executable state machine.
Tail Bound Oracle
tail_bound(state) → E
Given the current series state at index
n, returns an upper bound
E such that the remaining tail satisfies
|∑_{k>n} a_k| ≤ E. This is the bridge from
convergence proofs to executable digit safety.
Interval Refinement Engine (for reals)
refine_interval(L, U, term, E) → (L', U')
emit_digit_if_safe(L, U, base) → maybe digit
Maintains a shrinking interval
[L, U]
containing the true value. A digit is emitted when all numbers in the current
interval share the same prefix of sufficient length. This is the real-number
version of digit extraction — an online, monotone refinement process.
Valuation Extractor (for p-adics)
valuation(a_n) → v_p(a_n)
For p-adics, locality is algebraic rather than analytic. A term with p-adic
valuation
v can only affect digits at positions
≤ v. No tail bound oracle required, no interval
refinement, no boundary pathologies.
Carry Propagator
carry(k) → {0, 1, ...}
Addition, subtraction, multiplication, and division all reduce to digitwise combinators plus a carry propagation rule. Under signed-digit representation, this combinator is local. Under standard representation, it is not.
Composition Engine
compose(f, g) → h
All mathematical operations reduce to composing digit generators. This is the function algebra of the numerics library — the mechanism by which complex mathematical objects are built from simpler ones.
Memoization and Structural Sharing
memo(generator) → generator'
Repeated digit queries must not recompute. Shared subtrees must remain shared. Memoization is exposed as an explicit wrapper, never an implicit global cache, because implicit caching breaks the value-semantics story for forking.
.streaming()
O(1) space, sequential access only, no cache
.cached(max_digits=N)
Bounded LRU cache, random access within N digits
The Forkable Nano-VM: A Two-Tier ABI
The full picture of a number in this model is a forkable, coinductive, deterministic nano-VM.
- Stateful Each number carries internal state that evolves as digits are requested.
- Forkable The state can be cloned exactly at any point, enabling parallel digit computation, comparison, metric evaluation, and speculative arithmetic.
- Deterministic & Pure Forking is exact; there is no hidden global state.
- Periodic-Friendly The VM supports periodic state cycles, ultimately periodic orbits, and product automata.
- Composable Two VMs combine into a new VM via arithmetic and analytic combinators.
- Lazy Digits are produced on demand, not precomputed.
| Tier | Examples | State | Fork Cost |
|---|---|---|---|
| Automaton | Rationals, algebraic irrationals, periodic p-adics | Fixed-size, inline | O(1) — true struct copy |
| Series | Classical transcendentals (π, e, log 2, ζ(3)) | Grows with computation depth | O(log n) — must deep-copy accumulators |
// Automaton tier: pure value semantics, O(1) fork
typedef struct {
uint32_t base;
uint32_t phase;
uint64_t state[4]; // sufficient for degree-4 algebraic
} AutomatonVM;
// Series tier: explicit deep-copy required for fork
typedef struct {
uint32_t base;
uint32_t index;
const SeriesSpec *spec; // immutable, safe to alias
ArbitraryInt *accum; // mutable, deep-copy on fork
ArbitraryInt *error_bound; // mutable, deep-copy on fork
} SeriesVM;
LLVM as the Execution Substrate
The nano-VM is not a bytecode interpreter. It does not encode or decode its own state. That is precisely why LLVM is the right implementation target.
struct NumVMStep {
uint32_t digit;
AutomatonVM next;
};
typedef struct NumVMStep (*NumVMFn)(AutomatonVM);
%AutomatonVM = type { i32, i32, [4 x i64] }
%NumVMStep = type { i32, %AutomatonVM }
define %NumVMStep @num_step(%AutomatonVM %s) { ... }
static inline AutomatonVM num_vm_fork(AutomatonVM s) {
return s; // pure value copy — no hidden state
}
Base as Codec: The Arithmetic Encoder Interface
The base parameter is not a mathematical constant. It is a projection operator — a codec. This is the correct ontological description.
This is the same relationship as a probability distribution to an arithmetic coder, a message to a bitstream, a model to a codebook. Base is a codec. The number is the generator. The digits are the encoded tape.
In this model, ℝ is not the ontology. ℝ is one possible projection of the generator. The generator is the real object. The real number is the encoding. Changing base is changing coordinate charts on generator space — the underlying object does not change, only the representation does.
BBP Formulas Explained: Automaton Periodicity as Fast-Forward
The Bailey-Borwein-Plouffe formula for π in base 16 allows computing the n-th hexadecimal digit of π without computing the preceding digits. The generator-VM model contributes a first-class library primitive that lifts this phenomenon out of one-off optimizations and into the ABI.
A BBP-type formula exists exactly when the generator automaton has a periodic orbit under the chosen codec.
In base 16, the π-generator VM has a periodic sub-automaton. The codec (base 16) aligns with that periodic structure, exposing a fast-forward opcode. In base 10, the same automaton has no such periodic orbit under that codec. No fast-forward is possible.
Unified by the Same Mechanism
- BBP for π in base 16
- BBP for log(2) in base 2
- BBP for polylogarithms in various bases
- Certain p-adic expansions
- Certain Mahler-function expansions
skip(n, state) → state'
Available only when the automaton has a periodic orbit under the current codec. This is the computational form of a BBP formula. The skip function cannot be synthesized automatically — it requires per-constant manual derivation — but once written, it becomes a first-class operation that LLVM optimizes alongside the rest of the generator graph.
Memory as the Right Complexity Metric
Runtime is elastic — you can trade time for algorithms, parallelize, precompute. Memory (state size) is structural. It is tied to the Kolmogorov complexity of the number and to the order and dimension of the recurrence or automaton.
In the nano-VM model, the memory cost of a number is the size of its VM state — specifically, the number of independent fields and the bit-width required for each.
| Class | Minimal VM State Dimension | Memory Growth | Examples |
|---|---|---|---|
| Rationals | 1 | constant | 1/7, 3/8 |
| Quadratic irrationals | 2 | O(log n) | √2, φ |
| Classical transcendentals | 3–4 | O(log n) | π, e |
| Higher transcendentals | 4–6+ | O(log n) | ζ(3), polylogs |
| Inaccessible generators | unbounded | O(n) | encrypted/random streams |
π and e are not just "more complicated" than √2 in some informal sense. They appear to be the first two members of an ascending sequence of irreducible generator-complexity classes, each requiring genuinely more state to maintain correctness over an unbounded digit stream.
Randomness as Encrypted Determinism
In this ontology, randomness does not exist as a fundamental phenomenon. Every number is a deterministic generator VM. What people call "random" is a number whose generator is inaccessible — not absent, not large, but hidden.
Accessible
The generator can be derived from the number. Rationals, algebraic irrationals, classical transcendentals with known recurrences.
Cryptographically Inaccessible
The generator exists and is small, but is hidden by a one-way function. AES-CTR, ChaCha20, hash-based PRNGs.
Computably Inaccessible
The generator exists but is provably uncomputable. Chaitin Ω, Martin-Löf random reals.
A Random Program Is a Valid Number
In this ontology, a random program is a perfectly valid number — not metaphorically,
but literally. Any program that has a finite state representation, implements
step(state) → (digit, next_state), and does not
halt, is a number.
Numbers are programs (operationally)
Programs are numbers (operationally)
The difference is only in how you interpret the generator
The space of numbers, as the library exposes it, is the space of all possible infinite computations. This is the most general possible operational number ontology — a strict superset of all existing computational frameworks.
On Equality, Identity, and the Limits of the Substrate
There is a Futurama line that captures the situation with uncomfortable precision. Professor Farnsworth, having "translated" an alien message, announces: "Of course we can translate it — but only into Betacrypt-3… a language so complex, there is even less chance of understanding it." The joke lands because the translation technically succeeds — it just maps the problem into a representation where the operations people actually cared about have become harder, not easier. This is, structurally, exactly the move Numbers as Machines makes. We "translate" every number into a generator-VM that emits an infinite digit stream, gaining uniformity, composability, and exactness. The bill for that translation comes due precisely here: the very act of representing a number as a non-halting machine is what makes "is this equal?" and "which is bigger?" non-primitive, partial, or outright undecidable. We did not make the substrate too complex out of carelessness — the translation preserves computability by discarding cheap equality. Betacrypt-3, but the trade is honest and the dialect is at least the same one everywhere.
The natural equivalence relation on generator VMs is extensional: two VMs are equal iff they produce identical digit streams on every query. This relation is not computably decidable — it is the canonical example of an undecidable semantic property of programs (Rice's theorem).
What it ships instead is a family of interval-based and bounded-precision predicates:
bool definitely_less_than(VM x, VM y); // true / false / pending
bool agrees_with(VM x, VM y, int digits); // exact for a finite prefix
Trit compare(VM x, VM y, int max_digits); // Less | Greater | Indistinguishable
The User-Facing API
A substrate is not a library. The ABI is the foundation, but users interact with a stratified API that hides the substrate where appropriate and exposes it where necessary.
# Precision context — establishes target accuracy for a region of code
with precision_context(digits=50):
result = sin(pi / 4) + sqrt(2)
# Explicit comparison — interval-honest, never lies about decidability
x.agrees_with(y, digits=20) # bool
x.definitely_less_than(y) # True | False | Unknown
# Explicit memoization policy — no hidden caches
x = sqrt(2).cached(max_digits=1000)
x = pi.streaming() # O(1) space, sequential only
# Forking with honest cost annotation
x1, x2 = pi.fork() # O(log n) for series-tier VMs
# Skip-ahead when available
digit_billion = pi.in_base(16).skip(10**9).next_digit()
Operator overloading, precision contexts, ergonomic constructors. Natural to anyone who has used mpmath, MPFR, or Decimal.
Arithmetic, comparison, codec, memoization wrappers. The contract.
The raw NumVMFn ABI, used by combinator
authors.
Implementation Roadmap
The architecture admits a staged build, with each phase delivering value and informing the next.
Automaton Tier
Rationals, quadratic irrationals, p-adic numbers, base-conversion codecs, matrix-exponentiation skip-ahead. The struct-copy fork story is exact. LLVM inlines across the whole graph. This phase ships a working library for the entire algebraic and p-adic universe.
- Rationals & quadratic irrationals
- p-Adic numbers
- Base-conversion codecs
- Matrix-exponentiation skip-ahead
Series Tier
Classical transcendentals with built-in tail bound oracles. Signed-digit internal arithmetic for carry locality. Bounded LRU memoization. Interval-based comparison. Requires per-constant convergence proofs and careful engineering of the deep-copy fork path.
- Classical transcendentals (π, e, log, sin, cos)
- Signed-digit internal arithmetic
- Bounded LRU memoization
- Interval-based comparison predicates
User-Facing API
Operator overloading, precision contexts, explicit memoization modes, honest comparison predicates. Documentation that makes the tier distinction visible to users who care and invisible to users who don't.
- Operator overloading
- Precision contexts
- Explicit memoization modes
- Honest comparison predicates
JIT Compilation Path
Expression-tree compilation for runtime-constructed formulas. Struct-of-arrays layout for vector and matrix operations. Batched digit computation API for SIMD.
- Expression-tree JIT compilation
- Struct-of-arrays layout
- Batched digit computation (SIMD)
What This Is
This is not a numerics library in the conventional sense. It is a coinductive computational substrate for numbers.
- Numbers are forkable, deterministic nano-VMs, split honestly into automaton and series tiers
- Arithmetic is VM composition over a signed-digit internal alphabet
- Bases are codecs, not mathematical constants
- Metrics are product VMs
- p-Adics are periodic automata with local digit commitment
-
BBP formulas are automaton-codec resonances exposed as a first-class
skipprimitive - Randomness is operationally indistinguishable from inaccessible determinism
- Memory complexity is generator state dimension, a proposed numerical complexity metric
- Comparison is interval-honest, not falsely exact
- The real line is the space of digit-generating computations under a chosen codec
Alpha to omega: from mathematical definitions to compiled digit streams, with the approximations made explicit, the costs made honest, and the limits made visible.