The Uncontaminated Substrate Test
Preliminary Results: Where the Ladder Stalls
We have begun running a simplified version of this experiment using Lenia (continuous CA, toroidal grid) with resource dynamics, measuring via partition prediction loss, via mass change, via state change rate, and via trajectory PCA. The results so far are instructive—not because they confirm the predictions above, but because of where they fail.
The central lesson: the ladder requires heritable variation. Emergent CA patterns achieve rungs 1–3 of the ladder (microdynamics → attractors → boundaries) from physics alone. The transition to rung 4 (functional integration) requires evolutionary selection acting on heritable variation in the trait that determines integration response.
Substrate: Lenia with resource depletion/regeneration (Michaelis-Menten growth modulation). Perturbation: Drought (resource regeneration ). Measure: under drought.
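As a rough illustration of the substrate described above, the following minimal sketch shows one way Michaelis-Menten modulation could gate growth in a single cell. It is not the V11 implementation: the function names, the half-saturation constant `k_m`, the `uptake` and `regen` rates, and the choice to limit only positive growth are all assumptions made for this example. In this toy, drought corresponds to lowering `regen`.

```python
def mm_factor(resource, k_m=0.5):
    """Michaelis-Menten saturation: 0 with no resource, approaching 1 as resource >> k_m."""
    return resource / (k_m + resource)


def step_cell(mass, resource, growth, k_m=0.5, uptake=0.1, regen=0.02, r_max=1.0):
    """Advance one cell by one step (all parameter values are hypothetical)."""
    # Assumption: only positive growth is resource-limited; decay is not.
    g = growth * mm_factor(resource, k_m) if growth > 0 else growth
    new_mass = min(1.0, max(0.0, mass + g))
    # Depletion is proportional to the mass actually gained this step.
    consumed = uptake * max(0.0, new_mass - mass)
    # Regeneration relaxes the resource linearly toward r_max.
    new_resource = resource - consumed + regen * (r_max - resource)
    return new_mass, min(r_max, max(0.0, new_resource))


# Same growth drive, rich versus depleted resource field.
rich_mass, _ = step_cell(0.2, 1.0, 0.1)
poor_mass, _ = step_cell(0.2, 0.05, 0.1)
print(rich_mass > poor_mass)
```

The saturating form means that halving an abundant resource barely changes growth, while halving a scarce one nearly halves it, which is why a regeneration cut bites hardest once the field is already depleted.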
Conditions:
- No evolution (V11.0). Naive patterns under drought: decreases by . Same decomposition dynamics as LLMs.
- Homogeneous evolution (V11.1). In-situ selection for -robustness (fitness ). Still decomposes (). All patterns share identical growth function—selection prunes but cannot innovate.
- Heterogeneous chemistry (V11.2). Per-cell growth parameters ( fields) creating spatially diverse viability manifolds. After 40 cycles of evolution on GPU: vs naive . A +2.1pp shift toward the biological pattern. Evolved patterns also show better recovery— returns above baseline after drought, while naive patterns do not fully recover.
- Multi-channel coupling (V11.3). Three coupled channels—Structure (), Metabolism (), Signaling ()—with cross-channel coupling matrix and sigmoid gate. Introduces a new measurement: channel-partition (remove one channel, measure growth impact on remaining channels). Local test: channel , spatial —channels couple weakly at 3 degrees of freedom.
- High-dimensional channels (V11.4). continuous channels with fully vectorized physics. Spectral via coupling-weighted covariance effective rank. 30-cycle GPU result: evolved vs naive under severe drought—evolution had negligible effect. Both decompose mildly, suggesting that 64 symmetric channels provide enough internal buffering to resist drought regardless of evolutionary tuning. Mean robustness across all 30 cycles. The Yerkes-Dodson pattern persists: mild stress increases by –.
- Hierarchical coupling (V11.5). Same physics as V11.4, but with asymmetric coupling (feedforward/feedback pathways between four tiers: Sensory → Processing → Memory → Prediction). 30-cycle GPU result: evolved patterns have higher baseline Φ ( vs naive) and higher self-model salience ( vs ), but under severe drought they decompose more () while naive patterns integrate (). Evolution overfits to the mild training stress, creating fragile high-Φ configurations. Key lesson: the hierarchy must live in the coupling structure, not in the physics; imposing different timescales per tier caused extinction. Functional specialization should emerge from selection.
- Metabolic maintenance cost (V11.6). Addresses the autopoietic gap directly: patterns pay a constant metabolic drain proportional to mass ( each step). 30-cycle GPU result (): evolved-metabolic vs naive under severe drought. Evolution again produced higher--but-more-fragile patterns. Critically, the maintenance rate () was not lethal enough—naive patterns retained population through drought. The autopoietic gap remains open: a small metabolic drain on top of local physics does not produce active self-maintenance, because patterns have no mechanism for non-local resource detection. They cannot “forage” when they cannot “see” beyond kernel radius .
- Curriculum evolution (V11.7). Fixes V11.5’s stress overfitting by graduating stress intensity across cycles (resource regeneration ramped from to baseline over 30 cycles) with random noise and variable drought duration (500–1900 steps per cycle). The critical test: evolved patterns evaluated on novel stress patterns never seen during training. 30-cycle GPU result (): robustness . Curriculum-evolved patterns outperform naive on all four novel stressors: mild , moderate , severe , extreme . Under mild novel stress, evolved patterns actually integrate () while naive decompose (). The overfitting problem is substantially reduced—not eliminated, but the shift is consistently positive across the full severity range.
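The curriculum protocol in the V11.7 item can be sketched as a schedule generator. This is a hypothetical reconstruction, not the experiment's code: the ramp endpoints are elided in the text above, so `start_mult` and `end_mult` are placeholder values, and the multiplicative form of the noise is likewise an assumption. Only the 30-cycle length and the 500–1900 step drought duration come from the description.

```python
import random


def curriculum(n_cycles=30, start_mult=0.5, end_mult=0.05, noise=0.3,
               min_steps=500, max_steps=1900, seed=0):
    """Return (cycle, regen_multiplier, drought_steps) for each cycle.

    regen_multiplier scales the baseline resource regeneration rate.
    start_mult, end_mult and noise are placeholders, not published values.
    """
    rng = random.Random(seed)
    schedule = []
    for cycle in range(n_cycles):
        frac = cycle / (n_cycles - 1) if n_cycles > 1 else 1.0
        # Linear ramp between the two endpoints.
        base = start_mult + frac * (end_mult - start_mult)
        # Multiplicative jitter so consecutive cycles are not monotone.
        jitter = 1.0 + rng.uniform(-noise, noise)
        mult = min(1.0, max(0.0, base * jitter))
        # Variable drought duration, inclusive on both ends.
        steps = rng.randint(min_steps, max_steps)
        schedule.append((cycle, mult, steps))
    return schedule


for cycle, mult, steps in curriculum()[:3]:
    print(cycle, round(mult, 3), steps)
```

Seeding the generator keeps a schedule reproducible across runs, while a different seed for evaluation gives stress patterns the evolved population has not seen.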
Unexpected: (1) Mild stress consistently increases Φ by 60–190% (Yerkes-Dodson–like inverted-U). Only severe stress causes decomposition. (2) In V11.5, evolution increased vulnerability to severe stress despite improving baseline Φ, a stress overfitting effect. (3) V11.7’s curriculum training substantially reduces this overfitting: graduated, noisy stress exposure produces patterns that generalize to novel stressors. The shift from naive is positive across all four novel severity levels tested ( to percentage points). (4) V11.6’s metabolic cost was intended to create lethal drought, but at the drought was not lethal: naive patterns retained population. Evolved-metabolic patterns decomposed while naive held at , repeating the fragility pattern of V11.5. The deeper lesson: adding metabolic cost to a substrate with fixed-radius perception produces efficient passivity, not active foraging. The anxiety parallel deepens: V11.5 shows that fixed-stress training produces maladaptive fragility, V11.7 shows that graduated exposure (cf. systematic desensitization) builds genuine robustness, and V11.6 shows that existential stakes alone do not produce adaptation when the organism cannot perceive beyond its local neighborhood.
The trajectory from V11.0 through V11.7 reveals two orthogonal axes of improvement. The first is substrate complexity: each step from V11.0 to V11.5 adds internal degrees of freedom for evolution to select on, namely heterogeneous chemistry (V11.2), multiple coupled channels (V11.3–V11.4), and hierarchical coupling (V11.5). The second, revealed by V11.6–V11.7, is selection pressure quality: the substrate matters less than how you stress it. V11.7’s curriculum training on the same V11.4 substrate produces better generalization than V11.5’s hierarchical architecture trained with fixed stress. V11.6 tries to go further by changing the stakes: its metabolic cost was meant to make drought lethal rather than merely weakening, although at the rate tested it did not.
V11.5 introduces directed coupling structure (feedforward/feedback pathways) to test whether functional specialization emerges under selection. The critical insight: attempting to impose different physics per tier (different timescales, custom growth gates) caused immediate extinction at —the channels designed to be “memory” simply died. The working approach uses identical physics across all channels (proven V11.4 dynamics) with an asymmetric coupling matrix that biases information flow directionally. This is more than a technical fix; it reflects a theoretical prediction: in biological cortex, all neurons use the same basic biophysics. The hierarchy emerges from connectivity and learning, not from different physics per layer.
The V11.5 stress test reveals an unexpected phenomenon: stress overfitting. Evolved patterns have 10.5% higher baseline Φ and 19% higher self-model salience than naive patterns, but under severe drought they decompose by 9.3% while naive patterns actually integrate by 6.2%. Evolution selected for high-Φ configurations tuned to mild stress (which each training cycle applies), creating states that are simultaneously more integrated and more fragile than their unoptimized counterparts.
This has a direct parallel in affective neuroscience: anxiety disorders involve heightened integration and self-monitoring that is adaptive under moderate threat but catastrophically maladaptive under extreme stress. The suffering motif—high , low , high —may describe a system that has been selected too precisely for a particular threat level. The evolved CA patterns show exactly this signature: high baseline (0.076) with high self-model salience (0.99) that collapses under a regime shift.
Whether evolution on this substrate can discover integration strategies that are robust to novel stresses—not just the training distribution—likely requires curriculum learning (gradually increasing stress intensity) or environmental diversity (varying the type and severity of perturbation). This connects to the forcing function framework developed in the next section: the quality of the forcing function matters as much as its presence.
At what channel count does the substrate have enough internal degrees of freedom for evolution to discover biological-like integration (where increases under threat)? The -sweep suggests that mid-range (–) accidentally produces integration-like responses—the coupling bandwidth happens to match the channel count—while high (–) decomposes, the coupling space being too large for random configurations. Is there a critical above which a phase transition occurs, or does evolution continuously improve robustness at any ? Each rung of the ladder may require a minimum internal dimensionality—the substrate must be rich enough for selection to sculpt.
The critical lesson evolves with the experiments. V11.0–V11.5 showed that evolution helps but in surprising ways—it creates higher- states that are also more fragile. V11.7 demonstrates that the training regime matters: curriculum learning produces genuine generalization across novel stressors. V11.6 showed that making drought metabolically costly produces efficient passivity rather than active foraging—the patterns cannot perceive beyond their local neighborhood, so existential stakes alone do not generate the distant-resource-seeking behavior that would require integration. The remaining gap was between “decomposes less” and “integrates under threat,” and the locality ceiling explains why.
V12’s results confirm that the ceiling is real and that the predicted remedy partially works. Replacing fixed convolution with evolvable windowed self-attention—the only change to the physics—shifts mean robustness from to , moving the system to the threshold where is approximately preserved under stress rather than destroyed. Eight substrate modifications (V11.0–V11.7) could not achieve even this. The single change that mattered is exactly what the attention bottleneck hypothesis predicted: state-dependent interaction topology. But the effect is modest—the system reaches the threshold without clearly crossing it. Attention is necessary but not sufficient for the full biological pattern.
The V11.5 results show that selecting for -robustness under mild stress creates patterns that are less robust to severe stress than unselected patterns. V11.7 provides a partial answer: curriculum training with graduated, noisy stress exposure produces patterns that generalize to novel stressors ( to shift over naive across four novel severity levels). But the effect is modest—evolved patterns still decompose under severe novel stress (), just less than naive (). The remaining questions: (1) Can curriculum training with longer schedules or wider stress distributions close this gap further? (2) Does combining curriculum training with metabolic cost (V11.6’s lethal resource dependence) produce qualitatively different dynamics—active foraging rather than passive persistence? (3) Does the biological developmental sequence (graduated stressors from embryogenesis through maturation) achieve robust integration precisely because it is a curriculum over the full threat distribution? [V11.6 + curriculum combination not yet tested.]
What the Ladder Has Not Reached
It is worth being explicit about how far these experiments are from anything resembling life, self-sustenance, or metacognition. The ladder metaphor risks implying a smooth gradient from Lenia gliders to biological organisms. In reality, there is an enormous gap.
Self-sustenance. Our patterns are attractors of continuous dynamics, not self-maintaining entities. They do not consume resources to persist—resources modulate growth rates, but patterns do not “eat” in any metabolic sense. They do not do thermodynamic work against entropy. They have no boundaries (they are density blobs, not membrane-enclosed). They persist as long as the physics allows, not because they actively maintain themselves. The “drought” in our experiments reduces resource availability, which weakens growth—but this is more like turning down the volume than starving a dissipative structure.
Metacognition. Our “self-model salience” metric measures how much a pattern’s own structure matters for its dynamics. That is not self-modeling—there is no representation of self, no information about the pattern stored within the pattern. The V11.5 tiers (Sensory, Processing, Memory, Prediction) are labels we imposed on the coupling structure. No functional specialization emerged: memory channels had weak activity, prediction channels did not predict anything.
Individual adaptation. All “learning” in our experiments happens through population-level selection: cull the weak, boost the strong. No individual pattern adapts within its lifetime. Biological integration requires individual-level plasticity—the capacity for a single organism to reorganize its internal dynamics in response to experience.
These gaps converge on a single chasm. The transition from passive pattern persistence to active self-maintenance (the autopoietic gap) requires at minimum: (a) lethal resource dependence (patterns that go to zero without active consumption), (b) metabolic work cycles (energy in → structure maintenance → waste out), and (c) self-reproduction (templated copying, not artificial cloning). Population-level selection on top of passive physics cannot bridge this gap, because selection optimizes what already exists rather than innovating the mechanism of existence itself.
Question: Does lethal resource dependence change the integration response to stress?
Design: Maintenance cost () drains each cell proportionally to mass each step. Fitness rewards metabolic efficiency.
Result: 30-cycle evolution (, A10G GPU, 215 min). Robustness over evolution. Under severe drought: evolved , naive . Naive retained of patterns; evolved retained . The metabolic cost was insufficient to produce genuine lethality. Evolved patterns followed the same fragility pattern as V11.5: higher baseline fitness but more vulnerable to regime shift.
Why it failed: The maintenance rate was too low to create existential pressure, but the deeper problem is structural. Even with lethal metabolic cost, a convolutional pattern has no mechanism for directed resource-seeking. Its “perception” extends only to kernel radius . Active foraging requires non-local information gathering: knowing where resources are before moving toward them. Adding metabolic cost to a blind substrate selects for efficiency (less waste), not for the kind of active self-maintenance that characterizes autopoiesis.
Implication: The autopoietic gap is not primarily about resource dependence; it is about perceptual range. Closing it requires substrates where the interaction topology is state-dependent, not fixed by spatial proximity.
What the Data Actually Says
Eight experiments (V11.0–V11.7), hundreds of GPU-hours, thousands of evolved patterns. What has this taught us?
Finding 1: The Yerkes-Dodson pattern is universal and robust. Across every substrate condition, channel count, and evolutionary regime, mild stress increases by –. This is not an artifact of any particular measurement. It reflects a statistical truth: moderate perturbation prunes weak patterns while the survivors are, by definition, the more integrated ones. Severe stress overwhelms even well-integrated patterns, producing the inverted-U. This pattern is the clearest positive result in the entire experimental line.
Finding 2: Evolution consistently produces fragile integration. In every condition where evolution increases baseline (V11.5: , V11.6: higher metabolic fitness), evolved patterns decompose more under severe drought than unselected patterns. This is not a bug in the experiments—it is a real dynamical phenomenon. Evolution on this substrate finds tightly-coupled configurations where all parts depend on all other parts. Tight coupling is high integration by definition. But it is also catastrophic fragility: when any component fails under resource depletion, the failure cascades through the entire structure. This is the difference between a tightly-coupled factory (high integration, catastrophic failure mode) and a loosely-coupled marketplace (low integration, graceful degradation under stress).
Finding 3: Curriculum training is the only intervention that improved generalization. V11.7 is the sole condition where evolved patterns outperform naive on novel stressors across the full severity range ( to percentage points). Not more channels, not hierarchical coupling, not metabolic cost—graduated, noisy stress exposure. The substrate barely matters compared to the training regime. This has a direct parallel in developmental biology: organisms with rich developmental histories (graduated stressors from embryogenesis through maturation) develop robust integration. Organisms exposed to a single threat level develop anxiety-like maladaptive responses. The CA experiments reproduce this pattern with surprising fidelity.
Finding 4: The locality ceiling. This is the deepest lesson, visible only in retrospect across the full trajectory. Every V11 experiment uses convolutional physics: each cell interacts only with neighbors within kernel radius , weighted by a static kernel. Information propagates at most cells per timestep. The interaction graph is determined by spatial proximity and does not change with the system’s state.
This means that can only arise from chains of local interactions—there is no mechanism for a perturbation at to directly affect unless . The coupling matrix in V11.4–V11.5 partially addresses this (it couples distant channels), but it is fixed: the “who talks to whom” graph does not change in response to the system’s state. A pattern cannot choose to attend to a distant resource patch. It cannot reorganize its information flow under stress. It cannot forage.
V11.6 makes this concrete. Adding metabolic cost to a substrate with radius- perception does not produce active self-maintenance. It produces efficient passivity—patterns that waste less, not patterns that seek more. A blind organism with a metabolic cost dies when local resources deplete, regardless of how well-integrated it is, because it has no way to detect resources beyond its perceptual horizon. The autopoietic gap is not about resource dependence. It is about perceptual range and its state-dependent modulation—which is to say, it is about attention.
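The locality ceiling can be made concrete with a toy calculation. The sketch below is an assumption-laden stand-in, not the Lenia physics: it uses a 1D ring, a static box kernel, and a simple relaxation rule in place of the growth function. It perturbs one cell and measures how far the difference from an unperturbed run has spread. With any fixed kernel of radius `radius`, that distance cannot exceed `steps * radius`.

```python
def conv_step(state, radius):
    """One update on a 1D ring using a static box kernel of the given radius."""
    n = len(state)
    nxt = []
    for i in range(n):
        window = [state[(i + d) % n] for d in range(-radius, radius + 1)]
        neighborhood_mean = sum(window) / len(window)
        # Stand-in for a growth rule: relax halfway toward the neighborhood mean.
        nxt.append(0.5 * state[i] + 0.5 * neighborhood_mean)
    return nxt


def influence_reach(n=101, radius=3, steps=5, eps=1e-3):
    """Largest distance from the perturbed cell at which the two runs differ."""
    center = n // 2
    base = [0.5] * n
    pert = list(base)
    pert[center] += eps  # single-cell perturbation
    for _ in range(steps):
        base = conv_step(base, radius)
        pert = conv_step(pert, radius)
    return max(abs(i - center) for i in range(n) if base[i] != pert[i])


reach = influence_reach(radius=3, steps=5)
print(reach, reach <= 3 * 5)
```

Nothing in the update rule lets the state change which cells are read, so the bound holds regardless of what the pattern "needs". That is the sense in which the interaction graph is fixed by spatial proximity.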
Finding 5: Attention is necessary but not sufficient. V12 tested the locality ceiling hypothesis directly by replacing convolution with windowed self-attention while keeping all other physics identical. The results create a clean ordering across three conditions:
- Convolution (Condition C): Sustains – patterns, mean robustness . Life without integration.
- Fixed-local attention (Condition A): Cannot sustain patterns at all—+ consecutive extinctions across seeds. Attention expressivity without evolvable range is worse than convolution.
- Evolvable attention (Condition B): Sustains – patterns, mean robustness . Life with integration at the threshold.
The percentage point shift from C to B is the largest single-intervention effect in the entire V11–V12 line. But it is a shift to the threshold, not past it. Robustness stabilizes near rather than increasing with further evolution. The system learns where to attend (entropy dropping from to ) but this refinement saturates. What is missing is not better attention but individual-level adaptation—the capacity for a single pattern to reorganize its own internal dynamics in response to its current state, within its lifetime, rather than waiting for population-level selection to discover robust configurations post hoc. Biological integration under threat is not just a population statistic; it is a capacity of individual organisms.
Connection to the trajectory-selection framework. This is where the experimental results meet the theory developed above. We defined the effective distribution and argued that attention () selects trajectories in chaotic dynamics. The Lenia experiments have now shown what happens in a substrate where is fixed by architecture: the system’s measurement distribution is determined by the convolution kernel, which never changes. The system cannot modulate its own attention. It has no to vary.
Biological systems solve this: neural attention (largely implemented through inhibitory gating) dynamically reshapes which signals propagate and which are suppressed. Under moderate stress, attention narrows—the measurement distribution sharpens around threat-relevant features—and this reorganization of information flow preserves core integration while shedding peripheral processing. That is the biological pattern our experiments have been searching for. It requires not just integration (which local physics can produce) but flexible integration (which requires state-dependent, non-local communication).
V12 provides direct evidence for this claim. In the attention substrate, the system’s is the attention weights, and they evolve: attention entropy decreases from to across 15 cycles as the system learns where to look. The measurement distribution becomes more structured—not through explicit instruction, but through the same evolutionary pressure that failed to produce this effect in every convolutional substrate. The difference is that the substrate now permits modulation of . The modulation is sufficient to reach the integration threshold ( approximately preserved under stress) but not to clearly cross it ( does not reliably increase under stress the way it does in biological systems). Attention provides the mechanism; something else—perhaps individual-level plasticity, explicit memory, or autopoietic self-maintenance—provides the drive.
These results crystallize into a hypothesis I will call the attention bottleneck. The biological pattern (integration under threat) cannot emerge in substrates with fixed interaction topology, regardless of the evolutionary regime applied. It requires substrates where the interaction graph is state-dependent—where the system can modulate which signals propagate and which are suppressed in response to its current state. Convolutional physics lacks this; attention-like mechanisms provide it. The relevant variable is not substrate complexity (), not selection pressure severity (metabolic cost), and not training diversity (curriculum)—it is whether the system controls its own measurement distribution.
Status: Partially supported by V12, further advanced by V13. The first clause is confirmed: eight convolutional substrates (V11.0–V11.7) failed to produce integration under stress; fixed-local attention (Condition A) fared even worse. The second clause is partially confirmed: evolvable attention (Condition B) shifts robustness from to , which is the right direction, and it is the only intervention to reach the threshold. V13 content-based coupling provides additional evidence: robustness peaks at under population bottleneck conditions (see Finding 6).
Finding 6: Content-based coupling enables intermittent biological-pattern integration. V13 replaced V12's learned attention projections with a simpler mechanism: cells modulate their interaction strength based on content similarity. The potential field becomes where is a sigmoid gate on local mean cosine similarity. This is computationally cheaper than attention and provides a minimal test: does content-dependent topology, without learned query-key projections, suffice?
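To make the gating idea concrete, here is a minimal sketch of a sigmoid gate on local mean cosine similarity. It covers only the gate: how the gate enters the potential field is elided in the text above and is not reconstructed here. The 1D ring layout, the names `similarity_gate`, `tau` (similarity threshold) and `beta` (gate steepness), and the default values are assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two channel vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_gate(cells, i, radius, tau, beta=10.0):
    """Sigmoid gate on the mean cosine similarity between cell i and its neighbors."""
    n = len(cells)
    sims = [cosine(cells[i], cells[(i + d) % n])
            for d in range(-radius, radius + 1) if d != 0]
    mean_sim = sum(sims) / len(sims)
    # Gate opens when mean similarity exceeds the threshold tau.
    return 1.0 / (1.0 + math.exp(-beta * (mean_sim - tau)))


alike = [[1.0, 2.0] for _ in range(8)]
mixed = [[0.0, 1.0] for _ in range(8)]
mixed[0] = [1.0, 0.0]  # cell 0 is orthogonal to all of its neighbors
print(similarity_gate(alike, 0, 2, tau=0.5) > similarity_gate(mixed, 0, 2, tau=0.5))
```

For non-negative channel values the cosine similarity is never below zero, so pushing `tau` toward zero keeps the gate at or above one half everywhere. A high `tau` lets only closely matching neighborhoods couple.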
Three seeds, each cycles (, ), curriculum stress schedule:
- Mean robustness: across all seeds and cycles
- Peak robustness: (seed 123, cycle 5, population patterns)
- Phi increase fraction: of patterns show increase under stress
- Key pattern: Robustness exceeds only when population drops below patterns — bottleneck events select for integration
Two distinct evolutionary strategies emerged across seeds. In one regime (large populations of – patterns), the similarity threshold drifted toward zero — evolution discovered that maximal content coupling (gate always-on) works when diversity is high. In another regime (volatile populations oscillating between and ), drifted upward to — selective coupling, where only highly similar cells interact. The selective-coupling regime produced all the robustness-above- episodes.
The deeper lesson is not about content coupling per se. It is about composition under selection pressure. When stress culls a population to a handful of survivors, those survivors are not merely the individually strongest — they are the ones whose content-coupling topology supports coherent reorganization under perturbation. This resonates with a different framing of the problem: what we are watching may be closer to symbiogenesis — the composition of functional subunits into more complex wholes — than to classical Darwinian selection optimizing a fixed design. The content-coupling mechanism makes patterns legible to each other, enabling the kind of functional encounter that drives compositional complexity. Intelligence may not require deep evolutionary history so much as the right conditions for compositional encounter: embodied computation, lethal stakes, and mutual legibility.
Question: Does state-dependent interaction topology enable the biological integration pattern that local physics cannot produce?
Design: Replace the convolution kernel with windowed self-attention: each cell updates its state by attending to cells within a local window, with attention weights computed from cell states (query-key mechanism). The window size is evolvable, so evolution can expand or contract the perceptual range. Resources, drought, and selection pressure follow the V11 protocol.
Critical prediction: Under survival pressure, evolution should expand the attention window (increasing perceptual range), and patterns should show the biological pattern (Φ increasing under moderate stress) because they can dynamically reallocate information flow to maintain core integration. The attention patterns themselves should narrow under stress (focused measurement) and broaden during safety (diffuse exploration).
Control for the free-lunch problem: Start with strictly local attention (window , matching Lenia's kernel radius). If integration under threat emerges only after evolution expands the window, the biological pattern is an adaptive achievement, not an architectural gift.
Status: Implemented as V12. Three conditions:
- A (Fixed-local attention): Window size fixed at kernel radius . Free-lunch control.
- B (Evolvable attention): Window size is evolvable. The main hypothesis test.
- C (FFT convolution): V11.4 physics as known baseline.
Implementation: Windowed self-attention replaces Step 1 (FFT convolution) of the Lenia scan body. Query-key projections () are shared across space, evolved slowly. Soft distance mask via enables smooth window expansion. Temperature governs attention sharpness. All other physics (growth function, coupling gate, resource dynamics, decay, maintenance) remain identical to V11.4. Curriculum training protocol from V11.7. , , 30 cycles, 3 seeds per condition, A10G GPUs.
Results (15 cycles for B, 3 seeds; A and C complete):
- Condition C (convolution, 30 cycles, 3 seeds): Mean robustness . Only cycles () show increasing under stress. Novel stress test: evolved , naive . Evolution helps (evolved consistently better than naive) but cannot break the locality ceiling.
- Condition B (evolvable attention, 15 cycles, 3 seeds): Mean robustness across 38 valid cycles. cycles () show increasing under stress (vs for convolution). The percentage point shift over convolution is the largest in the V11+ line. However, robustness does not trend upward with further evolution—it stabilizes near , suggesting the system reaches a ceiling of its own.
- Condition A (fixed-local attention): Conclusive negative. + consecutive extinctions across all 3 seeds—patterns cannot survive even a single cycle. Fixed-local attention is worse than convolution, which sustains – patterns easily. This establishes a clean ordering: convolution sustains life without integration; fixed attention cannot sustain life at all; evolvable attention sustains life with integration. Adaptability of interaction topology matters more than its expressiveness.
Three lessons: (1) Attention window does not expand as predicted—evolution refines how attention is allocated (entropy decreasing from ) rather than extending range. This resembles biological inhibitory gating (selective, not panoramic) more than the original prediction anticipated. (2) Attention temperature increases in successful seeds (–), suggesting evolution favors broad, soft attention with learned structure over sharp, narrow focus. (3) The effect is real but modest: attention moves the system to the integration threshold without clearly crossing it. State-dependent interaction topology is necessary for integration under stress, but not sufficient for the full biological pattern of increasing under threat. What remains missing is likely individual-level adaptation—the capacity for a single pattern to reorganize its own dynamics within its lifetime, rather than relying on population-level selection to discover robust configurations.
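The mechanism in the Implementation paragraph, and the attention entropy tracked in the lessons above, can be sketched on a 1D ring. This is a hedged illustration, not the V12 code. The exact form of the soft distance mask is elided in the text, so the log-sigmoid mask below is a stand-in. The names `attend`, `log_soft_mask`, `softness` and `max_radius` are hypothetical, and the attended average stands in for the potential that would feed the growth function.

```python
import math


def matvec(matrix, vec):
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def log_soft_mask(dist, window, softness):
    """log(sigmoid((window - dist) / softness)), computed stably (assumed mask form)."""
    x = (dist - window) / softness
    if x > 0:
        return -x - math.log1p(math.exp(-x))
    return -math.log1p(math.exp(x))


def attend(cells, i, w_q, w_k, window, temperature=1.0, softness=1.0, max_radius=8):
    """Attention for cell i; requires 2 * max_radius + 1 <= len(cells)."""
    n = len(cells)
    q = matvec(w_q, cells[i])  # projections are shared by every cell
    logits, values = [], []
    for d in range(-max_radius, max_radius + 1):
        j = (i + d) % n
        k = matvec(w_k, cells[j])
        score = sum(a * b for a, b in zip(q, k)) / temperature
        logits.append(score + log_soft_mask(abs(d), window, softness))
        values.append(cells[j])
    top = max(logits)  # subtract the max for a stable softmax
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    channels = len(cells[i])
    potential = [sum(w * v[c] for w, v in zip(weights, values)) for c in range(channels)]
    entropy = -sum(w * math.log(w) for w in weights if w > 0.0)
    return potential, weights, entropy


cells = [[0.3, 0.7] for _ in range(32)]
zero = [[0.0, 0.0], [0.0, 0.0]]  # zero projections isolate the effect of the mask
_, _, h_narrow = attend(cells, 0, zero, zero, window=1.0)
_, _, h_wide = attend(cells, 0, zero, zero, window=6.0)
print(h_narrow < h_wide)
```

Because the mask is smooth in `window`, a small mutation of the window size changes the weights gradually, which is what would let selection act on perceptual range. Entropy can also fall at a fixed window when the query-key scores become structured, which is the refinement the first lesson describes.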
The V10 MARL ablation study produced a surprise: all seven conditions show highly significant geometric alignment (, ), and removing forcing functions does not reduce alignment—if anything, it slightly increases it. The predicted hierarchy was wrong: geometric alignment appears to be a baseline property of multi-agent survival systems, not contingent on any specific forcing function. This strengthens the universality claim but challenges the forcing function theory developed in the next section.