The Emergence Experiment Program
Eleven measurement experiments on V13 snapshots, testing whether world modeling, abstraction, communication, counterfactual reasoning, self-modeling, affect structure, perceptual mode, normativity, and social integration emerge in a substrate with zero exposure to human affect concepts. All experiments: 3 seeds, 7 snapshots per seed, 50 recording steps per snapshot.
Experiment 0: Substrate Engineering
Status: Complete. V13 content-based coupling Lenia with lethal resource dynamics. Foundation for all subsequent experiments.
Experiment 1: Emergent Existence
Status: Complete. Patterns persist, maintain boundaries, respond to perturbation. Established by V11-V12, confirmed in V13.
Experiment 2: Emergent World Model
Question: When does a pattern's internal state carry predictive information about the environment beyond current observations?
Method: Prediction gap, the additional variance in the future environment explained by a pattern's internal state over the current observation alone, estimated with Ridge regression under 5-fold cross-validation.
| Seed | Prediction gap (early) | Prediction gap (late) | (late) | % with WM |
|---|---|---|---|---|
| 123 | 0.0004 | 0.0282 | 20.0 | 100% |
| 42 | 0.0002 | 0.0002 | 5.3 | 40% |
| 7 | 0.0010 | 0.0002 | 7.9 | 60% |
Finding: World model signal present but weak. Seed 123 at the bottleneck shows roughly 100x amplification. World models are amplified by bottleneck selection, not gradual evolution. To be clear about magnitude: for most seeds the late prediction gap sits near 0.0002, meaning the internal state predicts the environment barely better than the environment alone. Only seed 123 at maximum bottleneck pressure reaches 0.028, detectable but still small. These patterns are not building substantial world models; they carry a faint trace of environmental predictive information, amplified briefly under extreme selection.
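A minimal sketch of this kind of prediction-gap measurement, assuming hypothetical recordings of internal states and environment observations; the array names, shapes, and Ridge/CV settings here are illustrative assumptions, not the actual pipeline in v13_world_model.py:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical recordings: T steps of a pattern's internal state and its
# local environment observation (names and shapes are assumptions).
T, d_int, d_env = 500, 16, 8
internal = rng.normal(size=(T, d_int))
env = rng.normal(size=(T, d_env))

# Target: a summary of the environment one step ahead (first channel here).
y = env[1:, 0]
X_env = env[:-1]                               # baseline: current observation only
X_both = np.hstack([env[:-1], internal[:-1]])  # baseline plus internal state

def cv_r2(X, y):
    """Mean 5-fold cross-validated R^2 of a Ridge regression."""
    return cross_val_score(Ridge(alpha=1.0), X, y, cv=5, scoring="r2").mean()

# Prediction gap: extra variance in the future environment explained by the
# internal state beyond the current observation. ~0 for this synthetic data.
gap = cv_r2(X_both, y) - cv_r2(X_env, y)
print(f"prediction gap = {gap:.4f}")
```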


Source code
- v13_world_model.py — World model measurement
- v13_world_model_run.py — Runner
- v13_world_model_figures.py — Visualization
Experiment 3: Internal Representation Structure
Question: When do patterns develop low-dimensional, compositional representations?
| Seed | Dimensionality (early to late) | Abstraction | Disentanglement | |
|---|---|---|---|---|
| 123 | 6.6 to 5.6 | 0.90 to 0.92 | 0.27 to 0.38 | 0.20 to 0.12 |
| 42 | 7.3 to 7.5 | 0.89 to 0.89 | 0.23 to 0.23 | 0.23 to 0.25 |
| 7 | 7.7 to 8.8 | 0.89 to 0.87 | 0.24 to 0.22 | 0.20 to 0.27 |
Finding: Compression is cheap; the effective dimensionality is already roughly 7/68 from cycle 0. But quality only improves under bottleneck selection. Note the asymmetry: abstraction is high and stable from the start (0.87-0.92 in every seed), so the system compresses efficiently without effort. But disentanglement remains low: the compressed representations are tangled, not cleanly factored. Disentanglement requires active information-seeking that this substrate lacks.
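One standard estimator for effective dimensionality is the participation ratio of the covariance spectrum. Whether Experiment 3 uses exactly this estimator is an assumption, but it reproduces the reported pattern of a roughly 7-dimensional representation inside a 68-dimensional recording:

```python
import numpy as np

def participation_ratio(X):
    """Effective dimensionality of state recordings X (T x d):
    (sum of covariance eigenvalues)^2 / (sum of squared eigenvalues).
    Equals d for isotropic data and approaches 1 for rank-1 data."""
    lam = np.clip(np.linalg.eigvalsh(np.cov(X, rowvar=False)), 0, None)
    return lam.sum() ** 2 / (lam ** 2).sum()

rng = np.random.default_rng(0)
T, d, k = 500, 68, 7
# Hypothetical internal states: k strong latent factors plus weak noise,
# mimicking cheap compression (low effective dim inside a d-dim recording).
X = rng.normal(size=(T, k)) @ rng.normal(size=(k, d)) + 0.1 * rng.normal(size=(T, d))
print(f"effective dimensionality = {participation_ratio(X):.1f} of {d}")
```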


Source code
- v13_representation.py — Representation analysis
- v13_representation_run.py — Runner
Experiment 4: Emergent Language
Question: When do patterns develop structured, compositional communication?
| Seed | MI significant | MI range | Compositionality significant |
|---|---|---|---|
| 123 | 4/6 | 0.019-0.039 | 0/6 |
| 42 | 7/7 | 0.024-0.030 | 0/7 |
| 7 | 4/7 | 0.023-0.055 | 0/7 |
Finding: Chemical commons, not proto-language. MI is above baseline in 15/20 snapshots, but compositionality is null everywhere: unstructured broadcast, not language.
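A sketch of the MI-above-baseline test, assuming a binned discrete-MI estimator with a permutation null; the channel names are hypothetical, and the estimator in v13_communication.py may differ:

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def binned_mi(x, y, bins=8):
    """Mutual information (nats) between two continuous series via quantile binning."""
    edges = lambda v: np.quantile(v, np.linspace(0, 1, bins + 1)[1:-1])
    return mutual_info_score(np.digitize(x, edges(x)), np.digitize(y, edges(y)))

rng = np.random.default_rng(0)
sender_state = rng.normal(size=2000)                 # hypothetical emitter state
signal = 0.3 * sender_state + rng.normal(size=2000)  # weakly coupled chemical signal

mi = binned_mi(sender_state, signal)
# Permutation baseline: destroy the pairing, keep the marginals.
null = np.array([binned_mi(rng.permutation(sender_state), signal) for _ in range(200)])
p = (np.sum(null >= mi) + 1) / (len(null) + 1)
print(f"MI = {mi:.3f}, permutation p = {p:.3f}")
```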
Source code
- v13_communication.py — Communication analysis
- v13_communication_run.py — Runner
Experiment 5: Counterfactual Detachment
Question: When do patterns decouple from external driving and run offline world model rollouts?
Result: Null. Detachment is at ceiling from cycle 0; patterns are inherently internally driven. The FFT convolution kernel integrates over the full grid, so there is no reactive-to-autonomous transition: the starting point is already autonomous.
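A toy illustration of the frozen-input comparison behind a detachment measurement, under assumed dynamics (a small tanh recurrence, not V13's Lenia update): run the same initial state forward with the real environment stream and with the environment clamped, then measure how far the trajectories diverge.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 16, 200
w_self = rng.normal(scale=0.5, size=(d, d))  # assumed recurrent weights
w_env = rng.normal(scale=0.5, size=(d, d))   # assumed environment coupling
env = rng.normal(size=(T, d))

def step(state, env_input):
    """Toy update: next state from current state plus environment drive."""
    return np.tanh(state @ w_self + env_input @ w_env)

s0 = rng.normal(size=d)
s_real, s_frozen = s0.copy(), s0.copy()
divergence = []
for t in range(T):
    s_real = step(s_real, env[t])      # factual environment stream
    s_frozen = step(s_frozen, env[0])  # counterfactual: input clamped at t=0
    divergence.append(np.linalg.norm(s_real - s_frozen))

# Near-zero divergence means the trajectory ignores the environment stream,
# i.e. the dynamics are internally driven. This toy network is strongly
# driven, so it diverges; V13 reportedly sits at the detached extreme.
print(f"mean divergence under frozen input: {np.mean(divergence):.3f}")
```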
Source code
- v13_counterfactual.py — Counterfactual measurement
- v13_counterfactual_run.py — Runner
Experiment 6: Self-Model Emergence
Question: When does a pattern predict itself better than an external observer can?
Result: Weak signal at the bottleneck only. The self-prediction advantage is indistinguishable from zero almost everywhere; it appears once, in a single pattern at seed 123, cycle 20.
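A sketch of the self-prediction comparison, with hypothetical "privileged" internal features versus "observer" features; the Ridge/CV choices mirror the world-model sketch above and are assumptions:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
T = 500
internal = rng.normal(size=(T, 16))  # the pattern's own state (privileged access)
observed = rng.normal(size=(T, 16))  # what an external observer sees of the pattern
y = internal[1:, 0]                  # the pattern's own next state (first channel)

def cv_r2(X):
    """Mean 5-fold cross-validated R^2 for predicting the pattern's next state."""
    return cross_val_score(Ridge(alpha=1.0), X[:-1], y, cv=5, scoring="r2").mean()

# A positive advantage would mean the pattern predicts itself better than an
# external observer can, i.e. evidence for a self-model. ~0 on synthetic data.
print(f"self-prediction advantage = {cv_r2(internal) - cv_r2(observed):.4f}")
```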
Source code
- v13_self_model.py — Self-model measurement
- v13_self_model_run.py — Runner
Experiment 7: Affect Geometry Verification
Question: Does the geometric affect structure predicted by the thesis actually appear?
Method: RSA between structural affect (Space A) and behavioral affect (Space C).
| Seed | RSA r range | Significant | Trend |
|---|---|---|---|
| 123 | -0.09 to 0.72 | 2/5 | Strong at low pop |
| 42 | -0.17 to 0.39 | 4/7 | Mixed |
| 7 | 0.01 to 0.38 | 5/7 | Increasing (0.01 to 0.24) |
Finding: Structure-behavior alignment in 8/19 snapshots. Seed 7 shows an evolutionary trend. A-B alignment is null: structure maps to behavior but not to communication.
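A sketch of the Space A / Space C RSA: build a representational dissimilarity matrix per space over the same patterns, Spearman-correlate the two, and permutation-test. The feature counts and the linear A-to-C coupling in the synthetic data are assumptions:

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n = 30                             # hypothetical patterns in one snapshot
space_a = rng.normal(size=(n, 5))  # structural affect features (Space A)
space_c = space_a @ rng.normal(size=(5, 4)) + 0.5 * rng.normal(size=(n, 4))  # behavior (Space C)

# Condensed RDMs: pairwise dissimilarity between patterns within each space.
rdm_a, rdm_c = pdist(space_a), pdist(space_c)
rho = spearmanr(rdm_a, rdm_c)[0]

# Permutation test: relabel patterns in Space A, keep Space C fixed.
null = np.array([spearmanr(pdist(space_a[rng.permutation(n)]), rdm_c)[0]
                 for _ in range(500)])
p = (np.sum(null >= rho) + 1) / (len(null) + 1)
print(f"RSA rho = {rho:.2f}, p = {p:.3f}")
```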
Source code
- v13_affect_geometry.py — RSA computation
- v13_affect_geometry_run.py — Runner
Experiment 8: Perceptual Mode and Computational Animism
Question: Do patterns develop modulable perceptual coupling?
| Metric | Seed 123 | Seed 42 | Seed 7 |
|---|---|---|---|
| ι (mean) | 0.27-0.44 | 0.27-0.41 | 0.31-0.35 |
| ι trajectory | 0.32 to 0.29 | 0.41 to 0.27 | 0.31 to 0.32 |
| Animism score | 1.28-2.10 | 1.60-2.16 | 1.10-2.02 |
Finding: Confirmed. The default mode is participatory: mean ι stays between 0.27 and 0.44 in every seed. The animism score is > 1.0 in all 20 snapshots. Patterns model resources using the same dynamics they use to model other agents. Computational animism is the default because reusing the agent-model template is the cheapest compression.
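The animism score's exact definition is not given here; one plausible operationalization, purely an assumption, is how much better an agent-trained one-step predictor (the "agent-model template") forecasts resource dynamics than a naive persistence baseline does:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
T = 400
agents = rng.normal(size=(T, 8))     # hypothetical observed agent dynamics
resources = rng.normal(size=(T, 8))  # hypothetical observed resource dynamics

def transfer_mse(train, test):
    """Fit a one-step linear predictor on `train`, score it on `test`."""
    model = Ridge(alpha=1.0).fit(train[:-1], train[1:])
    return float(np.mean((model.predict(test[:-1]) - test[1:]) ** 2))

template_mse = transfer_mse(agents, resources)  # agent template applied to resources
persistence_mse = float(np.mean((resources[1:] - resources[:-1]) ** 2))

# Score > 1: modeling resources "as agents" beats treating them as inert.
print(f"animism score (assumed form) ~ {persistence_mse / template_mse:.2f}")
```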
Source code
- v13_iota.py — Inhibition coefficient measurement
- v13_iota_run.py — Runner
Experiment 9: Proto-Normativity
Question: Does the viability gradient generate structural normativity?
Result: Null. No asymmetry between cooperative and competitive contexts. But integration is higher in social than in solo contexts (~4.9 vs ~3.1): social context increases integration regardless of interaction type. Normativity requires agency, the capacity to act otherwise.
Source code
- v13_normativity.py — Normativity measurement
- v13_normativity_run.py — Runner
Experiment 10: Social-Scale Integration
Question: Does group-level integration exceed the summed integration of the individual patterns?
Finding: No superorganism. The group-to-parts ratio is 1-12%, but it is increasing. Seed 7: 6.1% to 12.3% over evolution. Moving toward the threshold but not reaching it.
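A cheap integration proxy under a Gaussian assumption is total correlation; it is not IIT's Φ, and whether v13_social_phi.py uses it is an assumption. The group-versus-parts ratio then falls out of the exact decomposition of group total correlation into within-pattern and between-pattern terms:

```python
import numpy as np

def total_correlation(X):
    """Gaussian total correlation (nats) of channels in X (T x d):
    0 for independent channels, larger as they become interdependent."""
    cov = np.cov(X, rowvar=False)
    return 0.5 * (np.sum(np.log(np.diag(cov))) - np.linalg.slogdet(cov)[1])

rng = np.random.default_rng(0)
T, n_patterns, d = 1000, 5, 4
shared = rng.normal(size=(T, 1))  # weak common driver coupling the patterns
parts = [0.3 * shared + rng.normal(size=(T, d)) for _ in range(n_patterns)]

tc_within = sum(total_correlation(p) for p in parts)  # integration inside each pattern
tc_group = total_correlation(np.hstack(parts))        # integration of the whole group
# Between-pattern share: the group-level excess over the summed parts.
ratio = (tc_group - tc_within) / tc_group
print(f"group-level share of integration = {100 * ratio:.1f}%")
```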
Source code
- v13_social_phi.py — Group integration measurement
- v13_social_phi_run.py — Runner
Experiment 11: Entanglement Analysis
Question: Are world models, abstraction, communication, detachment, and self-modeling separable or entangled?
Finding: Four clusters — but not the ones predicted. Most measures cluster into one large group driven by population-mediated selection. Overall entanglement increases (mean |r| from 0.68 to 0.91). Everything becomes more correlated, just not in the clusters the theory expected.
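A sketch of the entanglement analysis, assuming the measures are collected per snapshot into a matrix and then clustered by correlation distance; the measure names and the shared population driver in the synthetic data are assumptions:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
measures = ["world_model", "abstraction", "communication",
            "detachment", "self_model", "affect_rsa"]
# Hypothetical per-snapshot measure values, all loaded on one shared
# population-mediated driver (producing one large correlated cluster).
pop = rng.normal(size=50)
vals = np.column_stack([0.7 * pop + 0.3 * rng.normal(size=50) for _ in measures])

r = np.corrcoef(vals, rowvar=False)
mean_abs_r = np.mean(np.abs(r[np.triu_indices_from(r, k=1)]))
print(f"mean |r| = {mean_abs_r:.2f}")

# Hierarchical clustering on correlation distance 1 - |r|, cut at 4 clusters.
condensed = squareform(1 - np.abs(r), checks=False)
labels = fcluster(linkage(condensed, method="average"), t=4, criterion="maxclust")
for name, c in sorted(zip(measures, labels), key=lambda x: x[1]):
    print(f"cluster {c}: {name}")
```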
Source code
- v13_entanglement.py — Entanglement analysis
- v13_entanglement_run.py — Runner
Experiment 12: Identity Thesis Capstone
Question: Does the full program hold in a system with zero human contamination?
| Criterion | Status | Strength |
|---|---|---|
| World models | Met | Weak (strong at bottleneck) |
| Self-models | Met | Weak (n=1 event) |
| Communication | Met | Moderate (15/21 sig) |
| Affect dimensions | Met | Strong (84/84) |
| Affect geometry | Met | Moderate (9/19 sig) |
| Tripartite alignment | Met | Partial (A-C pos, A-B null) |
| Perturbation response | Met | Moderate (robustness 0.923) |
Verdict: All seven criteria met, most at moderate/weak strength. Geometry confirmed; dynamics undertested, blocked by the coupling wall.
Source code
- v13_capstone.py — Capstone integration
- v13_capstone_run.py — Runner