Animism as Computational Default
A self-modeling system maintains a world model $W$ and a self-model $S$. The self-model has interiority—it is not merely a third-person description of the agent’s body and behavior but includes the intrinsic perspective: what-it-is-like states, valence, anticipation, dread. The system knows from the inside what it is to be an agent.
Now it encounters another entity $X$ in its environment. $X$ moves, reacts, persists, avoids dissolution. The system must model $X$ to predict $X$’s behavior. The cheapest computational strategy—by a wide margin—is to model $X$ using the same architecture the system already has for modeling itself. The information-theoretic argument: the self-model $S$ already exists (sunk cost). Using it as a template for $X$ requires learning only a projection function $\pi$, whose description length $L(\pi)$ is the cost of mapping observations of $X$ onto the existing self-model architecture. Building a de novo model $M_X$ of $X$ from scratch requires learning the full parameter set of $M_X$ from observations alone. Under compression pressure—which is always present for a bounded system—the template strategy wins whenever the self-model captures any variance in $X$’s behavior. And for any entity that moves autonomously, reacts to stimuli, or persists through active maintenance, the self-model will capture substantial variance, because these are precisely the features the self-model was built to represent. The efficiency gap widens under data scarcity: on brief encounter with a novel entity, the from-scratch model cannot converge, but the template model produces usable predictions immediately.
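The description-length comparison can be made concrete with a toy two-part MDL code. All numbers below are illustrative assumptions (state dimension, residual errors), not results from the text; the point is only where the crossover falls under data scarcity.

```python
import numpy as np

def two_part_mdl(k, n, mse):
    """Two-part code length (nats): k/2 * ln(n) to encode k parameters,
    plus n/2 * ln(mse) for the Gaussian residuals on n observations."""
    return 0.5 * k * np.log(n) + 0.5 * n * np.log(mse)

d = 32                 # assumed self-model state dimension (illustrative)
k_template = d         # projection pi: one mapping weight per dimension
k_scratch = d * d      # de novo dynamics matrix for the from-scratch model

# Assume the reused template fits somewhat worse (mse 0.25) than a fully
# converged from-scratch model eventually would (mse 0.10).
for n in (10, 100, 10_000):
    template_wins = two_part_mdl(k_template, n, 0.25) < two_part_mdl(k_scratch, n, 0.10)
    print(f"n={n:>6}: {'template' if template_wins else 'scratch'} wins")
```

With these assumptions the template strategy wins at n = 10 and n = 100 and only loses once observations are abundant, matching the data-scarcity claim above.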
A perceptual mode is participatory when the system’s model of perceived entities inherits structural features from the self-model $S$:
The self-model informs the world model. The system perceives $X$ as having something like interiority because the representational substrate for modeling $X$ is the same substrate that carries the system’s own interiority.
This is not merely one strategy among many—it is the computationally cheapest. For a self-modeling system under compression pressure, modeling novel entities by analogy to self is the minimum-description-length strategy whenever the entity’s behavior is partially predictable by agent-like models. Under broad priors over environments containing other agents, predators, and autonomous objects, the participatory prior is the MAP estimate.
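Stated as an inequality (a sketch: $L(\cdot)$ is description length, $D$ the observations of the entity, $\pi$ the learned projection onto the self-model $S$, and $M_X$ a from-scratch model), the MDL claim is that the template strategy is preferred whenever

```latex
L(\pi) + L(D \mid S \circ \pi) \;<\; L(M_X) + L(D \mid M_X).
```

Since $L(\pi) \ll L(M_X)$, the left side is smaller whenever the self-model explains enough of the entity’s behavior that the residual term $L(D \mid S \circ \pi)$ does not dominate.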
This is why animistic perception is cross-culturally universal and developmentally early. It is not a cultural invention but a computational inevitability for systems that (a) model themselves and (b) must model other things cheaply. Children have lower inhibition of this default than adults—not because children are confused but because the suppression is learned.
The computational animism test. Train RL agents in a multi-entity environment with two conditions: (a) agents with a self-prediction module (self-model), and (b) matched agents without one. Then introduce novel moving objects whose trajectories are partially predictable but non-agentive (e.g., bouncing balls with momentum). Measure: (1) Do self-modeling agents’ internal representations of these objects contain more goal/agency features (extracted via probes trained on actual agents vs. objects)? (2) Does the effect scale with self-model richness (size of self-prediction module) and compression pressure (information bottleneck $\beta$)? (3) Do self-modeling agents under higher compression pressure (larger $\beta$) show more animistic attribution, because reusing the self-model template saves more bits? The compression argument predicts yes to all three. The control condition (no self-model) predicts no agency attribution beyond chance. If self-modeling agents attribute agency to non-agents in proportion to compression pressure, the “animism as computational default” hypothesis is supported.
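Measurement (1) can be sketched as a linear probe. Everything below is synthetic stand-in data—the representation vectors, dimensions, and class separations are assumptions for illustration; in the real experiment the vectors would be read out of the trained agents’ internal states.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d = 16  # assumed representation dimensionality (illustrative)

# Hypothetical internal representations: true agents vs. inert objects.
agents  = rng.normal(loc=1.0, size=(200, d))
objects = rng.normal(loc=-1.0, size=(200, d))

# (1) Train a linear probe to separate agent reps from object reps.
X = np.vstack([agents, objects])
y = np.array([1] * 200 + [0] * 200)
probe = LogisticRegression(max_iter=1000).fit(X, y)

# (2) Score novel non-agentive movers (bouncing balls): mean probe
# probability of "agent". Chance level here is the 0.5 base rate, so a
# mean above 0.5 counts as agency over-attribution.
novel = rng.normal(loc=0.4, size=(100, d))  # assumed: partway toward agent statistics
animism_score = probe.predict_proba(novel)[:, 1].mean()
print(f"animism score: {animism_score:.2f}")
```

The experiment’s prediction is then a dose–response curve: animism score rising with self-model size and with the bottleneck weight, and staying at chance in the no-self-model control.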
Status: Confirmed. This experiment has since been run on uncontaminated Lenia substrates (see Experiment 8, Appendix). Animism score exceeded 1.0 in all 20 testable snapshots across all three seeds — patterns consistently model resources using the same internal-state dynamics they use to model other agents. Mean ι ≈ 0.30 as default across all snapshots, and ι decreases over evolutionary time (seed 42: 0.41 to 0.27). Selection consistently favors more participatory perception, not less. The mechanistic default predicted by high-compression-pressure environments was not found; the participatory default was.
Participatory perception has five structural features, each with a precise characterization:
- No sharp self/world partition. The mutual information between self-model and world-model is high: $I(S; W) \gg 0$. Perception and projection are entangled rather than modular.
- Hot agency detection. The prior favoring agency attribution is strong. Over-attributing agency is cheaper than under-attributing it: false positives (treating a rock as agentive) are cheap; false negatives (failing to model a predator’s intentions) are lethal.
- Tight affect-perception coupling. Seeing something is simultaneously feeling something about it. The affective response is constitutive of the percept itself, not a secondary evaluation.
- Narrative-causal fusion. “Why did this happen?” and “What story is this?” are the same question. Causal models are teleological by default: they model what things are for rather than merely what things do.
- Agency at scale. Large-scale events—weather, disease, fortune—are attributed to agents with purposes. This is hot agency detection applied beyond the individual scale, and it is the perceptual ground from which theistic reasoning naturally grows.
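The asymmetry behind hot agency detection drops out of a one-line Bayes decision rule. The cost numbers below are hypothetical; the structural point is that a large false-negative cost drives the action threshold toward zero.

```python
def agency_threshold(cost_fp, cost_fn):
    """Posterior probability above which 'treat as agent' minimizes expected
    cost. Acting costs cost_fp * (1 - p); not acting costs cost_fn * p; so
    act when p > cost_fp / (cost_fp + cost_fn)."""
    return cost_fp / (cost_fp + cost_fn)

# Symmetric costs: act only when agency is more likely than not.
print(agency_threshold(cost_fp=1.0, cost_fn=1.0))     # 0.5
# Near-lethal false negatives (unmodeled predator): hair-trigger threshold.
print(agency_threshold(cost_fp=1.0, cost_fn=1000.0))  # ~0.001
```

A perceiver tuned this way fires “agent” on the faintest evidence, which is exactly the over-attribution pattern the bullet describes.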