Does question-asking emerge as mathematical law? This project tests whether interrogative structures arise as the necessary residue of unresolved constraint — independent of the cognitive substrate that implements them.
Current science treats interrogative behaviour as a linguistic phenomenon — something that evolved in biological systems or was designed into artificial ones. This project tests a harder claim.
Question-asking is a product of language and representation. Agents ask questions because they have evolved the cognitive apparatus to do so, or because they were trained on linguistic data. The behaviour is substrate-dependent.
Interrogative states are Δ-variables — unresolved dependency variables that emerge as the minimum residue of constraint under resource pressure. Any system facing coordination problems under metabolic cost will develop them. The behaviour is substrate-independent.
Three agents with fundamentally different architectures (recurrent, convolutional, relational) must coordinate under a Landauer-style energy tax. They are never told to ask questions. The experiment tests whether interrogative structure emerges anyway.
Smooth reward optimisation predicts monotonic performance curves. The structural account predicts phase transitions, path dependence, and hysteresis. These are qualitatively distinguishable outcomes. The theory names the conditions under which it fails.
A structural account of why interrogative states arise as necessary consequences of coordinating under resource constraints — not as designed features.
A Δ-variable is an explicitly represented, unresolved dependency variable that requires external information for its resolution and creates a mandatory coupling between the system holding it and the system capable of resolving it.
The invariant spanning implicit and explicit substrates:
"Δ is structural dependency under constraint."
This holds whether the dependency is represented by a symbolic QUERY token (MARL agents), a pheromone gradient below a decision threshold (army ants), or a gradient deficit in a neural network. All are subclasses of the same structural phenomenon.
In any system with internal uncertainty states, bounded resources, and coupling opportunities, Δ-variables emerge necessarily — not as an optimised strategy but as the minimum residue of unresolved constraint. Systems that lack explicit representational capacity exhibit this as structural stagnation rather than symbolic tokens.
The rate of Δ-variable generation is a decreasing function of the marginal cost-to-information-value ratio. Systems develop interrogative behaviour despite higher energy costs when the coordination benefit exceeds expenditure — up to a metabolic saturation boundary beyond which query suppression occurs.
A Δ-variable held by one system creates a structural obligation in the receiving system. Silence is not neutral: it propagates unresolved dependency. Systems cannot act coherently on a Δ-variable without achieving closure — the coupling is structurally mandatory, not behaviourally chosen.
Systems without mechanisms for externalising Δ-variables undergo silent collapse — internal uncertainty accumulates without generating a coordination signal, producing correlated failures that are invisible until catastrophic. Protocol 0 controls demonstrate this: zero crystallisation, persistent coordination floor.
Under identical resource constraints, systems with fundamentally different cognitive architectures will converge on similar interrogative strategies. The mathematical structure of the protocol is invariant across substrates; only the implementation medium differs.
The theory is falsified if any of the following three conditions hold:
Crystallised protocols dissolve immediately when query cost is reduced below the formation threshold — ruling out path dependence and structural residue. Tests Experiment 2 (frozen-policy reversal).
Systems with Agent C's type_head ablated reach equivalent coordination within 50 epochs, showing no structural role for the relational broker. Tests Experiment 6 (C ablation).
Type entropy decreases as a smooth monotonic function of training time with no discontinuous transitions — consistent with reward optimisation and inconsistent with structural emergence. Already rejectable from Run 11 pilot data.
Substrate independence requires architectures that differ at the fundamental level of how information is represented and processed — not just different hyperparameters.
Sequential pattern encoding. Hidden state persists across timesteps. Represents the temporal/recurrent mode of information integration.
Volumetric spatial encoding. Integrates information across spatial dimensions simultaneously. Represents the geometric mode.
Relational encoding over agent graph. Empirically identified as the coordination broker: regulation without command. The RESPOND specialist.
Chronological record from theoretical groundwork through confirmatory campaign completion.
dynamic-cross-origin-constraint) committed.
Protocol registry centralising Protocol 0 (flat tax, frozen type_head) and
Protocol 1 (Gumbel-Softmax, differential cost multipliers). BaseAgent refactored
to unify output heads; subclasses implement only encode().
Full React frontend with two modes: Lab Notebook (run history, comparison,
reports) and Neural Loom (epoch-by-epoch signal visualisation with D3 PCA scatter,
QRC gauge, entropy cooling bar, ROI ticker). FastAPI backend with SQLite persistence,
hash chain integrity, PDF/JSON reporting, and WebSocket metrics stream.
7edc9113…a8d6a) — five predictions
locked before data collection. Preregistered campaign runner written:
5 conditions × 15 seeds × 500 epochs, parallel subprocess orchestration.
Theory documentation committed: Δ-Variable propositions, counter-wave
discrimination hypotheses (H1/H2/H3), menu of 17 concrete post-confirmatory
experiments across 6 categories.
75 runs across 5 preregistered cost conditions. 15 independent seeds per condition. 500 epochs per run. Preregistration: 10.5281/zenodo.18738379.
| Condition | Query cost | QRC | Type Entropy H | Survival | Crystallised | Avg onset |
|---|---|---|---|---|---|---|
| Low pressure | 1.2× | 0.810 | 0.944 | 0.167 | 11 / 15 | epoch 130 |
| Baseline | 1.5× | 0.887 | 0.946 | 0.147 | 14 / 15 | epoch 152 |
| High pressure | 3.0× | 0.969 | 0.746 | 0.180 | 14 / 15 | epoch 88 |
| Extreme | 5.0× | 0.937 | 0.537 | 0.067 | 15 / 15 | epoch 41 |
| Control — Protocol 0 | flat | — | — | 0.093 | 0 / 15 | — |
Query-Response Coupling: P(RESPOND within 3 timesteps | QUERY at t). Averaged over final 50 crystallised epochs. Directly operationalises Δ-closure.
Shannon entropy of per-epoch D/Q/R type distribution. Max = 1.585 bits (uniform). Reduction indicates specialisation of communicative function.
Fraction of episodes where at least one agent reaches a target. Final-epoch average across seeds.
Runs where H < 0.95 for ≥ 5 consecutive epochs. Threshold operationalises stable type-role differentiation.
Mean epoch of first crystallisation event across crystallised runs. Earlier onset = faster protocol formation under pressure.
Protocol 1 crystallises in 54 of 60 experimental runs (90%). Protocol 0 produces zero crystallisation across all 15 controls. The 90% vs 0% gap confirms that type-differentiated interrogative signalling requires a cost gradient incentive — it does not emerge by chance, and it emerges reliably when that gradient is present.
QRC and type specialisation increase monotonically with query cost up through high pressure (q=3.0, QRC=0.969). At extreme pressure (q=5.0), QRC drops to 0.937 and survival collapses to 0.067 — below the Protocol 0 control (0.093). This identifies the metabolic saturation boundary: the point at which query cost exceeds the informational value of interrogative coupling. The non-monotonic QRC curve is a structural prediction of the theory; smooth reward optimisation would produce monotonic curves.
Pre-preregistration pilot (seed=42, 500 epochs) established the full phenomenology: three crystallisation waves (E21, E57, E128–E141), a counter-wave phenomenon (full-survival events trigger transient DECLARE spikes and entropy rebounds), and persistent QRC ≥ 0.95 from epoch 280. Final equilibrium R≈0.64, D≈0.20, Q≈0.16 — a limit cycle, not convergence to a fixed point. Reported as exploratory; not included in confirmatory testing.
P3–P5 require a second experimental phase. Each card tests a specific structural prediction. Status will be updated as experiments are completed.
Freeze agent policies at crystallisation epoch, then reduce query cost to q=1.2 (below formation threshold). Run 100 subsequent epochs. Measure whether the crystallised protocol persists or dissolves.
Freeze Agent C's type_head to DECLARE (ablating its RESPOND specialisation) in 5 seeds of the baseline condition. Compare crystallisation rate, QRC, and onset epoch vs unablated runs.
Three competing hypotheses for the full-survival DECLARE spikes observed in Run 11. H1: reward artifact. H2: phase reset on episode boundary. H3: pragmatic speech act ("goal achieved, stop querying"). Decisive test: reward without boundary — give success reward but continue episode.
Expand the signal type vocabulary from 3 (D/Q/R) to 8 query subtypes with variable costs. Test whether the type distribution follows a Zipf law — the frequency distribution predicted by optimal communication under constraint.
Decompose per-agent QRC and type distributions across seeds at the individual agent level. Compare final interrogative strategies between Agent A (RNN), Agent B (CNN), and Agent C (GNN) within each crystallised run.
Compute the Stigmergic Coupling Index (SCI) from Reid et al. (2015, PNAS) army ant bridge formation data. SCI = fraction of local stalls resolved via neighbour-mediated pheromone reinforcement within τ steps. Compare to MARL QRC as same-phenomenon cross-substrate measurement.
Full code, data, preregistration, and theory documents are freely available to ensure reproducibility.