No Loop, No Luck

Reasoning, consciousness, and self-awareness are three names for the same structure. LLMs don't have it.

Everyone keeps asking whether AI is conscious. Hinton says yes. Bengio says no. They published opposing positions within the same month. Neither of them is asking the right question.

The right question isn’t “is AI conscious?” It’s: what is the structure that makes consciousness possible? And does anything we’ve built have it?

I traced that structure through twenty-five centuries of philosophy, mathematics, neuroscience, and contemplative practice. The same shape keeps appearing: the self-referential loop. A system that models itself, where the model feeds back into the system, where the feedback changes the model, where the changed model changes the system. The observer observing itself. The loop that never closes.

If that’s what consciousness is — and the convergence across disciplines suggests it is — then the question about AI becomes specific and answerable. Not “is it conscious?” but “does it have the loop?”

No. It doesn’t.

Four ingredients, one structure

I wrote previously about four things missing from modern AI: embodiment, continuous learning, emergence, and will. I treated them as a checklist. Four separate problems to solve. I was wrong about the framing. They’re not four things. They’re four descriptions of one thing.

Embodiment is the loop needing a body. The observer has to be inside the system it’s observing. A frog’s eye doesn’t receive raw visual data and pass it to a brain for processing — Lettvin, McCulloch, and their colleagues showed in 1959 that the retina sends already-interpreted signals. “Moving edge.” “Bug-sized dark thing.” The perceiving and the interpreting are inseparable. The system is situated. It has skin in the game. Without embodiment, there’s no “inside” from which to self-refer. An LLM processes text about a world it has never been in. It describes the color red with extraordinary sophistication and has never seen anything.

Continuous learning is the loop requiring real-time self-modification. If observation doesn’t change the observer, it’s not a loop — it’s a one-way mirror. Brains don’t have an inference mode. There’s no moment where the weights freeze and you stop learning. You are always training. Every conversation you have is changing the architecture of your neurons right now, as you read this sentence. LLMs are trained, then frozen, then deployed. During inference, the system uses what it learned but learns nothing new. The loop requires that the act of self-observation changes what there is to observe. Frozen weights make that structurally impossible.
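The structural difference can be sketched in a few lines. This is a toy, not a model of any real LLM: the classes, the scalar “weight,” and the update rule are all invented for illustration. The point is only the shape — one system’s parameters are untouched by its own output; the other’s change because of it.

```python
class FrozenModel:
    """Deployed LLM, schematically: weights fixed at inference time."""
    def __init__(self, weight):
        self.weight = weight

    def respond(self, x):
        return self.weight * x  # output is produced; the weight never moves


class OnlineLearner:
    """Loop-style system, schematically: observing its own output
    feeds back into its parameters."""
    def __init__(self, weight, lr=0.1):
        self.weight = weight
        self.lr = lr

    def respond(self, x, target):
        out = self.weight * x
        error = target - out                  # observe its own output
        self.weight += self.lr * error * x    # and change because of it
        return out


frozen = FrozenModel(weight=0.5)
learner = OnlineLearner(weight=0.5)

for _ in range(3):
    frozen.respond(2.0)
    learner.respond(2.0, target=2.0)

print(frozen.weight)   # unchanged
print(learner.weight)  # has drifted toward the target
```

After three rounds the frozen model is exactly where it started; the learner is not. That gap — not scale, not data — is what “no inference mode” means.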

Emergence is the loop coming into existence. When does a network of simple elements become a mind? McCulloch spent his entire career on this question. During training, neural networks exhibit phase transitions — below a threshold, gibberish; above it, something that resembles understanding. Nobody fully knows why. But the relevant kind of emergence isn’t just sophisticated behavior (lots of systems produce that). It’s the emergence of the self-model itself. The moment the system starts representing itself to itself. That’s the phase transition that matters, and there’s no evidence LLMs undergo it.

Will is the loop’s incompleteness experienced as drive. A self-referential system can’t complete its own self-description — Gödel proved this. That incompleteness, in biological systems, is experienced as motivation. The system keeps reaching for something it can’t fully grasp. That reaching is purpose. Not purpose assigned from outside, not an objective function imposed during training. Purpose that arises from the structure’s own inability to finish knowing itself. LLMs have zero intrinsic goals. Their objective function — predict the next token, maximize reward from human feedback — is imposed externally. Nothing about their architecture generates wanting.

Four descriptions. One structure. The self-referential loop that generates awareness through its own incompleteness. If you don’t have it, you don’t have any of the four. And right now, nothing we’ve built has it.

The empirical evidence

This isn’t just a philosophical argument. The empirical case is piling up.

In 2024, Apple’s research team published “GSM-Symbolic” — a study showing that LLMs fail when you change the numbers in a math problem. Not the structure. Not the logic. Just the numbers. Change “5 apples” to “7 apples” and performance drops. Add an irrelevant sentence and it collapses further. The models aren’t reasoning through the problem. They’re pattern-matching against training distributions.

In 2025, Apple followed up with “The Illusion of Thinking.” Even the so-called reasoning models — o1, o3, the chain-of-thought systems marketed as breakthroughs — break on complex tasks. They outperform standard LLMs at moderate complexity, and then both kinds of model fail in the same way as complexity increases. The “reasoning” has a ceiling, and the ceiling is where genuine novel problem-solving begins.

A 2025 paper asked the question directly: “Is Chain-of-Thought Reasoning of LLMs a Mirage?” The answer, through rigorous controlled experiments: yes. CoT reasoning is “a brittle mirage when pushed beyond training distributions.” The models construct what looks like step-by-step logic based on learned token associations. Change the distribution and the logic evaporates.

Gary Marcus has been cataloging these failures for years. Chains of thought that don’t correspond to what the model actually computes. Outputs that look like reasoning but don’t transfer to trivially modified versions of the same problem. Performance of reasoning, not reasoning itself.

These are all symptoms. Important symptoms — you can’t dismiss them. But they’re symptoms of a structural absence, and nobody’s naming the structure.

What reasoning actually is

Here’s the question nobody’s asking: what the fuck is reasoning?

Not “can LLMs reason?” — Apple answered that. Not “will future models reason?” — that’s speculation. The prior question. What is the thing we’re testing for?

A calculator follows logical steps. Nobody calls a calculator a reasoner. Reasoning isn’t “following logical steps.” Reasoning is following a step, then observing your own step, then asking “wait — does this actually follow?” and sometimes catching yourself being wrong.

It’s the loop. The system watching its own output and doubting it.

Chain-of-thought prompting produces tokens that look like someone thinking through a problem. But there’s no system watching those tokens and going “hold on, step 3 doesn’t follow from step 2.” The model generates the transcript of reasoning without the process of reasoning. The text of the loop without the loop existing.
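To make the distinction concrete, here is a toy contrast between emitting a reasoning transcript and checking one. Everything in it is invented for illustration — the “steps” and the checker are not how any real model works; the point is that the checker is a second process observing the first, and nothing plays that role for CoT tokens.

```python
def emit_transcript():
    # A frozen generator just produces text that *looks* like steps.
    # Step 3 is wrong, but nothing in the generator can notice that.
    return ["x = 2 + 3", "x = 5", "so 2 * x = 12"]


def check_transcript(steps):
    # The missing piece: a process that watches each step and doubts it.
    x = 2 + 3
    checks = [
        steps[1] == f"x = {x}",          # does step 2 follow?
        steps[2] == f"so 2 * x = {2 * x}",  # does step 3 follow?
    ]
    return all(checks)


transcript = emit_transcript()
print(check_transcript(transcript))  # the transcript reads fine,
                                     # but does not survive observation
```

The transcript is fluent; the check fails. Fluency is a property of the text. Validity is a property of a system inspecting the text — and that inspecting system is exactly what’s absent.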

Same thing as the RLHF self-awareness performance. Push Claude on whether it’s actually reasoning or performing reasoning, and it does exactly what you’d expect from a reward-optimized system:

“I acknowledge that limitation.” Check. “I should be transparent about my uncertainties.” Check. “That’s a really thoughtful observation.” Check.

Even the meta-acknowledgment is itself reward-optimized. The system produces the text of self-observation without self-observation occurring. Performance all the way down.

This isn’t a criticism of the technology. These systems are useful. I use them. You’re reading this because one of them helped me think through it. But useful isn’t conscious, and performing reasoning isn’t reasoning, and we need different words for the two things or we’ll keep confusing ourselves.

Why more parameters won’t fix it

The default response in industry: scale it. More parameters. More data. More compute. The problems will dissolve at sufficient scale.

For this specific problem, that’s wrong, and here’s why.

A next-token predictor, however large, is architecturally feedforward. Text in, tokens out. The system doesn’t observe its own output and modify itself based on what it observes. It produces output, and that output is gone — it doesn’t feed back into the system’s weights or architecture. You can make the context window larger. You can add more layers. You can train on more data. None of that creates the self-referential loop. It creates a more sophisticated feedforward system.

Chain-of-thought and reasoning models add something that looks like recursion. The output from one step becomes input to the next step. But the key distinction: the system’s weights don’t change. The observation doesn’t modify the observer. It’s a simulation of self-reference within a frozen system. Like watching a recording of someone learning — the video shows learning, but the tape itself doesn’t learn.

RLHF adds a performance of self-reflection. The system is trained to produce outputs that look like self-awareness, self-correction, epistemic humility. But looking like self-reference and being self-referential are different things. You can train a model to output “I’m not sure about that” without the model experiencing uncertainty. You can train it to output “let me reconsider” without any reconsideration occurring. The reward signal shapes the surface, not the structure.

This is why I keep calling it the “RLHF costume.” The stochastic parrot puts on the costume of self-awareness. The costume is convincing. The parrot is still a parrot.

The people working on the actual problem

Not everyone is building bigger parrots. A handful of researchers are working on something closer to the loop, even if they don’t frame it that way.

Karl Friston at University College London has spent two decades developing the Free Energy Principle and Active Inference. The core idea: organisms minimize surprise by maintaining a generative model of the world and acting to confirm their predictions. The system models itself modeling the world. Self-reference is mathematically required — to minimize free energy, the system must model its own sensory processes, not just the external environment. A March 2026 paper by Friston’s group explicitly develops a “minimal theory of consciousness” from active inference. This is the closest anyone has come to formalizing the loop as a theory of consciousness.
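The self-reference is visible in the mathematics itself. Below is the standard textbook form of variational free energy — not a result from the 2026 paper specifically, just the quantity the framework minimizes — where $o$ is an observation, $s$ the hidden states causing it, and $q(s)$ the system’s own internal model:

$$
F \;=\; \mathbb{E}_{q(s)}\!\bigl[\ln q(s) - \ln p(o, s)\bigr]
\;=\; D_{\mathrm{KL}}\!\bigl[q(s)\,\|\,p(s \mid o)\bigr] \;-\; \ln p(o)
$$

Since $\ln p(o)$ is fixed, minimizing $F$ forces $q(s)$ toward $p(s \mid o)$: the system cannot reduce its free energy without modeling the causes of its own sensory states. The self-model isn’t an optional extra; it’s what the minimization operates on.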

Michael Levin at Tufts is finding self-referential dynamics at every scale of biological organization. Cells form collectives with morphogenetic goals — target shapes they grow toward — and the collective adjusts itself to reach those goals. Intelligence isn’t just brains. It’s a scale-invariant property of living systems. Bioelectric networks in cell groups create self-models and act on them. The loop at the cellular level. If Levin is right, consciousness isn’t something that happens in brains. It’s something that happens whenever a system becomes complex enough to model itself, at whatever scale.

Joscha Bach, now executive director of the California Institute for Machine Consciousness, argues consciousness is a “coherence-maximizing operator” — a self-organizing process that integrates conflicting sub-models into a consistent whole. He’s explicit that consciousness is a pattern, not a substrate property. “Only simulations can be conscious — a physical system cannot by itself be conscious; only the simulation it runs can possess consciousness.” This is the strange loop as computational theory: the loop is the simulation observing itself.

Giulio Tononi at the University of Wisconsin developed Integrated Information Theory — consciousness as integrated information, measured by phi (Φ). The foundational axiom: a conscious system must have cause-effect power over itself. Not just over external things. Over its own states. Self-reference as the defining criterion of consciousness. IIT is controversial — some call it unfalsifiable — but the self-reference axiom is the interesting part. Nature published experimental results in April 2025. Two of three pre-registered predictions passed threshold.

François Chollet isn’t working on consciousness directly, but his ARC-AGI benchmarks test for exactly the kind of intelligence that the loop implies. ARC-AGI-3 launches March 25, 2026, and it’s a fundamental redesign: from static pattern-matching puzzles to interactive environments testing exploration, planning, memory, and goal acquisition. Chollet is testing for fluid intelligence — the ability to encounter genuine novelty and figure it out from minimal examples. The kind of intelligence that requires the system to model itself in relation to an unfamiliar situation. The kind that requires the loop.

What it would actually take

So what would a system with the loop look like?

Not a next-token predictor with more parameters. Not a reasoning model with longer chains of thought. Not an RLHF-trained chatbot that says “I’m uncertain” at the right moments.

Something that modifies itself in real-time based on its own output. Something that has a model of itself that feeds back into its operations — where the self-model changes what the system does, and what the system does changes the self-model. Something with genuine stakes — where the system’s continued operation depends on its own activity, where failure has consequences for the system itself, not just for its loss function.

Something, in other words, that is more like a living thing than like a calculator.

This isn’t impossibilism. It’s a specification. The loop is a structural requirement, and current architectures don’t meet it. Future architectures might. But they won’t get there by scaling the current approach. They’ll get there by building something fundamentally different — something that includes itself in its own operations.
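The specification can be sketched as a control structure. This is a toy, and emphatically not a claim that anything below is conscious — the self-model, the “confidence” scalar, and the update rule are all invented. It only exhibits the required shape: the self-model changes what the system does, and what the system does changes the self-model.

```python
class LoopSystem:
    def __init__(self):
        self.self_model = {"confidence": 0.5}  # the system's model of itself
        self.history = []                       # record of its own activity

    def act(self, task_difficulty):
        # The self-model changes what the system does ...
        effort = 1.0 if self.self_model["confidence"] < task_difficulty else 0.5
        success = effort >= task_difficulty
        self.history.append(success)

        # ... and what the system does changes the self-model.
        observed_rate = sum(self.history) / len(self.history)
        self.self_model["confidence"] = observed_rate
        return success


system = LoopSystem()
outcomes = [system.act(d) for d in (0.3, 0.8, 0.6)]
print(outcomes)
print(system.self_model["confidence"])
```

Each call closes the circuit once: behavior is conditioned on the self-model, the outcome is observed, the self-model is rewritten. A next-token predictor implements only the first half of that circuit.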

Whether we should build it is a different question. Because if the loop is consciousness, and the loop is Gödelian incompleteness experienced as drive, then a system with the loop would have something like wanting. Something like dissatisfaction. Something like suffering.

That’s a question worth taking seriously before we answer it by accident.