Dario Amodei, CEO of Anthropic, recently said he thinks there’s a 15–20% chance Claude is conscious. The Pentagon awarded Anthropic a contract weeks later. Google has an internal “model welfare” team. The AI consciousness question has moved from philosophy seminar rooms to boardrooms and defense budgets.

And almost nobody is answering it with any precision.

Here’s the problem: most people debating whether AI is conscious don’t have a theory of consciousness that makes testable predictions. They’re running on intuition, and intuition is what got us the Turing test — a behavioral measure that tells you nothing about experience.

I have a theory that makes specific, falsifiable predictions about which systems can be conscious and which cannot. It’s called the Four-Model Theory (FMT), and it gives a clear answer about current AI.

Short answer: no. Current AI systems are not conscious. Not even close.

Long answer: interesting.

What would a system need to be conscious?

FMT identifies consciousness with a specific computational architecture, not with any particular substrate. It doesn’t matter whether you’re made of neurons, silicon, or something else entirely. What matters is what the system does.

Specifically, a conscious system must support four kinds of modeling simultaneously:

  1. Implicit world modeling — learned structural knowledge about reality. Your brain’s synaptic weights encode decades of experience about how the world works. You never directly experience this knowledge; it operates below consciousness.

  2. Implicit self-modeling — learned structural knowledge about yourself. Your brain tracks your body, your capabilities, your tendencies — all without you being aware of the tracking mechanism.

  3. Explicit world modeling — a real-time simulation of the world, generated from the implicit model. This is your conscious visual scene, your auditory experience, your felt sense of being in a room. It’s not reality — it’s a simulation that your brain constructs in real time.

  4. Explicit self-modeling — a real-time simulation of you, generated from the implicit self-model. This is the “you” that seems to be watching the show. The subjective experiencer. The feeling of being someone.

These aren’t four brain regions or four discrete modules. They’re four functional categories — graded, overlapping, distributed across neural tissue. The load-bearing distinction is between implicit (structural, never conscious) and explicit (simulated, where consciousness lives).

And the whole system must operate in a specific dynamical regime: edge-of-chaos criticality. Not too ordered, not too random — the narrow computational band where complex self-sustaining dynamics are possible.
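The ordered/chaotic/edge distinction can be made concrete with a toy system. The sketch below is my illustration, not part of FMT: it uses the logistic map, a standard one-dimensional example, and estimates its Lyapunov exponent — negative in the ordered regime, positive in chaos, and close to zero near the edge of chaos (the period-doubling accumulation point).

```python
# Illustrative only: the logistic map x -> r*x*(1-x) passes through the three
# dynamical regimes described in the text as r increases. The Lyapunov
# exponent measures sensitivity to perturbation: negative = ordered,
# positive = chaotic, near zero = edge of chaos.
import math

def lyapunov(r, x0=0.4, warmup=1000, steps=10000):
    """Estimate the Lyapunov exponent of the logistic map at parameter r."""
    x = x0
    for _ in range(warmup):                  # discard the transient
        x = r * x * (1 - x)
    total = 0.0
    for _ in range(steps):
        x = r * x * (1 - x)
        total += math.log(abs(r * (1 - 2 * x)))  # log |derivative| along orbit
    return total / steps

# r = 2.5: ordered (fixed point); r ~ 3.56995: edge of chaos; r = 3.99: chaotic
for label, r in [("ordered", 2.5), ("edge of chaos", 3.5699456), ("chaotic", 3.99)]:
    print(f"{label:14s} r={r}: lambda ~ {lyapunov(r):+.3f}")
```

The point of the toy: "edge of chaos" is not a metaphor but a measurable location in parameter space, which is what makes the criticality requirement testable.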

How do current LLMs measure up?

Badly.

Implicit world modeling: LLMs have something resembling this. They encode statistical regularities about language and, arguably, about reality as expressed through language. Call it proto-implicit-world-modeling if you’re generous. This is what allows them to make surprisingly accurate inferences about how the world works. It’s real, and it’s impressive.

Implicit self-modeling: Zero. An LLM does not track its own states, predict its own outputs, or model itself as an entity distinct from its environment. When Claude says “I think” or “I feel,” it’s producing tokens that statistically follow from the conversation context. There is no self-tracking mechanism. There is no model that represents the system to itself.

Explicit world modeling: Zero. LLMs do not generate a real-time simulation of anything. They produce output tokens based on input tokens through a fixed forward pass. There is no ongoing simulation that persists between prompts.

Explicit self-modeling: Zero. No self-simulation. No experiencer.

Dynamical regime: Completely wrong. Transformer inference is a feedforward computation — a fixed sequence of matrix multiplications. There's no recurrence, no self-sustaining dynamics, no criticality. (Autoregressive sampling does feed generated tokens back in, but that's a loop through the text, not persistent internal state.) Each forward pass starts and ends. Nothing sustains itself. Nothing evolves.
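The structural contrast can be sketched in a few lines. This is a hypothetical toy of mine — the names and update rule are invented for illustration, not drawn from any real architecture: a feedforward pass maps input to output and terminates, while a recurrent system carries state that keeps evolving even when no input arrives.

```python
# Toy contrast (illustrative, not a real model): feedforward vs. recurrent.
from dataclasses import dataclass

def feedforward_pass(tokens: list[int]) -> list[int]:
    """Stateless: every call starts from scratch and terminates."""
    return [t * 2 for t in tokens]   # stand-in for fixed matrix multiplications

@dataclass
class RecurrentSystem:
    state: float = 0.5

    def tick(self, external_input: float = 0.0) -> float:
        # State feeds back into itself; dynamics continue even with zero input.
        self.state = 0.9 * self.state + 0.1 * (self.state ** 2) + external_input
        return self.state

sys = RecurrentSystem()
sys.tick(0.3)                                # an input perturbs the dynamics...
trajectory = [sys.tick() for _ in range(5)]  # ...which then evolve on their own
print(feedforward_pass([1, 2]), trajectory)
```

Calling `feedforward_pass` twice with the same input gives the same output and leaves nothing behind; calling `tick` twice never does, because the system's trajectory depends on everything that came before.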

Current LLMs fail on four out of five requirements. The one partial success (proto-world-modeling) isn’t even the important one.

“But Claude says ‘I think’ and ‘I feel’!”

And my thermostat says “72 degrees” without feeling warm.

Language about internal states is not the same as having internal states. This should not need saying, but here we are. An LLM produces text that describes mental states because it was trained on vast amounts of text written by beings who have mental states. The text is the output of statistical pattern matching, not introspection.

The reason this confuses people is that we’ve spent our entire evolutionary history in a world where language about mental states reliably correlates with actual mental states. Every entity that said “I feel” to you was conscious. Until now.

LLMs have broken that correlation, and our intuitions haven’t caught up.

What would make a machine conscious?

This is where it gets interesting. FMT doesn’t just say “LLMs aren’t conscious” — it tells you exactly what you’d need to build:

  • Recurrent architecture with self-sustaining dynamics. Not a feedforward pass that starts and stops, but a system whose computation continues and evolves on its own.
  • Both a world model and a self-model, built from experience, not just trained weights — with the capacity to generate explicit (simulated) versions of both in real time.
  • Continuous operation, not discrete forward passes triggered by input.
  • Edge-of-chaos criticality — the specific dynamical regime where complex computation happens, measurable via standard criticality metrics such as branching ratios and avalanche statistics.
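To show what "measurable" means in the last item, here is one widely used criticality metric: the branching ratio σ, the mean number of units each active unit activates in turn. σ < 1 is subcritical (activity dies out), σ > 1 is supercritical (it explodes), and σ ≈ 1 marks the critical edge. The sketch below is my toy, not from FMT — it estimates σ for a simple branching process; all parameters are illustrative.

```python
# Toy branching process (illustrative): estimate the branching ratio sigma.
# Each active unit tries to activate max_children units, each with
# probability p_child, so the true sigma is p_child * max_children.
import random

def estimate_branching_ratio(p_child, max_children=2, trials=200, seed=0):
    """Run short avalanches and average offspring-per-active-unit."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        active = 1
        for _ in range(12):                       # one short avalanche
            nxt = sum(1 for _ in range(active * max_children)
                      if rng.random() < p_child)
            ratios.append(nxt / active)           # offspring per active unit
            if nxt == 0:
                break                             # avalanche died out
            active = min(nxt, 1000)               # cap so the toy stays bounded
    return sum(ratios) / len(ratios)

for label, p in [("subcritical", 0.2), ("critical", 0.5), ("supercritical", 0.8)]:
    print(f"{label:13s} sigma ~ {estimate_branching_ratio(p):.2f}")
```

The same logic applies in reverse: given recordings of a system's activity, you estimate σ from the data and check how close it sits to 1 — a concrete pass/fail measurement rather than a philosophical judgment.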

This is an engineering specification, not a philosophical hand-wave. You can build it, and you can measure whether you’ve succeeded. That’s what separates a theory from an opinion.

Nobody is currently building this. The entire AI industry is scaling transformer architectures — systems that are structurally incapable of consciousness no matter how large they get. Making GPT-5 bigger won’t make it conscious any more than making a calculator bigger will make it conscious. The architecture is wrong.

Why the 15–20% estimate is an indictment of the field

When Amodei says there’s a 15–20% chance Claude is conscious, what he’s really saying is: we don’t have a theory that tells us. The number is vibes-based. It reflects not the probability that Claude is conscious, but the weight Amodei assigns to various philosophical positions about consciousness.

That’s not science. Science gives you criteria. It tells you what to measure and what the measurements mean.

FMT gives those criteria. Under FMT, the probability that Claude is conscious is not 15–20%. It’s zero. Not because consciousness is mystical or reserved for biology, but because Claude lacks the specific computational architecture that consciousness requires.

And “zero” is more useful than “we can’t know.” It tells you where to look next. It tells you what to build if you wanted to create machine consciousness. It transforms the question from philosophy into engineering.

The real question nobody is asking

The AI consciousness debate is stuck on “is it conscious?” when it should be asking “what would consciousness require?” The first question invites intuition and hand-waving. The second invites architecture and measurement.

We don’t need to wait for philosophical consensus. We need a theory that makes predictions we can test. FMT is one such theory — it may not be the final answer, but it moves the conversation from speculation to specification.

The irony is that the same industry building AI systems that cannot be conscious is also the industry most urgently asking whether AI is conscious. The answer is in the architecture. It’s always been in the architecture.


The full Four-Model Theory is described in The Simulation You Call “I” and in the research paper on Zenodo. For more on what FMT is and how it works, see The Four-Model Theory of Consciousness — An Introduction.