Consciousness in the predictive mind

The prediction error minimization (PEM) account of brain function may explain perception, learning, action, attention and understanding. That at least is what its proponents claim, and I suggested in an earlier post that perhaps the brain does nothing but minimize its prediction error. So far I haven’t talked explicitly about consciousness. Yet, if PEM is true, and if consciousness is based in brain activity, then PEM should explain consciousness too. In this post I therefore speculate about what PEM might have to say about consciousness.

We talk about consciousness in many different ways: metaphysical, neurological, psychological, colloquial. Accordingly, there are many different ways a theory like PEM could engage with consciousness.

Starting at the top, could PEM deal with the hard problem of consciousness? No. It is easy to conceive of a system that minimizes prediction error yet has no phenomenal consciousness. So consciousness does not supervene on prediction error minimization.

We can move away somewhat from these metaphysical concerns and consider things from a more theoretical point of view. A system governed fully under the free energy principle is self-organised and may be what defines a biological organism. If this (ambitious) idea is correct, then conceiving of a PEM system is conceiving of a biological organism. This at least leaves us in the right ballpark for the kinds of system that we (intuitively) accept as being conscious.

If we were to assume that PEM is sufficient for consciousness, then we would end up with ‘biopsychism’, which says that all biological organisms are conscious. Biopsychism is vastly more plausible than panpsychism. But I don’t believe PEM is sufficient for consciousness. This is partly because, phenomenally prejudiced as I am, I don’t think that very simple creatures are conscious in any way whatsoever.

This is then a problem for PEM’s reductionist aspirations: brains only do PEM and PEM is not sufficient for consciousness. I can only see one way to go here: consciousness must arise when the organism minimizes its prediction error in a certain way or to a certain degree.

Different ways and degrees of minimizing prediction error can be leveraged through perceptual inference, active inference, precision optimization, and complexity reduction. We can then look at very simple non-conscious creatures and ask how they differ in these dimensions from conscious creatures (assuming we have independent means of distinguishing these kinds of creatures).
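
To make two of these dimensions concrete, here is a toy sketch. It is only an illustration under strong simplifying assumptions, not a model anyone has proposed for the brain: the "world" is a single number, the model's prediction is simply its current belief, and the function names, learning rate and step count are all hypothetical choices of mine. Perceptual inference revises the belief to fit the input, active inference changes the input to fit the belief, and precision scales how strongly the error drives either change.

```python
# Toy illustration only: a one-dimensional "world" and a belief that
# predicts the sensory input directly. All names and numbers are assumptions.

def perceptual_inference(belief, sensory_input, precision, rate=0.1, steps=50):
    """Reduce prediction error by revising the belief; the input stays fixed."""
    for _ in range(steps):
        error = sensory_input - belief
        # Gradient descent on 0.5 * precision * error**2 with respect to the belief.
        belief += rate * precision * error
    return belief


def active_inference(belief, sensory_input, precision, rate=0.1, steps=50):
    """Reduce prediction error by acting on the input; the belief stays fixed."""
    for _ in range(steps):
        error = sensory_input - belief
        # Gradient descent on the same quantity with respect to the input.
        sensory_input -= rate * precision * error
    return sensory_input


revised_belief = perceptual_inference(belief=0.0, sensory_input=1.0, precision=1.0)
changed_input = active_inference(belief=0.0, sensory_input=1.0, precision=1.0)
hesitant_belief = perceptual_inference(belief=0.0, sensory_input=1.0, precision=0.1)
```

In this sketch both routes shrink the same error, and lowering the precision leaves the belief further from the input after the same number of steps. That is one crude sense in which systems could minimize prediction error "in a certain way or to a certain degree".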

There is a proposal like this around. Hobson and Friston have suggested that dreaming is an adaptation where the brain is engaged in complexity reduction via synthetic prediction errors. They propose that consciousness arose as a consequence of the ability to create such inner virtual reality. This places the theory in the company of Revonsuo and Metzinger, who have proposed similar ideas albeit without the PEM machinery. It would mean that creatures who do not dream are not conscious.

I think this is an intriguing idea. It is at least as appealing as some of the other theories of consciousness out there (IIT, HOT, loops, AIR, GNWS). I don’t think it alone is going to be enough, however. One reason is that there must be something about waking perceptual inference specifically that relates to consciousness, and the dreaming theory doesn’t provide a strong link of this kind. A better strategy is to look at all the theories of consciousness, and identify the elements in them that are supported by different aspects of PEM (while being prepared to jettison the elements that are not). Combining all these elements would then yield a patchwork-style PEM theory of consciousness.

That would be quite an ambitious project, but worth undertaking. I also want to advertise a much more modest approach, which is in my view very appealing and satisfying to engage with. This approach begins by assuming that the creatures and states we are talking about are in fact conscious, and then it asks how PEM can explain the structure and content of such conscious experience.

This approach is modest not only because it assumes consciousness but also because it doesn’t say much about what distinguishes conscious from unconscious experience in the creature in question. It just lists what most of us would agree are aspects of conscious experience and asks how PEM would explain them. We might look at the first person perspective, at binding and unity, at the role of attention, at bodily self-awareness, and so on. Much of my book is taken up with this kind of approach to conscious experience but the job is far from done. There is much more to learn from PEM about our conscious lives. I think work by Anil Seth, Andy Clark and others is in this vein too.

Hopefully, some day it will all come together. We will then have a rich, systematic, empirically supported understanding of how the neural organ structures our conscious lives with prediction error minimization.


  1. Hi Jakob,
    Thanks for this post! Could you possibly share your thoughts on which elements of IIT, HOT, etc. are particularly well supported by PEM? Do you think there could be a refurbished PEM-version of Dennett’s virtual machine idea? (I just had the thought that a virtual machine implemented by the brain could be like the shrinked agents you mention in your paper “The self-evidencing brain”.)

  2. Hi Wanja – I think there is a rather large project in sorting out these questions. For what it is worth, I am partial to IIT, AIR and GNWS. This is because they tap into PEM in terms of uncertainty reduction (prediction error minimization), attention (precisions), and action (active inference), respectively. But there is much to do in sorting out the details here and matching up concrete parts of the theories.

    Recurrent loops theory is an informal take on the kind of mechanism that underlies PEM. I can’t easily see a role for HOT.

    On virtual machines and Dennett: I don’t know enough about Dennett’s ideas here but it seems the idea you suggest is worth exploring. The challenge would be to match things up with levels of the hierarchy, where these levels are understood in terms of conditional independence in the causal nets sense.

    • Alex Kiefer

      Hi Jakob,

      Thanks for the post—I’ve been trying to keep up with all the various blog discussions on PEM, and though I’ve been quiet, I thought I’d chime in on this one. I’m sympathetic to most of what you said in your post, but I think that (a) HOT theory might be a particularly good fit with PEM, and (b) that it might help with your demarcation problem (specifying which systems are conscious and which not).

      HOT theory says that conscious states are those that an organism thinks of itself as being in, where the relevant notion of “thought” is rather minimal, and amounts to an awareness not tied to a particular sensory modality. There are at least a few things packed into this:

      (1) The organism represents itself,
      (2) As being in certain states.

      The first condition is of course central to the PEM picture (the organism’s model of the world must include a model of itself). So the first condition for HOT theory is met easily by any PEM system. The second condition has to do with the particular contents of the representations involved, and this is where you might get the resources to declare some PEM systems conscious and others not: the organism must represent itself as being in mental states (thinking, feeling, seeing, desiring, and so forth).

      I think that a clear picture of the relationship between HOT theory and PEM depends on a full account of what representation looks like on the PEM story, and of course this is one of the things that is still in the works. But it seems to me that PEM is the most natural framework imaginable for implementing HOT theory: an organism that has a *really good* model of itself, as required for PEM in complex organisms, must model itself as an agent with intentional states, not just as a body in space.

      So, an agent consciously experiences the world as containing feature F only if the agent models its own relation to F. I see no reason that a sophisticated probabilistic treatment of modeling couldn’t capture, in fine grain, what’s captured qualitatively by attributing to someone the thought that they see red, etc, as in HOT theory.

      Finally, I think PEM is very congenial to Tononi’s IIT as well, but then I also doubt that that (or at least a version of it) is incompatible with HOT.

  3. Hi Alex, this is a really interesting proposal. I agree that there is much more that can and should be said about these different theories, and perhaps there is more to HOT+PEM than my quick remark allows. The version of HOT I had in mind was a very simple version, and perhaps less simple versions can work in the way you suggest. In particular, the version I had in mind doesn’t mention anything about having a self, or what the self is. It might be that HOT has to be conceived in this way but my suspicion is that we then in the end have less and less left of the true HOT core. Having said that, there are now many versions of HOT out there (e.g., Gennaro), and new, more unified theories (e.g., Kriegel).

    I may also be slightly concerned if the theory of consciousness would have us have conscious experience in a doubly vicarious manner: inferring causes inferred by causes. Though I agree with you that it is essential to PEM that we represent ourselves as causes in the world, I think we should account for conscious experience only by having us infer causes in the world. Perhaps this is a kind of prejudice against HOT though.

  4. Josh Weisberg


    Very interesting posts–I haven’t had a chance to read your book, but I look forward to doing so.

    On connections between “higher-order” type approaches and PEM, have you checked out Hakwan Lau’s “A higher order Bayesian decision theory of consciousness”? It may be in the direction of some of the things Alex mentioned (though with some interesting additional ideas). See

    My work is more on the metaphysical hard problem stuff, so here are some musings on a connection between PEM and HO theory in that direction. It may be that PEM involving conflicting and ambiguous lower-level processing becomes necessary from a resource standpoint as a cognitive system increases in complexity. Such predictive modeling will, as Alex notes, involve a model of the system’s own states (and the system itself–some sort of “self-representation”). But that means PEM systems of this kind will realize what Rosenthal, Gennaro, Kriegel, etc. call “the transitivity principle”–the claim that conscious states are states the subject is aware of in a suitable way. And that just is what it means to be a conscious state, according to these approaches. So (if all this is correct!), any creature instantiating the right sort of “higher-order” PEM will be conscious, and in particular, its conscious states will be those appropriately modeled by the predictive system.

    Of course, we can still imagine PEM zombies and what not. But that’s not a big deal if we allow that decisions about the right characterization of consciousness (the transitivity principle, “what it’s like,” etc.) are matters of best overall fit between empirical data, predictive success, pre-theoretic intuition, simplicity, etc. The mere fact of conceivability is uninteresting at this stage. Once we have a successful HO-PEM view up and running for a few hundred years, the intuitions about zombies will be relegated to the dust heap of undefeated but irrelevant skeptical scenarios.

    Anyway, this speech was inspired by your postings (and Alex’s as well–we both are HO types from CUNY, for what it’s worth!). Thanks again!

  5. Hi Josh, thanks for these interesting comments! I am not an expert on HOT, and I am aware that there is much more to be said on this, probably in the direction you and Alex suggest.

    I am a fan of Lau’s stuff. This is because the Bayesian basis of SDT promises a good fit with PEM, as you notice. But I don’t think Lau’s cool theory is a good fit with HOT (on my simplistic rendering of HOT). Type 2 SDT has to do with assessing the variance of probability distributions encoding content. This assessment can happen in ignorance of the encoded content, so it seems to me there is no need to invoke HOT here. I am more keen to affiliate Lau’s theory with GNWS (I have a chapter on this coming out next year).

    I like the idea that complexity of the system is associated with the need to resolve ambiguity and conflict. From the PEM perspective this is probably linked to the depth of the cortical hierarchy. Resolving such ambiguity is the bread and butter of PEM (the book discusses some examples of this). The model generating predictions needs to include parameters for the agent itself, in order to predict changes in the sensory input that are caused by the agent’s own movement. I think it requires more work to see whether this need to self-represent would in itself bring us into HOT territory; I am not sure I see the link clearly yet. I am also not sure I see why the need to resolve ambiguity should in itself invoke the transitivity principle. Perhaps this is because I just think of such resolution as more Bayesian inference: seeing which hypothesis best minimizes prediction error. My gut feeling is that the motivations for HOT will be difficult to fit with the epistemic, Bayesian motivations for PEM.

    Your take on the Hard Problem is appealing, I think. It is probably a kind of type B materialism, where inference to the best explanation leads us to a certain kind of identity. I think there may be some explanatory burdens associated with this take on the hard problem, captured in much of the hard problem discussion on identity, and going back to the objections to Jack Smart’s original paper. Unfortunately, my own attempt at defending inference-to-the-best-explanation to identity is in a paper that sank without trace…

    • Josh Weisberg

      Thanks, Jakob!

      Sorry for the slow response–summer hours make blogging kind of like writing letters was in the old days.

      Very interesting points. I do think that perhaps less is needed than is often assumed for HO to do its work, but that may collapse the thesis into other things. That’s fine with me. The resources needed for high-level ambiguity resolution may do the trick.

      Thanks again, and I look forward to reading the book.



  6. Nick Maley

    Hi Jakob,

    I like the hypothesis that PEM is a fundamental organising principle of the brain, and I’m happy to reduce conscious states to brain states. But PEM, important as it is for understanding the connection between structure and function in the brain, seems to have not much to say about the representational content associated with neural activity. There is surely a strong connection between what is represented by the brain (redness, Pegasus, self, or whatever), and what we are conscious of as we represent it. The content of consciousness appears to be a subset of what is represented in the brain. Any thoughts on that connection, and any theories to offer about representational content?

    • Hi Nick – it is a nice question about the relations between representation, prediction error minimization, and consciousness. I think PEM is ideally suited for giving an account of representation: minimizing prediction error based on a model means this model will come to represent, carry information about, the world. In other words, perceptual inference reveals the hidden causes in the world that cause the sensory input (modulo Cartesian scepticism). Since this happens in a hierarchical model, the resulting representation can be rich and from a first person perspective (or so I argue in the book).

      In my view this is by far the most promising theory of representational content since it sets up a task that the brain can in fact perform without having prior knowledge (namely, comparing predictions with input and minimising the difference through changing internal states and through action). Interestingly, PEM also incorporates aspects of inferential (or statistical) role semantics. So PEM is ideally suited to put representation into consciousness.

      One further issue that arises here is what determines which PEM-based representations make it to conscious presentation since, as you also mention, not all representations in the brain are contents of consciousness. I suggest in the book this has to do with action and in turn the unity of consciousness.
