Engineering Intelligence from the Ground Up to Develop AI Models That Show True Intelligence... and Maybe Consciousness
A conversation with Professor Karl Friston, distinguished psychiatrist, neuroscientist and artificial intelligence expert
Dr. Karl Friston is a distinguished computational psychiatrist, neuroscientist, and pioneer of modern neuroimaging and, now, AI. He is a leading expert on intelligence, natural as well as artificial. I have followed his work as he and his team uncover the principles underlying mind, brain, and behavior based on the laws of physics, probability, causality and neuroscience.
In the interview that follows, we dive into the current artificial intelligence landscape, discussing what existing models can and can’t do, and then peer into the divining glass to see how true artificial consciousness might look and how it may begin to emerge.
Current AI Landscape and Biological Computing
GHB: Broadly speaking, what are the current forms of AI and ML, and how do they fall short when it comes to matching natural intelligence? Do you have any thoughts about neuromorphic chips?
KF: This is a pressing question in current AI research: should we pursue artificial intelligence on high-performance (von Neumann) computers or turn to the principles of natural intelligence? This question speaks to a fork in the road ahead. Currently, all the money is on artificial intelligence — licensed by the truly remarkable competence of generative AI and large language models. So why deviate from the well-trodden path?
There are several answers. One is that the artificial path is a dead end — in the sense that current implementations of AI violate the principles of natural intelligence and thereby preclude themselves from realizing their ultimate aspirations: artificial general intelligence, artificial super intelligence, strong AI, et cetera. The violations are manifest in the shortcomings of generative AI, usually summarized as a lack of (i) efficiency, (ii) explainability and (iii) trustworthiness. This triad neatly frames the alternative way forward, namely, natural intelligence.
So, what is natural intelligence? The answer to this question is simpler than one might think: natural intelligence rests upon the laws or principles that apply to the natural kinds that constitute our lived world. These principles are readily available from the statistical physics of self-organization, when the notion of self is defined carefully.
Put simply, the behavior of certain natural kinds — that can be read as agents, like you and me — can always be described as self-evidencing. Technically, this entails minimizing self-information (also known as surprise) or, equivalently, seeking evidence (also known as marginal likelihood) for an agent’s internal model of its world. This surprise is scored mathematically with something called variational free energy.
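For readers who like to see the quantity in symbols, here is the standard form of the variational free energy, written in generic notation for hidden states s and observations o (not tied to any particular model):

```latex
% Variational free energy F is an upper bound on surprise, -ln p(o)
F[q] \;=\; \mathbb{E}_{q(s)}\!\left[\ln q(s) - \ln p(o, s)\right]
     \;=\; \underbrace{D_{\mathrm{KL}}\!\left[q(s)\,\|\,p(s \mid o)\right]}_{\ge\, 0}
     \;-\; \ln p(o)
     \;\;\ge\;\; -\ln p(o).
```

Because the divergence term cannot be negative, driving F down simultaneously pulls the approximate posterior q(s) toward the true posterior and pushes down surprise; equivalently, it raises a lower bound on the log evidence ln p(o), which is what "self-evidencing" means.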
The model in question is variously referred to as a world or generative model. The notion of a generative model takes center stage in any application of the (free energy) principles necessary to reproduce, simulate or realize the behavior of natural agents. In my world, this application is called active inference.
Note that we have moved beyond pattern recognizers and prediction machines into the realm of agency. This is crucial because it means we are dealing with world models that can generate the consequences of behavior, choices or actions. In turn, this equips agents with the capacity to plan or reason. That is, to select the course of action that minimizes the surprise expected when pursuing that course of action. This entails (i) resolving uncertainty while (ii) avoiding surprising outcomes. The simple imperative — to minimize expected surprise or free energy — has clear implications for the way we might build artifacts with natural intelligence. Perhaps, these are best unpacked in terms of the above triad.
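In symbols, the expected free energy of a course of action (a policy π) is commonly decomposed as below; again the notation is generic, with C denoting prior preferences over outcomes:

```latex
% Expected free energy G(\pi): minimized by policies that resolve uncertainty
% (epistemic value) while realizing preferred, unsurprising outcomes (pragmatic value)
G(\pi) \;=\;
  -\underbrace{\mathbb{E}_{q(o, s \mid \pi)}\!\left[\ln q(s \mid o, \pi) - \ln q(s \mid \pi)\right]}_{\text{expected information gain}}
  \;-\;
  \underbrace{\mathbb{E}_{q(o \mid \pi)}\!\left[\ln p(o \mid C)\right]}_{\text{expected log preference}}.
```

An agent then prefers the policy with the smallest G: the first term supplies the curiosity, the second the constraints, anticipating the "curiosity with constraints" described next.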
Efficiency. Choosing the path of least surprise is the path of least action or effort. This path is statistically and thermodynamically the most efficient path that could be taken. Therefore, by construction, natural intelligence is efficient. The famous example here is that our brains need only about 20 W — equivalent to a light bulb. In short, the objective function in active inference has efficiency built in — and manifests as uncertainty-resolving, information-seeking behavior that can be neatly described as curiosity with constraints. The constraints are supplied by what the agent would find surprising — i.e., costly, aversive, or uncharacteristic.
A failure to comply with the principle of maximum efficiency (a.k.a., the principle of minimum redundancy) means your AI is using the wrong objective function. This can have severe implications for ML approaches that rely upon reinforcement learning (RL). In RL, the objective function is some arbitrary reward or value function. This leads to all sorts of specious problems, such as the value function selection problem, the explore-exploit dilemma, and more. A failure to use the right value function will therefore result in inefficiency — in terms of sample sizes, memory requirements, and energy consumption (e.g., large language models trained with big data). Not only are the models oversized but they are also unable to select those data that would resolve their uncertainty. So, why can’t large language models select their own training data?
This is because they have no notion of uncertainty and therefore don’t know how to reduce it. This speaks to a key aspect of generative models in active inference: They are probabilistic models, which means that they deal with probabilistic “beliefs” — about states of the world — that quantify uncertainty. This endows them not only with the capacity to be curious but also to report the confidence in their predictions and recommendations.
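As a toy illustration of what quantified uncertainty buys an agent (the states and numbers below are entirely made up), a probabilistic belief can always report both a prediction and the confidence behind it:

```python
import numpy as np

def entropy(p):
    """Shannon entropy in nats: a simple score of how uncertain a belief is."""
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p + 1e-12))

# A "belief" over three possible states of the world (hypothetical numbers)
belief_before = np.array([0.4, 0.35, 0.25])   # diffuse: the agent is unsure
belief_after  = np.array([0.85, 0.10, 0.05])  # sharp: an observation resolved uncertainty

print(f"uncertainty before: {entropy(belief_before):.2f} nats")
print(f"uncertainty after:  {entropy(belief_after):.2f} nats")
print(f"prediction: state {belief_after.argmax()} with confidence {belief_after.max():.0%}")
```

The difference between the two entropies is exactly the kind of quantity a curious agent can act to maximize, and the confidence is what it can report alongside any recommendation.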
Explainability. If we start with a generative model — that includes preferred outcomes — we have, by construction, an explainable kind of generative AI. This is because the model generates observable consequences from unobservable causes, which means that the (unobservable or latent) cause of any prediction or recommendation is always at hand. Furthermore, predictions are equipped with confidence intervals that quantify uncertainty about inferred causes or states of the world.
The ability to encode uncertainty is crucial for natural intelligence and distinguishes things like variational autoencoders (VAEs) from most ML schemes. Interestingly, the objective function used by VAEs is exactly the same as the variational free energy above. The problem with variational autoencoders is that they have no agency because they do not act upon the world — they just encode what they are given.
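To make that equivalence explicit: the evidence lower bound (ELBO) that a VAE maximizes is just the negative of the variational free energy above, written here with the usual encoder q_φ and decoder p_θ:

```latex
% Maximizing the ELBO (accuracy minus complexity) is minimizing free energy
\mathrm{ELBO}(\theta, \phi; o)
  \;=\; \underbrace{\mathbb{E}_{q_\phi(s \mid o)}\!\left[\ln p_\theta(o \mid s)\right]}_{\text{accuracy}}
  \;-\; \underbrace{D_{\mathrm{KL}}\!\left[q_\phi(s \mid o)\,\|\,p(s)\right]}_{\text{complexity}}
  \;=\; -F.
```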
Trustworthiness. If predictions and recommendations can be explained and qualified with quantified uncertainty, then they become more trustworthy, or, at least, one can evaluate the epistemic trust they should be afforded. In short, natural intelligence should be able to declare its beliefs, predictions, and intentions and decorate those declarations with a measure of uncertainty or confidence.
There are many other ways we could unpack the distinction between artificial and natural intelligence. Several thought leaders — perhaps a nascent rebel alliance — have been trying to surface a natural or biomimetic approach to AI. Some appeal to brain science, based on the self-evident fact that your brain is an existence proof for natural intelligence. Others focus on implementation; for example, neuromorphic computing as the road to efficiency. An interesting technical issue here is that much of the inefficiency of current AI rests upon a commitment to von Neumann architectures, where most energy is expended in reading and writing from memory. In the future, one might expect to see variants of processing-in-memory (PIM) that elude this unnatural inefficiency (e.g., with memristors, photonics, or possibly quantum computing).
Future AI Development
GHB: What does truly agentic AI look like in the near-term horizon? Is this related to the concept of neuromorphic AI (and what is agentic AI)?
KF: Agentic AI is not necessarily neuromorphic AI. Agentic AI is the kind of intelligence evinced by agents with a model that can generate the consequences of action. The curiosity required to learn agentic world models is beautifully illustrated by our newborn children, who are preoccupied with performing little experiments on the world to see what they can change (e.g., their rattle or mobile) and what they cannot (e.g., their bedtime). The dénouement of their epistemic foraging is a skillful little body, the epitome of a natural autonomous vehicle. In principle, one can simulate or realize agency with or without a neuromorphic implementation; however, the inefficiency of conventional (von Neumann) computing may place upper bounds on the autonomy and agency of edge computing.
VERSES AI and Genius System
GHB: You are the chief scientist for VERSES AI, which has been posting groundbreaking advancements seemingly every week. What is VERSES AI’s Genius, and what makes it different from other systems? For the layperson, what is the engine behind Genius?
KF: As a cognitive computing company, VERSES is committed to the principles of natural intelligence, as showcased in our baby, Genius. The commitment is manifest at every level of implementation and design:
Implementation eschews the unnatural backpropagation of errors that predominates in ML by using variational message passing based on local free energy gradients, as in the brain.
Design eschews the inefficient top-down approach — implicit in the pruning of large models — and builds models from the ground up, much in the way that our children teach themselves to become autonomous adults. This ensures efficiency and explainability.
To grow a model efficiently is to grow it under the right core priors. Core priors can be derived from first principles; for example, states of the world change lawfully, where certain quantities are conserved (e.g., object permanence, mathematical invariances or symmetry, et cetera), usually in a scale-free fashion (e.g., leading to deep or hierarchical architectures with separation of temporal scales).
Authentic agency is assured by equipping generative models with a minimal self-model; namely, “what would happen if I did that?” This endows them with the capacity to plan and reason, much like System 2 thinking (slow, deliberative planning), as opposed to System 1 reasoning (fast, intuitive responses).
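A deliberately minimal sketch of that counterfactual question follows; for brevity it scores only the pragmatic (expected surprise) part of planning and omits the epistemic term, and the actions, outcomes, and probabilities are purely illustrative:

```python
import numpy as np

# A toy self-model asking "what would happen if I did that?"
# Each candidate action is scored by the surprise its predicted outcomes carry
# under the agent's preferences; the least surprising action is selected.
# (Action names, outcomes, and probabilities are purely illustrative.)

log_preferences = np.log(np.array([0.7, 0.2, 0.1]))  # preferred outcome distribution

predicted_outcomes = {                 # q(o | action): what the model generates
    "reach_for_rattle": np.array([0.6, 0.3, 0.1]),
    "do_nothing":       np.array([0.2, 0.3, 0.5]),
}

def expected_surprise(q_o):
    """Expected negative log preference of the predicted outcomes."""
    return -np.sum(q_o * log_preferences)

best = min(predicted_outcomes, key=lambda a: expected_surprise(predicted_outcomes[a]))
print(f"selected action: {best}")
```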
At the end of the day, all this rests upon using the right objective function; namely, the variational free energy that underwrites self-evidencing. That is, building the most efficient model of the world in which the agent finds herself. With the right objective function, one can then reproduce brain-like dynamics as flows on variational free energy gradients, as opposed to costly and inefficient sampling procedures that are currently the industry standard.
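To make "flows on variational free energy gradients" concrete, here is a deliberately minimal sketch (a one-dimensional Gaussian model, not the Genius implementation) of belief updating as a gradient flow on F rather than a sampling procedure:

```python
import numpy as np

# A minimal sketch, not VERSES' implementation: gradient flow on variational
# free energy for a one-dimensional Gaussian generative model.
#   p(o | s) = N(o; s, sigma_o^2),  p(s) = N(s; mu_p, sigma_p^2)
# Up to constants, F(s) = (o - s)^2 / (2 sigma_o^2) + (s - mu_p)^2 / (2 sigma_p^2).

o, mu_p = 2.0, 0.0            # observation and prior mean (hypothetical values)
sigma_o2, sigma_p2 = 1.0, 1.0  # likelihood and prior variances
s = mu_p                       # belief starts at the prior
dt = 0.1                       # integration step for the flow ds/dt = -dF/ds

for _ in range(200):
    dF_ds = (s - o) / sigma_o2 + (s - mu_p) / sigma_p2  # precision-weighted prediction errors
    s -= dt * dF_ds

# With equal precisions the flow settles at the exact posterior mean, here 1.0
print(f"posterior mean estimate: {s:.3f}")
```

The same logic scales up: belief updating becomes a deterministic descent on free energy gradients, which is where the efficiency of the scheme comes from.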
Consciousness and Future Directions
GHB: What might we look forward to for artificial consciousness, and can you comment on the work with Mark Solms?
KF: Commenting on Mark’s work would take another blog (or two). What I can say here is that we have not touched upon two key aspects of natural intelligence that could, in principle, be realized if we take the high (active inference) road. These issues relate to interactive inference or intelligence — that is, inference among agents that are curious about each other. In this setting, one has to think about what it means for a generative model to entertain the distinction between self and other and the requisite mechanisms for this kind of disambiguation and attribution of agency. Mark would say that these mechanisms rest upon the encoding of uncertainty — or its complement, precision — and how this encoding engenders the feelings (i.e., felt-uncertainty) that underwrite selfhood.
References and Resources
Reading
Using AI Responsibly in Mental Health: The Art and Science of Prompting
The Singularity Is Here: On the Cusp of Delivering a New Form of Intelligence
Developing a Framework for AI Safety Levels in Mental Health (ASL-MH) based on Anthropic’s ASL Model
Doorknob Comments Podcast
Professor Karl Friston: Active Inference, the Free Energy Principle, and Therapy (#70)
Dr. Michael Levin: Engineering Body and Mind with Michael Levin (#82)
Dr. Mark Solms: Neuropsychoanalysis, Revisiting Freud, and Living Well (#78)
Professor Nir Eisikovits: Would You Rather Have An AI or a Human Therapist? (#79)
Originally posted on Psychology Today, ExperiMentations