Coelho Mollo and Millière: The Vector Grounding Problem

Post on “The Vector Grounding Problem” for the Brains Blog

Dimitri Coelho Mollo & Raphaël Millière

We first preprinted “The Vector Grounding Problem” in April 2023, about four months after the release of ChatGPT. By that point, large language models (LLMs) had started to capture the attention of philosophers, but there was still very little published work on the topic. The preprint languished on arXiv for longer than we initially planned, but it lived a life of its own and generated some interesting discussions. Since then, the philosophical literature on LLMs has grown considerably, and LLMs themselves have evolved: they’ve become more capable, more multimodal, and more often embedded in tool-using systems, and fine-tuning has become increasingly central to their capabilities. We finally got around to finding the paper a good home last year, and we’re glad that it has now appeared in its final form (and in great company!) in this special issue of Philosophy and the Mind Sciences. The delay between the preprint and the final publication now gives us a chance to reflect on what we argued in the paper.

Stevan Harnad’s influential symbol grounding problem asked how the symbols that classical AI systems manipulate could acquire meaning, rather than merely inheriting it from human interpreters. LLMs don’t manipulate symbols in the traditional sense: they process sequences of tokens as high-dimensional vectors, and those vectors are transformed by learned algebraic operations. Still, Harnad’s worry returns in a new form: if an LLM is trained only on text, are its vectors ever about dogs, uranium, elections, or rainstorms? In other words: do some of the internal states and outputs of LLMs (and similar systems) represent anything beyond patterns in language, despite the models being trained only on text and only to produce more text? This is what we call the vector grounding problem, as a nod to Harnad’s classic paper.

Work on this question can get confusing in several ways. First, “meaning” is used to, well, mean different things: mental content, speaker meaning, conventional meaning, cognitive content, etc. In the paper, we focus on the contents of internal representations, and on the meaning of the outputs they causally contribute to producing. Making this clear also avoids a second potential confusion: having internal states with representational content does not entail having a mind, understanding language, or being conscious. Finally, “grounding” itself can be many things. In the paper, we distinguish between five different kinds of grounding. We identify one as the most fundamental, and thus most central to Harnad’s grounding problem: referential grounding, which captures how a representation hooks onto what it represents.

Our proposal, in line with mainstream theories of representation, is that referential grounding requires two broad ingredients. The first is a causal-informational relation: a state of the system must carry information about something in the world, perhaps through a long and indirect chain. The second is a suitable history of selection: the state must have been selected (through evolution, learning, or training) because its carrying that information contributed to the system’s persistence and/or reproduction.

How could LLMs satisfy these conditions? Start with the causal-informational side. LLMs are trained on human-produced text, and human language is shaped by perception, action, social coordination, and cultural history. Because text is produced by creatures who live in the world, it bears the world’s imprint. Training on text thus gives LLMs indirect causal-informational links to the things language is used to talk about. This point is sometimes missed because those links are mediated by human beings, but we constantly rely on indirect causal chains ourselves: we learn about quarks, ancient cities, and so on through testimony, diagrams, instruments, and books. LLMs’ complete dependence on such mediation, however, represents a substantial difference from the biological case.

When it comes to the selection requirement, our most straightforward argument appeals to fine-tuning. Modern chatbots are fine-tuned (further trained) to conform to specific norms such as helpfulness, harmlessness, and factual accuracy. When a model’s internal states are selected to produce more accurate answers, the relevant success conditions depend on how the world is. Internal states that help the model produce true answers persist through this further training because they carry information that matters to success in the task. In such cases, we argue, those states fulfill the second condition on referential grounding.

We also argue, more tentatively, that training on text prediction alone may sometimes be enough for referential grounding. This is most plausible in formally constrained domains. A model trained only to predict legal moves in a board game, for example, may develop internal states that track the board state because doing so helps it predict the next move. Mechanistic interpretability research on LLM-like systems trained on board games indicates that they can indeed encode board positions in ways that causally affect their outputs. In certain domains, prediction objectives may select for internal states that track the structure generating the data.
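
For readers unfamiliar with how interpretability researchers test whether internal states "encode" something like a board position, one common tool is a linear probe: a simple classifier trained to read a feature off a model's activation vectors. The sketch below is a toy illustration only. The activations are synthetic, and the encoding direction, the function names, and the noise level are all hypothetical choices made for the example rather than anything drawn from an actual trained model. Note also that a successful probe shows only that the information is decodable; showing that it causally affects outputs requires intervening on the activations, which this sketch does not do.

```python
import random

# Hypothetical, hand-picked direction along which the toy "model" encodes
# whether one board square is occupied. Real activations would come from a
# trained network; nothing here is taken from an actual system.
ENCODING_DIRECTION = [1.0, -1.0, 0.5, -0.5, 1.0, -1.0, 0.5, -0.5]

def make_toy_activations(n, seed):
    """Return n (activation_vector, occupied) pairs of synthetic data."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        occupied = rng.choice([0, 1])
        sign = 1.0 if occupied else -1.0
        # Signal along the encoding direction plus Gaussian noise.
        vector = [sign * d + rng.gauss(0.0, 0.3) for d in ENCODING_DIRECTION]
        samples.append((vector, occupied))
    return samples

def predict(weights, bias, vector):
    """Classify a vector as occupied (1) or empty (0) with a linear rule."""
    score = sum(w * x for w, x in zip(weights, vector)) + bias
    return 1 if score > 0 else 0

def train_linear_probe(samples, epochs=20, learning_rate=0.1):
    """Fit a linear probe with the classic perceptron update rule."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for vector, label in samples:
            error = label - predict(weights, bias, vector)
            if error != 0:
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, vector)]
                bias += learning_rate * error
    return weights, bias

def probe_accuracy(weights, bias, samples):
    """Fraction of samples whose label the probe recovers."""
    hits = sum(predict(weights, bias, v) == y for v, y in samples)
    return hits / len(samples)

train_set = make_toy_activations(200, seed=0)
held_out = make_toy_activations(200, seed=1)
weights, bias = train_linear_probe(train_set)
accuracy = probe_accuracy(weights, bias, held_out)
print(f"held-out probe accuracy: {accuracy:.2f}")
```

High held-out accuracy here reflects only how the toy data were built. With a real model, the philosophically interesting step is the next one: changing the activation along the probed direction and checking whether the model's predicted moves change accordingly.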

Our account has a surprising implication: multimodality and embodiment are neither necessary nor sufficient for referential grounding. A model that can take image pixels as input still needs the right learning history to represent more than patterns in pixel space. Likewise, a language model bolted onto a robotic body to translate natural language commands into low-level action commands doesn’t acquire new referential powers merely because another subsystem moves through the world. What matters is whether the system’s internal states have been selected for carrying information that guides successful behavior.

Since we wrote the first version of the paper, the most interesting cases of multimodal and embodied AI have shifted from models that merely receive pixels or issue commands through a separate controller to models whose training actually couples perception, action, and success. Recent “vision language action” models, such as Google’s Gemini Robotics and Physical Intelligence’s π-series, are trained on combinations of images, language, proprioceptive states, action trajectories, and high-level task annotations. In these systems, visual and bodily states aren’t simply extra inputs appended to a language model after the fact; at least some internal states are selected and stabilized because they help the system choose actions that make the world come out a certain way.

This motivates a more general question: what is the right metasemantics for artificial neural networks (ANNs)? We deliberately appealed to mainstream theories of representation, which were developed to account for biological (cognitive) systems. But it may turn out that the best metasemantics for ANNs is substantially different. What shape(s) such a metasemantics would take is far from clear, but any good metasemantics for ANNs should preserve the explanatory roles that make representation worth positing in the first place: it should distinguish genuine content from mere correlation, explain how error and misrepresentation are possible, tell us which internal states are the relevant vehicles, and help us predict what will happen when we intervene on those states.

New agentic systems also raise interesting questions. Many AI systems now combine one or more LLMs with retrieval, external memory, tool use, code execution, and sometimes action in a computer environment or in the physical world. In such systems, what is the bearer of representational content? The base model? The temporary state of the whole scaffold? A planning module? A tool-using loop stretched over time? We may need a more modular metasemantics for AI systems, one that lets different components acquire content through different histories, functions, and roles in the larger architecture.

Finally, there is another possibility we find especially worth taking seriously and that we only hinted at in the paper. If LLMs and related systems have internal states with content, that content need not line up neatly, or at all, with human concepts. Given their “alien” training histories, alien stabilization and selection pressures, and alien success conditions, their representational contents may also be partly or fully alien. A model may track features of the world that matter for prediction, reward, or control without carving things the way we do, the human origin of the training data notwithstanding. If it is true that LLMs have content, what kind(s) of content do they have?

One comment

  1. Meaning Is Not a Matter of Causal Connectivity
    Coelho Mollo and Millière ask whether the internal states of large language models can be referentially grounded. The question is misconceived. It presupposes that meaning and representation are in principle explicable through causal and functional relations, and that what remains is merely to identify the right ones. But this is precisely the philosophical commitment that needs to be argued for, not assumed. Understanding meaning is not a matter of causal connectivity, whether symbolic or vectorial. It is a matter of intentionality, and intentionality cannot be captured by causal relations alone.
    The distinction between symbols and vectors, which the paper treats as a conceptual advance, is irrelevant at this level. In both cases, meaning enters the system from outside, through attribution: by the humans who produce the training corpus, set the optimization targets, and interpret the outputs. Causal chains, selection histories, and fine-tuning may show that internal states correlate with features of the world. But correlation with the world is not the same as understanding. A thermometer correlates with temperature without understanding temperature. Adding causal complexity changes nothing in principle. Machines do not understand meaning, regardless of how elaborately their internal states are causally connected, because understanding is a phenomenon that no causal description exhausts. The vector grounding problem is not a harder version of the symbol grounding problem. It is the same mistake in a more sophisticated idiom.
