Mallory: Representation in LLMs

Formats of representation in large language models

Fintan Mallory

If we assume that artificial neural networks (ANNs) can, through training, come to represent the data upon which they were trained, or more general features of that data, two natural questions arise: what are the vehicles of representation within the network, and in what format are these vehicles structured? This paper argues against giving a monolithic answer to either of these questions. Rather than seeking the vehicles or the format of representation, we should allow for the possibility that ANNs contain multiple vehicles and multiple formats. In a sense, this shouldn’t be controversial. Natural languages contain words that refer to objects, sentences that express truth-conditions, and different mechanisms for indicating force. The central nervous system appears to have multiple vehicles and formats of representation. Complex systems of representation can often be analysed as a combination of several smaller interacting systems, and any analysis of ANNs should be sensitive to the particular operations and activation functions that an architecture implements. We should also distinguish between representational contents stored in a network’s weights and the contents of activation vectors which pass through the network.

In the case of transformer language models, the paper proposes that representational content is borne both by features encoded as directions in the embedding space (where a ‘direction’ is a geometric abstraction over the actual values encoded by weights, and may be encoded across multiple neurons) and by the distances between activations, as encoded within the QK^T matrix of an attention head. It also suggests that some representations in the system are purely nominal, in that there is no meaningful structural relationship between the system of vehicles and their contents (e.g. some tokenisation methods). Others have an analogue structure, where the structure of the system of representation mirrors some structure in the world, and others allow for structural representation, where the relations between the vehicles of representation correspond to relations between the things represented. In short, language models present us with multiple kinds of vehicles and formats of representation within their architecture.

One reason for distinguishing multiple vehicles of representation in ANNs is to identify how different computational architectures render different contents accessible. The paper suggests that attention heads are expressively powerful because they render information implicit in the relations between activations exploitable. For a relation between vehicles to represent a relation between things (i.e. structural representation), there need to be operations that exploit that relation between vehicles. We don’t find this in a network like an RNN, where information is compressed into a hidden state vector and where there is no operation for computing the relations between embeddings (e.g. the dot product). By making the content that is implicit in the distance between embeddings accessible to computational operations, attention heads increase the expressive power of ANNs.
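To make the operation at issue concrete, here is a minimal Python sketch of the pre-softmax score matrix QK^T of a single attention head, in which entry (i, j) is the dot product of the query for position i with the key for position j. The embeddings and the projection matrices W_Q and W_K are hypothetical toy values (identity projections, chosen for readability), and scaling, masking and softmax are omitted. It is an illustration under those assumptions, not a description of any particular model.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def attention_scores(x, w_q, w_k):
    """Return Q K^T, where Q = x w_q and K = x w_k.

    Entry [i][j] is the dot product of query i with key j, i.e. a value
    that depends on a relation between two activations.
    """
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    return matmul(q, transpose(k))


# Hypothetical toy embeddings for three token positions (illustrative only).
X = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]

# Hypothetical identity projections, so that Q = K = X.
W_Q = [[1.0, 0.0], [0.0, 1.0]]
W_K = [[1.0, 0.0], [0.0, 1.0]]

SCORES = attention_scores(X, W_Q, W_K)

for row in SCORES:
    print(row)
```

With identity projections the score matrix is simply the table of pairwise dot products between the embeddings. Each entry is a quantity that downstream computation can use and that is a function of a pair of vehicles rather than of either vehicle alone.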

In the paper, I distinguish analogue and structural representation and define both in terms of (partial) isomorphism. Corey Maley’s paper argues against a distinction between analogue and structural representations and in favour of making our distinctions about representation in terms of homomorphism rather than isomorphism. I’m happy to concede the latter but not the former.

Some quick definitions:

Given two sets X and Y, structured by relations R and R’ respectively, a function f from X to Y is a homomorphism iff, for all x1, x2 in X, R(x1, x2) if and only if R’(f(x1), f(x2)).

Given two sets X and Y, structured by relations R and R’ respectively, a function f is a partial isomorphism iff it is a bijection between a subset of X and a subset of Y and, for all x1, x2 in its domain, R(x1, x2) if and only if R’(f(x1), f(x2)).

The appeal to partial isomorphism tries to avoid some of the problems Maley identifies with strict isomorphism by dropping the demand that every element in each set is mapped, but it is still a stronger condition than homomorphism. Homomorphism only requires that the mapping from X to Y is structure preserving, not that it is a bijection. This allows two entities in X to map to the same entity in Y, whereas isomorphism requires that the items in X all map to distinct contents. Why should we preclude redundancy in our system of representation? In hindsight, this is a very strange condition and it’s odd that I (and others) ever thought it was a good idea. Maley is right: speaking in terms of homomorphism rather than isomorphism is preferable.
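The difference between the two conditions can be checked mechanically on a toy case. The following Python sketch assumes the biconditional form of structure preservation used in the definitions above. The vehicle names, the LEVEL ordering and the numerical contents are all invented for illustration. Two vehicles map to the same content, so the mapping preserves structure without being injective.

```python
from itertools import product


def preserves_structure(f, r, r_prime):
    """True iff, for all x1, x2 in f's domain,
    R(x1, x2) holds exactly when R'(f(x1), f(x2)) holds."""
    return all(
        ((x1, x2) in r) == ((f[x1], f[x2]) in r_prime)
        for x1, x2 in product(f, repeat=2)
    )


def is_injective(f):
    """True iff no two elements of the domain share an image."""
    return len(set(f.values())) == len(f)


# Hypothetical vehicles: v1 and v2 are redundant (same level), v3 is higher.
LEVEL = {"v1": 1, "v2": 1, "v3": 2}

# R on vehicles: "is at a level no higher than".
R = {(a, b) for a, b in product(LEVEL, repeat=2) if LEVEL[a] <= LEVEL[b]}

# Hypothetical contents: two magnitudes ordered by <=.
CONTENTS = (10, 20)
R_PRIME = {(p, q) for p, q in product(CONTENTS, repeat=2) if p <= q}

# The mapping sends both redundant vehicles to the same content.
F = {"v1": 10, "v2": 10, "v3": 20}

print("structure preserving:", preserves_structure(F, R, R_PRIME))
print("injective:", is_injective(F))
```

On this toy case the mapping satisfies the homomorphism condition but fails the bijection requirement, which is the kind of redundancy that an isomorphism-based definition would rule out.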

I’m less keen, however, to reject the distinction in type between analogue and structural representation. I agree with work that claims that structural representation occurs when the analogue format of a system of representation is exploited, that is, when the relations between vehicles play a role in computations about the relations between their contents. There is a utility to a system of representation having analogue structure: for example, errors arising from noise can be minimised where slight differences between vehicles amount only to slight differences between the contents represented (Shea, 2023). But unless there are computations directly acting on the relations over this structure, no structural representation occurs. This is independent of the number of dimensions of analogue structure involved. To see this, it may help to distinguish questions of representational medium (DM: how many dimensions of variation does a medium allow for?) from questions of format (DV: which of these dimensions are computational operations using the medium for the purpose of representation sensitive to?). We can keep adding dimensions of variation to a medium without adding the operations required to exploit them (while there’s no guarantee that DM > DV).
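A small Python sketch may help fix the distinction between medium and format. It assumes, purely for illustration, a medium whose vehicles vary along three dimensions and a single comparison operation that reads only two of them. The vectors, the choice of dimensions and the names D_M and D_V (standing in for DM and DV) are hypothetical.

```python
def similarity(u, v, used_dims):
    """Dot product restricted to the dimensions the operation reads."""
    return sum(u[i] * v[i] for i in used_dims)


# Hypothetical format: the operation is sensitive to dimensions 0 and 1 only.
USED_DIMS = (0, 1)

# Hypothetical vehicles in a three-dimensional medium.
A = (1.0, 2.0, 5.0)
B = (3.0, 1.0, 0.0)
B_SHIFTED = (3.0, 1.0, 9.0)  # differs from B only along dimension 2

D_M = len(A)          # dimensions of variation the medium allows
D_V = len(USED_DIMS)  # dimensions the operation is sensitive to

print("D_M =", D_M, "D_V =", D_V)
print(similarity(A, B, USED_DIMS), similarity(A, B_SHIFTED, USED_DIMS))
```

Varying a vehicle along the unread dimension leaves the result of the operation unchanged, so on this sketch the extra dimension belongs to the medium without contributing to any structural representation.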

References

Maley, C. (2026). Structural representation is analog representation. Philosophy and the Mind Sciences, 7(1). https://doi.org/10.33735/phimisci.2026.12205

Shea, N. (2023). Organized representations forming a computationally useful processing structure. Synthese, 202(6), 175.

Thanks to Corey Maley for some discussion of these points.