Frames of Discovery and the Format of Cognitive Representation

By Dimitri Coelho Mollo & Alfredo Vernazzani

A central assumption in contemporary cognitive science and AI is that cognition involves internal representations. Much like public, external representations, such as texts, pictures, and maps, internal representations carry contents (i.e., they are about something) and are implemented by vehicles (i.e., physical states that realize them). In addition, just as public representations come in different formats—say, a photo vs. a textual description—it is widely believed that internal representations have formats.

Appeal to the formats of internal representations (i.e., cognitive formats) matters for at least two types of cognitive scientific explanation (Coelho Mollo & Vernazzani 2024): 

  • Transformation-based explanations: explaining cognitive capacities in terms of the specific manipulations a system can perform over internal representations. 
  • Efficiency-based explanations: explaining why certain representational formats are particularly suitable for specific tasks (e.g., because they permit faster or simpler cognitive processing). 

Historically, the study of cognitive formats has been dominated by analogies to public representations, inviting either monist or modestly pluralistic accounts: natural language is likened to a language-like mental code; pictures to iconic perceptual representations; and maps to cognitive maps. In our contribution to the volume, we argue that framing research on cognitive formats in terms of analogies to external, public representation formats is risky and can often be misleading.

Frames of Scientific Inquiry 

A helpful way of viewing scientific inquiry is in terms of scientific frames: conceptual schemas that guide discovery, shape hypotheses, and highlight some explanatory questions over others (Camp 2019). In the case of cognitive formats, influential approaches rely on conceptual schemas and hypotheses drawn from features of public representations—a public representation frame. 

Such framing has several attractive features. Analogical reasoning is widespread in science (Hesse 1966): just as the wave equation applies to both water and electromagnetic waves, one might hope that insights about public representational systems transfer to internal, cognitive representations. It also provides researchers with a conceptual starting point for theorizing about cognitive formats, namely by hypothesizing functional and structural similarities between public and cognitive representations.

For example, Fodor (2008) distinguishes compositionality in pictures from that found in sentences, holding that a picture of X represents X by having parts that depict parts of X—what he calls the Picture Principle. This point is extended to help characterize and distinguish putative language-like and picture-like (or iconic) internal representations. Quilty-Dunn (2016) applies this to visual processing, arguing that because visual scene segmentation into discrete objects seems to require a canonical decomposition—a capacity not afforded by purely iconic representation—visual object representations cannot be iconic. 

But analogies, in science and elsewhere, only go so far. 

Public vs. Cognitive Representations 

Public and cognitive representations share the minimal features required to count as representations: they have contents, are realized by physical vehicles, stand in suitable content-fixing relations to what they are about, and play representational functional roles within and across cognitive systems. Beyond these basics, however, they differ profoundly: 

  • Vehicles: Public representations use artefactual media (e.g., ink or pixels), constrained by communicative and socio-historical factors. Cognitive vehicles are neural states, shaped by evolutionary and developmental pressures. 
  • Content-fixing relations: Public representations get their contents from a complex interplay of communicative intentions, conventions, and social practices. Cognitive representations get their contents via natural, informational and teleofunctional relations (Millikan 2017; Neander 2017; Shea 2018; see Coelho Mollo 2022 on the dangers of the public representation analogy for theories of content).
  • Functional roles: Public representations primarily serve communication between agents. Cognitive representations operate within an agent, with downstream subsystems responding mechanically rather than interpretively. 

These disanalogies, we argue, invite caution in applying the public representation frame to cognitive formats. We illustrate its shortcomings with two case studies. 

1. The Compositionality Debates 

In the late 1980s, connectionist approaches in AI and cognitive science challenged the dominant symbolic approach, represented by the Language of Thought Hypothesis (Fodor 1975). Cognition can be realized, they suggested, by subsymbolic, implicit, non-rule-based means. Fodor & Pylyshyn (1988) attacked connectionism on the grounds that thought is productive and systematic, hence compositional, which they took to require a language-like format. Since connectionist networks mostly lack such a format, they were deemed incapable of supporting compositional thought, and thus unsuitable as a theory of (human) cognitive architecture.

As Chalmers (1993) noted, however, this argument displays a “lack of imagination”: other kinds of format may suffice for productive and systematic thought and behavior within relevant task-bounds. Indeed, later work indicates that neural networks can approximate compositional competence without explicit symbolic structures and rules, but rather via a complex system of soft constraints (Smolensky 1987, 1990; Lepori, Serre & Pavlick 2024)—a representational structure that lacks clear public analogues.
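To give a flavor of how structure can be encoded without an explicitly language-like format, here is a minimal sketch in the spirit of Smolensky's (1990) tensor product variable binding. The role and filler vectors below are toy, hand-picked orthonormal vectors chosen for illustration; actual connectionist models work with learned, distributed, and typically non-orthogonal representations.

```python
# Sketch of tensor-product variable binding (Smolensky 1990):
# structural roles and conceptual fillers are vectors; a binding is
# their outer product; a whole structure is the superposition (sum)
# of its bindings. All vectors here are illustrative toys.

def outer(u, v):
    """Outer product of two vectors, flattened row-major."""
    return [ui * vj for ui in u for vj in v]

def add(a, b):
    """Elementwise sum: superimpose two bindings in one vector."""
    return [x + y for x, y in zip(a, b)]

def unbind(rep, role, dim_filler):
    """Recover the filler bound to `role` by contracting with the
    role vector. Exact only when role vectors are orthonormal."""
    return [sum(rep[i * dim_filler + j] * role[i] for i in range(len(role)))
            for j in range(dim_filler)]

# Orthonormal role vectors for the positions "agent" and "patient".
agent, patient = [1.0, 0.0], [0.0, 1.0]
# Filler vectors standing in for the concepts JOHN and MARY.
john, mary = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]

# "John loves Mary": one distributed vector, no discrete symbols.
sentence = add(outer(agent, john), outer(patient, mary))

print(unbind(sentence, agent, 3))    # recovers the JOHN vector
print(unbind(sentence, patient, 3))  # recovers the MARY vector
```

The point of the sketch is that the resulting vector supports systematic queries (who is the agent? who is the patient?) even though no constituent of the vector is a discrete, sentence-like symbol—the structure lives in the geometry of the superposition.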

2. Visual Demonstratives 

Perceptual Object Representations (PORs) explain how vision binds an object’s features together and re-identifies it across time. Some theorists posit that PORs include a visual demonstrative or pointer, inspired by linguistic demonstratives (“this F”), to explain dynamic object tracking: the ability to keep track of an object even as some of its features (e.g., color, shape) change (Pylyshyn 2007; Green & Quilty-Dunn 2021). 

Rubner & Vernazzani (ms) argue that dynamic tracking can be explained without positing demonstratives. Empirical evidence shows that the visual system flexibly relies on sets of properties—spatiotemporal (e.g., trajectory) and surface properties (e.g., color, shape)—for tracking. They propose that an object is tracked across time if the sets of properties guiding reference at successive times have a non-empty intersection. If some relevant properties persist (e.g., continuous trajectory), the object can be re-identified (see also Vernazzani 2022).

This account predicts that tracking would fail only if all tracking-relevant properties changed simultaneously—a rare situation—thus explaining tracking robustness without invoking demonstratives or language-like formats. Reliance on a linguistic analogy (demonstratives) may have overshadowed this simpler hypothesis. 
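The intersection criterion above can be stated as a one-line rule. The sketch below is an illustrative toy, not Rubner & Vernazzani's model: the property labels and set-based encoding of "tracking-relevant properties" are assumptions made for the example.

```python
# Minimal sketch of the non-empty-intersection tracking criterion:
# an object is re-identified across successive times iff the sets of
# tracking-relevant properties guiding reference at those times
# overlap. Property labels are illustrative placeholders.

def tracked(props_t, props_t_next):
    """True iff the property sets at successive times share a member."""
    return bool(set(props_t) & set(props_t_next))

# Colour and shape change, but the continuous trajectory persists:
t0 = {"red", "round", "trajectory-A"}
t1 = {"green", "square", "trajectory-A"}
print(tracked(t0, t1))   # True: tracking survives the feature changes

# All tracking-relevant properties change at once (the rare case):
t2 = {"blue", "oval", "trajectory-B"}
print(tracked(t1, t2))   # False: tracking is predicted to fail
```

The rule makes the robustness prediction vivid: tracking fails only in the degenerate case where the intersection is empty, i.e., when every tracking-relevant property changes simultaneously.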

Toward a Computational Account of Formats 

These case studies show that the public representation frame can misdirect research on cognitive formats, constraining scientific imagination in unfruitful ways (case 1); and privileging certain questions and hypotheses over potentially more promising ones (case 2).

We have argued elsewhere that cognitive formats are to be accounted for in purely computational terms instead (Coelho Mollo & Vernazzani 2024). Our approach avoids unwarranted constraints coming from the public representation frame and opens the door to recognizing the ample variety of formats that may underlie cognition.

References

Camp, Elisabeth (2019) “Perspectives and Frames in Pursuit of Ultimate Understanding” in S. Grimm (ed.) The Varieties of Understanding (pp. 17-45). New York: Oxford University Press.

Chalmers, David J. (1993) “Connectionism and Compositionality: Why Fodor and Pylyshyn Were Wrong” Philosophical Psychology 6(3), pp. 305-319.

Coelho Mollo, Dimitri & Alfredo Vernazzani (2024) “The Formats of Cognitive Representations: A Computational Perspective” Philosophy of Science 91, pp. 682-701.

Coelho Mollo, Dimitri (2022) “Deflationary Realism: Representation and Idealisation in Cognitive Science” Mind & Language 37(5), pp. 1048-1066.

Fodor, Jerry (1975) The Language of Thought. New York: Thomas Crowell.

Fodor, Jerry (2008) LOT2. New York: Oxford University Press.

Fodor, Jerry & Zenon Pylyshyn (1988) “Connectionism and Cognitive Architecture: A Critical Analysis” Cognition 28 (1-2), pp. 3-71.

Green, E.J. & Jake Quilty-Dunn (2021) “What is an Object File?” British Journal for the Philosophy of Science 72(3), pp. 665-699. 

Hesse, Mary (1966) Models and Analogies in Science. Indiana: University of Notre Dame Press.

Lepori, Michael, Serre, Thomas & Pavlick, Ellie (2024) “Break it down: Evidence for Structural Compositionality in Neural Networks” Advances in Neural Information Processing Systems 36, pp. 42623-42660. 

Millikan, Ruth G. (2017) Beyond Concepts. New York: Oxford University Press.

Neander, Karen (2017) A Mark of the Mental. Cambridge, MA: MIT Press.

Pylyshyn, Zenon (2007) Things and Places. Cambridge, MA: MIT Press.

Quilty-Dunn, Jake (2016) “Iconicity and the Format of Perception” Journal of Consciousness Studies 23(3-4), pp. 255-263.

Rubner, Andrew & Alfredo Vernazzani (ms) “The Compresent Properties View of Object Perception.”

Shea, Nicholas (2018) Representation in Cognitive Science. New York: Oxford University Press.

Smolensky, Paul (1987) “The Constituent Structure of Connectionist Mental States” Southern Journal of Philosophy Supplement 26, pp. 137-160.

Smolensky, Paul (1990) “Tensor Product Variable Binding and the Representation of Symbolic Structures in Connectionist Systems” Artificial Intelligence 46, pp. 159-216.

Vernazzani, Alfredo (2022) “Do We See Facts?” Mind & Language. DOI: 10.1111/mila.12336. 
