John Zeimbekis: Malleability or cognitive effects on recognition?

John Zeimbekis

It’s a relief to see a new book about how perception interacts with thought. Decades of work on modularity and other distinguishing traits of perception (perception as nonconceptual, analog, iconic, unstructured) leave us with a picture of thought as informationally separate from perception, but few suggestions about how the states connect to support procedures like perceptual justification or the formation of perceptual beliefs. Taking this to be a key attraction of the book, I’ll focus on how it deals with processes in late vision: recognition and categorization. It seems to me that by classifying these tasks under late vision, which they take to be cognitively penetrable, modularists can offer competing explanations of the phenomena that Dustin explains by using the concept of malleability.

The current debate on cognitive penetrability focuses on whether some parts of visual perception are modular. This doesn’t seem to be an ad hoc move by modularists to keep encapsulation defensible just by shifting the goalposts. Pylyshyn (1999) offered a restricted version of Fodor’s modularity claims by using Marr’s (1982) concept of early vision, which was established independently. Like Marr, he held that visual object recognition, albeit visual, is a cognitive process. From this way of setting up the debate, the minimal claim I want to keep is that recognition presupposes some modular visual processing that outputs representations of basic properties like shape. Thus, comparisons of visual memories to current visual representations of shapes presuppose, as relata, shape representations that are outputs of visual processing distinct from the comparisons themselves.

According to Dustin, experts have different perceptions from novices because their attentional focus is unconsciously influenced by perceptual learning. The evidence is from fluency in categorization. Which state would be influencing which other state, in that case? Perceptual categorization requires not just concepts but perceptual memories of categories or exemplars (e.g., visual templates that include information about shape, colour patterns, motion and other basic properties). The memories can either be part of concepts or else they can be associated with concepts, depending on one’s theory of concepts: on whether or not the theory takes perceptual skills to be constitutive of concepts as bodies of information supporting a number of competences. I’ll assume that when, in categorization tasks, the perceptions of experts are modulated as a result of perceptual learning, they are modulated by such visual memories (part of, or associated with, a concept for the category).

The findings on radiologists and other experts cited by Dustin suggest that the memorized information affects saccadic patterns. But the time span (pp. 137, 159 give 200 ms) is large enough to include late vision as described by modularists. It is more than enough for saccades and attentional allocation to be part of the visual comparison process. The visual system’s comparisons would consist of unconscious feedback from visual memory storage to help search the scene for visual matches (by contrast with Fodor’s (1983) conception of recognition as a form-concept dictionary). How does Dustin here restrict his claim about cognitive influence to early vision? Because if the target is late vision, then his view coincides with Marr’s and Pylyshyn’s.

If recognition is cognitively penetrable, then recognition supported by perceptual expertise (as opposed to just any visual recognition) is not an extra symptom of malleability or penetrability of perception. A difference between novices and experts is that experts (when working as experts) have to discriminate fine-grained subordinates and exemplars, discriminations which are likely to depend on a different form of retrieval from visual memory. Consider Brady et al.’s (2011) hierarchical proposal, in which retrieval of visual memories for exemplars and subordinate categories depends on an initial coarse (basic-level) categorization. Completion of such a process would take longer than basic-level categorization, even for experts, which is consistent with the 200 ms timespan of the findings that Dustin uses. For categories new to novices but familiar to experts, novices would not have formed visual templates, preventing recognition, or would have insufficient memorized information, slowing it down. (On Brady et al.’s model, even the experts would take longer to recognize subordinate category instances than they would to recognize basic-level category instances; novices would perform as well as experts at basic-level categorizations.) But this way of distinguishing experts from novices doesn’t contribute to showing that expertise is a symptom of perceptual malleability. Expertise makes processing more fluent without making it any more subject to cognitive influence than the rest of perceptual recognition.

What about the role of attention in each framework—the malleability framework and the modular one which defines recognition as cognitively penetrable? Attention is selective so it’s closely tied to the possibility of giving different descriptions of a visual scene, and thus to potential epistemological concerns. The same scene—a striker about to shoot—supports different perceptions and descriptions by the goalie and the physio (pp. 226-27; also figure p. 191). The different features attended to by the physio and the goalie yield multiple descriptions (Boghossian’s term, 2006) without any hint that perception supports fact-relativism. For other reasons, when an expert applies a concept for a subordinate category and a novice a concept for a basic-level category, the concepts are again consistent; different descriptions at basic, subordinate, superordinate, and exemplar levels which stand in determination relations are consistent.

The modularist restricts attention-borne cognitive influence by using the attention-shift argument. The argument is a way to distinguish, among visual processes that deliver representations of basic properties, processes that support multiple descriptions from processes that would support inconsistent descriptions. That is the point of “keeping attention fixed” in definitions of cognitive penetration: cognitive influence on the processing that delivers basic property representations, but not influence on attention, would cause inconsistent perceptions. Suppose that visual memories affected shape processing so that, depending on which memories we had, visual shape computation could automatically output representations of concavities as either concave or convex. Then the same visual process would support inconsistent shape assignments equally, and inconsistent conceptual descriptions in categorization. The point of the attention-shift argument is that it doesn’t matter if attention is selective, because should attention happen to focus on the depth cue, vision will output a representation of concavity irrespective of which visual memories we have.

Dustin objects to the attention-shift argument on the grounds that it depends on a spotlight or buffer view of attention, when in fact attention should be seen as part of perception itself. I agree that many forms of attention qualify as part of perception. But for the “thought affects perception” thesis to have any kind of epistemological virtue (to be a “thought improves perception” thesis), Dustin has to presuppose that the role of attention in perception is restricted in the way modularists describe. Because otherwise the physio, the coach or the tree expert would not be experts or have improved discernment; their perceptions and categorizations would just describe new relative facts. That such epistemic predicaments don’t emerge in Dustin’s examples shows that whatever role attention plays, even as part of vision, is not a role in how vision constructs representations of basic properties from given inputs (even if the inputs are attended to, selected, under unconscious influence from visual memories). So it seems to me that Dustin’s thesis—that malleability enables epistemically virtuous expertise in categorization—is conditional on the claim that modules deliver representations of basic properties for purposes of categorization.


Boghossian, P. (2006). Fear of Knowledge: Against Relativism and Constructivism. Oxford University Press.

Brady, T., Konkle, T., and Alvarez, G. (2011). A review of visual memory capacity: Beyond individual items and toward structured representations. Journal of Vision 11(5): 4, 1-34.

Fodor, J. (1983). The Modularity of Mind: An Essay on Faculty Psychology. Cambridge, MA: MIT Press.

Pylyshyn, Z. (1999). Is vision continuous with cognition? Behavioral and Brain Sciences 22: 341-65.

Marr, D. (1982). Vision: A Computational Investigation into Human Representation and Processing of Visual Information. San Francisco: Freeman.
