Wade Munroe, University of Nebraska, Lincoln
There is much to appreciate in Bence Nanay’s monograph. In particular, Nanay’s analysis of mental imagery is especially insightful. The expression, “mental imagery,” is a technical term introduced in experimental psychology’s nascency. As Nanay rightly argues, our use of, “mental imagery,” qua term of art, should aim to maximize theoretical usefulness in cognitive explanation. Although it’s still common to treat mental imagery as a necessarily conscious phenomenon, Nanay establishes that mental imagery ought to be analyzed without recourse to conscious experience. Instead, Nanay defines mental imagery as perceptual representation that is not directly triggered by sensory input.
Although I have quibbles with various aspects of Nanay’s definition (in particular, his use of the expression “not directly triggered” and his distinction between triggering and modification), for the sake of space I focus my comments on Nanay’s claims about the relationship between language and mental imagery in chapter 19. Nanay boldly asserts that “language processing itself essentially involves mental imagery” (p. 148), a claim I will call ‘LEIM’ for short. Although it is never explicitly stated, we can gather what Nanay means by “language processing” from the evidence he cites in favor of LEIM:
- Evidence that linguistic labels facilitate object recognition, visual object detection, and visual search (e.g., Gary Lupyan’s Label-Feedback Hypothesis; Edmiston & Lupyan, 2015; Lupyan, 2012; Lupyan, Mirman, Hamilton, & Thompson-Schill, 2012; Lupyan & Swingley, 2012).
- Evidence that concrete words are easier to recall than abstract words (Walker & Hulme, 1999).
- Evidence for the use of mental and motor imagery (in Nanay’s sense of the terms) in discourse comprehension in which a comprehender constructs a discourse representation over several sentences/utterances (Zwaan, 2016).
Thus, by “language processing,” Nanay doesn’t mean the operations of the language faculty in the narrow sense (Hauser, Chomsky, & Fitch, 2002) nor the operations of the language faculty in the broad sense in, say, the selection of morphophonological elements, the bundling of phonemic segments into syllables, etc. LEIM seems to be restricted in scope to discourse comprehension and semantic processing, e.g., single word comprehension, sentence comprehension, semantic association, pre-linguistic message construction that subsequently initiates linguistic processing proper (or form encoding) in speech planning, etc.
So, how does LEIM fare when restricted to discourse or semantic processing? It’s telling that the authors Nanay cites explicitly distance themselves from LEIM. For instance, Nanay cites an article by Michelle Liu (2022) on the role of mental imagery in polysemy processing in which Liu writes,
It is worth noting that the simulation view [i.e., grounded accounts that take conceptual processing to frequently involve the simulation or ‘neural reuse’ (Anderson, 2010) of modality-specific systems—systems that subserve perception and action] need not reject the claim that language processing involves abstract symbol manipulation. Nor must it commit to the claim that simulation is necessary or sufficient for all language processing. The key idea is that, in many situations, simulation is part of understanding language. (ibid, p. 177; emphasis mine)
Similarly, Nanay cites an article by Fabrizio Calzavarini (2019) on the pictorial view of meaning in which Calzavarini writes,
On the one hand, some neuroimaging and behavioural/patient data support the hypothesis that visual (imagery) cortex, particularly in the left posterior/middle fusiform gyrus, plays an active and perhaps critical role in semantic competence, as far as concrete, high IMG words are concerned [where a high IMG word, in Calzavarini’s sense, is a word that “appears, introspectively, to be quickly and spontaneously associated with a mental image”]. On the other hand…I argued that more precise anatomical evidence is needed in order to conclusively demonstrate that preservation of visual (imagery) cortex is a necessary, essential condition for the understanding of high IMG words…(ibid., pp. 53-54; emphasis mine)
Several meta-analyses find that traditional language areas are the most likely to exhibit increased activation with low IMG/abstract words (e.g., Binder, Desai, Graves, & Conant, 2009; Xiaosha Wang et al., 2018). Thus, as implicitly suggested by Calzavarini, the strongest case for LEIM can be made in the domain of high IMG/concrete words (or the processing of corresponding concepts).[1] However, even when we restrict focus to the comprehension of high IMG/concrete words, LEIM is implausibly strong. Let’s take color expressions, which are high IMG/concrete expressions par excellence. There is considerable similarity in the color knowledge of sighted and congenitally blind persons, e.g., knowledge of the color of various objects (Xiaoying Wang, Men, Gao, Caramazza, & Bi, 2020), as well as in their abilities to reason about, e.g., the structure of color space (e.g., red and orange are closer than red and blue; Saysani, Corballis, & Corballis, 2018) or why something might possess a particular color (e.g., polar bears have black skin to better absorb heat in the Arctic climate; Kim, Aheimer, Manrara, & Bedny, 2020). Additionally, a variety of neuroimaging work shows that, while sighted persons engage both posterior visual areas and anterior language-related areas during tasks that tap color knowledge, congenitally blind persons engage only the latter (Bi, 2021).
We don’t merely use language as a covert cue to activate information realized in brain areas outside of those that subserve the language faculty, that is, areas where real semantic or conceptual processing occurs. Instead, language is itself a proper part of semantic and conceptual processing. Linguistic distributional statistics (roughly, linguistic-entity-to-linguistic-context co-occurrences, e.g., lexical-item-to-lexical-item or lexical-item-to-discourse-context co-occurrences) serve as a rich source not only of semantic information (e.g., information about word meanings) but also of information about the extralinguistic world encoded in the utterances of language users. Color knowledge of the congenitally (or early) blind is best represented with distributional semantic models that abstract and generalize over linguistic distributional statistics to represent word meanings as vectors in a semantic space (Günther, Rinaldi, & Marelli, 2019). Congenitally (or early) blind persons extract information about color from language and reason with the information using the same faculties involved in parsing and producing utterances about color.
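To make the idea of a distributional semantic model concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: the six-sentence corpus is invented for this example (it is not data from any of the studies cited), the function names are hypothetical, and the models discussed by Günther et al. are trained on very large corpora with far more sophisticated methods than raw co-occurrence counts. The sketch shows only how purely linguistic co-occurrence can place "red" nearer to "orange" than to "blue" without any perceptual input.

```python
from collections import Counter
from math import sqrt

# Toy corpus, invented purely for illustration (an assumption, not real data).
CORPUS = [
    "the red fire glows warm",
    "the orange fire glows warm",
    "the red sunset looks warm",
    "the orange sunset looks warm",
    "the blue sea feels cold",
    "the blue sky looks cold",
]


def cooccurrence_vectors(sentences):
    """Map each word to counts of the other words that share a sentence with it."""
    vectors = {}
    for sentence in sentences:
        words = sentence.split()
        for i, target in enumerate(words):
            context = words[:i] + words[i + 1:]
            vectors.setdefault(target, Counter()).update(context)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = cooccurrence_vectors(CORPUS)
red_orange = cosine(vectors["red"], vectors["orange"])
red_blue = cosine(vectors["red"], vectors["blue"])
print(f"red~orange: {red_orange:.2f}, red~blue: {red_blue:.2f}")
```

In this toy corpus "red" and "orange" occur in identical linguistic contexts, so their vectors are more similar to each other than either is to "blue". Nothing here settles the empirical question, but it shows in miniature how something like the structure of color space could in principle be recovered from language alone.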
I am happy to accept that, in many situations, simulation is part of semantic processing. However, if there is some domain of semantic processing that essentially relies on mental imagery, I’ve yet to see much evidence for it. Pace Nanay, language isn’t the ‘icing on the cake’ (p. 146) of simulation or neural reuse of sensorimotor areas in semantic processing. Language is part of the batter.
Works Cited
Anderson, M. L. (2010). Neural reuse: A fundamental organizational principle of the brain. Behavioral and Brain Sciences, 33(4), 245-266.
Banks, B., Borghi, A. M., Fargier, R., Fini, C., Jonauskaite, D., Mazzuca, C., . . . Woodin, G. (2023). Consensus paper: Current perspectives on abstract concepts and future research directions. Journal of Cognition, 6(1), 62.
Bi, Y. (2021). Dual coding of knowledge in the human brain. Trends in Cognitive Sciences, 25(10), 883-895.
Binder, J. R., Desai, R. H., Graves, W. W., & Conant, L. L. (2009). Where is the semantic system? A critical review and meta-analysis of 120 functional neuroimaging studies. Cerebral Cortex, 19(12), 2767-2796.
Calzavarini, F. (2019). The empirical status of the pictorial view of meaning. Journal of Consciousness Studies, 26(11-12), 33-59.
Edmiston, P., & Lupyan, G. (2015). What makes words special? Words as unmotivated cues. Cognition, 143, 93-100.
Günther, F., Rinaldi, L., & Marelli, M. (2019). Vector-space models of semantic representation from a cognitive perspective: A discussion of common misconceptions. Perspectives on Psychological Science, 14(6), 1006-1033.
Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002). The faculty of language: What is it, who has it, and how did it evolve? Science, 298(5598), 1569-1579.
Kim, J. S., Aheimer, B., Manrara, V. M., & Bedny, M. (2020). Shared understanding of color among congenitally blind and sighted adults.
Liu, M. (2022). Mental imagery and polysemy processing. Journal of Consciousness Studies.
Lupyan, G. (2012). Linguistically modulated perception and cognition: The label-feedback hypothesis. Frontiers in Psychology, 3, 54.
Lupyan, G., Mirman, D., Hamilton, R., & Thompson-Schill, S. L. (2012). Categorization is modulated by transcranial direct current stimulation over left prefrontal cortex. Cognition, 124(1), 36-49.
Lupyan, G., & Swingley, D. (2012). Self-directed speech affects visual search performance. Quarterly Journal of Experimental Psychology, 65(6), 1068-1085.
Saysani, A., Corballis, M. C., & Corballis, P. M. (2018). Colour envisioned: Concepts of colour in the blind and sighted. Visual Cognition, 26(5), 382-392.
Walker, I., & Hulme, C. (1999). Concrete words are easier to recall than abstract words: Evidence for a semantic contribution to short-term serial recall. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25(5), 1256.
Wang, X., Men, W., Gao, J., Caramazza, A., & Bi, Y. (2020). Two forms of knowledge representations in the human brain. Neuron, 107(2), 383-393. e385.
Wang, X., Wu, W., Ling, Z., Xu, Y., Fang, Y., Wang, X., . . . Bi, Y. (2018). Organizational principles of abstract words in the human brain. Cerebral Cortex, 28(12), 4305-4318.
Zwaan, R. A. (2016). Situation models, mental simulations, and abstract concepts in discourse comprehension. Psychonomic Bulletin & Review, 23, 1028-1034.
[1] Arguably, expressions don’t lie on a one-dimensional, concrete-abstract spectrum. The notions of ‘concreteness’ and ‘abstractness’ are multidimensional (Banks et al., 2023). However, it’s beyond the scope of this post to discuss the issue further.
I want to identify the key issue in the debate about the extent to which language and thought are symbol- and/or image-based (and would welcome comments). We all know that there is an issue of symbol grounding: how can symbols connect to the real-world bodies/referents they refer to? The answer is that they can’t. There is no way to connect C-A-T and the real-world body, but people clearly feel they can leave this issue to one side. So let’s identify an issue that cannot be ignored.
The issue is symbol connectivity.
The most impressive faculty we demonstrate in language is arguably not sentence generativity but body connectivity and generativity. We can endlessly reconnect and recompose the bodies of the world (a natural language category embracing all the kinds of bodies in the world) in new combinations/compositions and scenes. So we can say/write THE CAT SAT ON THE MAT … THE CAT SAT ON THE MAT BESIDE THE TABLE or … THE MAT ON THE TABLE or … UNDER THE TABLE or THE CAT SAT ON THE HAT ON THE TABLE or … FELL THROUGH THE HOLE IN THE TABLE or … THE HOLE IN THE FLOOR. And so on ad infinitum. We can endlessly recompose any and all bodies in any and all the fields of the world and scenes of those body-field interactions. We can similarly move CARS, FURNITURE, FLIES, GUNS, PLATOONS, ARMIES et al. round the fields of the world, such as streets, mountainsides, pools, stadia, etc.
The question is: where does this body and body-field connectivity come from? It can’t come from symbol connectivity. You see, symbols also don’t and can’t connect to each other, let alone to real-world referents. There is no intrinsic connection between CAT, MAT, TABLE, HAT, etc. or any other words. So how am I able to endlessly keep forming new connections and new scenes of the bodies they name? Trivially, we can establish rules, as LLMs and other programs will, such as CATS CAN SIT ON TABLES … MATS CAN SIT BESIDE TABLES and ON THE FLOOR …, which may suggest more connections. But very soon we’ll run out of connections, whereas the human mind can make body connections ad infinitum, beyond the scope of any set of rules: THE CAT CLUNG TO THE UNDERSIDE OF THE TABLE … SLITHERED DOWN THE TABLE LEG … WAS CRUSHED BY THE TABLE FALLING ON IT …
There’s no problem explaining this connectivity if we assume it is based on unconscious/subconscious image connectivity, “the connectivity of body images”. Give us just pictographics of the bodies of cats, tables, mats, etc. and we can clearly endlessly reposition them in, say, a room and recompose them in a vastly greater variety of combinations than language has the capacity to express. Our ability to produce and understand
And there’s no problem explaining how body connectivity in images connects to real-world body connectivity: how we can use images of bodies in scenes to connect to real bodies in real fields. Images of bodies map onto the real bodies in the real world, however loosely.
Nor is there any problem seeing how body connectivity in language could be connecting unconsciously to images of bodies in the mind and thence to real bodies in the world. When I hear CATCH THE BALL and my hand shoots out and catches the ball physically, all without any conscious imaging, it has to be very likely that my unconscious mind generates an image of a hand catching the ball which then guides my physical action of catching. There is no way that symbols CATCH and BALL could connect to my hand and the ball.
Bear in mind that this is no secondary matter. Our entire lives and activities depend on being directed by language prescriptions … CLEAN THIS ROOM … GO TO THE SUPERMARKET/TOILET/KITCHEN … FRY THE MEAT … LET’S MAKE LOVE … which do very successfully guide real body connections in the real world. Our ability to endlessly generate new connections of bodies in language connects both to a capacity to endlessly generate new connections in images and thence to a corporeal capacity for endless connectivity of real bodies in the real world.
So that’s the double issue of symbol connectivity: how can symbols, especially words, connect to each other and generate the body connectivity of language and thence the interdependent body connectivity of our activities in the real world?
If you think this is somehow possible, you must explain symbol connectivity.
The reality is that if all the mind has is pure words, we are like the blind men and the elephant, with no ability to connect the word descriptions of the different parts of the elephant. We have pieces of the puzzle (the pieces being word names for different bodies in a given scene, like a puzzle scene) but no capacity to put the pieces, the different bodies, together.
HE HAD VERY HAIRY EYEBROWS, BLUBBERY LIPS, AND BULGING EYEBALLS … Now try and draw a coherent picture of his face.