Welcome to the Brains Blog’s Symposium series on the Cognitive Science of Philosophy! The aim of the series is to examine the use of methods from the cognitive sciences to generate philosophical insight. Each symposium is comprised of two parts. In the target post, a practitioner describes their use of the method under discussion and explains why they find it philosophically fruitful. A commentator then responds to the target post and discusses the strengths and limitations of the method.
In this symposium, Jorge Morales makes the case for philosophers of perception conducting experiments of their own, with Jonathan Cohen providing commentary on Morales’s empirical work on perspectival shape.
* * * * * * * * * * * *
Philosophy of Perception in the Laboratory
Jorge Morales
_________
Many arguments in the philosophy of mind hinge on capturing what experiences are like and what they are about. Philosophers’ main method for learning about the nature and contents of mental states has been to introspect them. Our capacity to introspect, however, is often unreliable, even though introspective reports are meant to reveal empirical truths about the mind. To avoid the muddy waters of introspection, philosophers of mind can take two (mutually non-exclusive) approaches to support the empirical claims in their arguments. The first strategy is to get acquainted with the empirical literature with the goal of finding evidence relevant to their philosophical arguments. The second strategy is to run experiments meant to produce the relevant evidence (Rose & Danks, 2013). In my research, I have followed both approaches, but here I will argue in favor of directly running philosophy-inspired experiments that take advantage of the full set of methods that psychological research has to offer. In particular, I will focus on testing questions in the philosophy of perception.
I.
There is no doubt that the philosophy of mind has benefited immensely from considering findings from the empirical literature. Rather than trying to introspect what experiences are like, philosophers can leverage psychological evidence to populate the premises of their arguments with empirical content. Merely using available empirical findings, however, has an important limitation: its passive nature. All philosophers can do is hope that the questions they are interested in have been previously addressed. Psychology is a rich field, and there is a lot of relevant research that philosophers can use. However, at least in my experience with consciousness and perception research, it is often the case that there is only neighboring work relevant to the questions we philosophers ask. This might be in part because of the difficulty of studying subjectivity and phenomenology in the laboratory. But there are sociohistorical reasons too. Like any other field, psychology is often concerned with questions that stem from within psychology itself. This is not necessarily because psychologists do not care about philosophical questions about perception (some would, if they knew about them), but because research is to a large extent framework-dependent: some questions only come up while working within a particular tradition or theoretical framework. This entails that particular empirical questions that are important to philosophical debates about conscious experiences may not come up in psychology hallways at all. When this happens, both fields lose. Philosophers are limited to extrapolating results from nearby debates because their (empirical) questions are not being addressed, and psychologists miss the opportunity to test new hypotheses that are often part of a long and theoretically rich tradition.
The value of philosophy for science is often laid out in terms of philosophy’s theoretical contributions when reflecting on science (Laplane et al., 2019; Thagard, 2009), but philosophy can also feed questions directly into scientific research programs.
To overcome these limitations, philosophers of mind (and psychologists!) can embrace the strategy of designing and running philosophy-specific experiments themselves. We see this trend in Experimental Philosophy, where philosophers test people’s intuitions about different concepts in moral psychology, epistemology, and language. In psychology and neuroscience, we see significant philosophical influence in areas concerned with morality, modality, memory, imagination, and the neural correlates of consciousness. Surprisingly, however, this kind of direct involvement from philosophers in empirical research has been very limited when concerned with questions about perception. To emphasize, this is not to say that there are not lots of empirical research programs relevant to the philosophy of perception, or that some psychologists have found some inspiration in philosophical work. What seems to be lacking is an explicit effort to address problems that stem from the philosophy of perception with the full toolkit of experimental psychology.
In my research, I use vision science to directly study empirical aspects of philosophical questions about the contents of conscious experiences and what things look to us. I will next discuss a recent example where I used this approach, and then conclude with a reflection about the future of interdisciplinary work.
II.
Look at the golden “coin” in the image below. What shape do you see?
Your answer bears on a centuries-old philosophical debate on the role of subjectivity in perception and on perspectival shapes in particular. At least since Locke and to this day, philosophers have quarreled about the best way to describe the contents of perceptual experiences—for example, of what a rotated “coin” looks like—without finding much consensus. Locke would have said that we only see a flat oval, variously shadowed, not a circle. It is by means of inference that “the judgment, by an habitual custom, alters the appearance [i.e. the oval] into their causes [i.e. the circular coin]” (Locke, 1975, II, ix, 8). Other philosophers think that “the suggestion that pennies look elliptical when seen from most angles is simply not true—they look round” (Smith, 2002, p. 172) and are “inclined to say it looks just plain circular, in a three-dimensional space—not elliptical at all, in any sense” (Schwitzgebel, 2006, p. 590). And yet others think that our visual experiences are better described by a “dual” character, such that perceptual experience reflects both the true distal properties of objects and their perspectival properties (Noë, 2004). So, which is it? Do we experience the world as it is “out there”? Do we experience the way the world is given to us in sensation and only later “infer” or “judge” what it truly is like? Or is it a mix of both?
Philosophers have arrived at these incompatible positions mostly by resorting to introspection (which sometimes can be reliable (Morales, Forthcoming), but not always), and in some recent cases, by relying on available empirical evidence too. Some—recognizing the importance of designing experiments that directly address philosophical questions—have even suggested possible studies that would help make progress in our understanding of perspectival shapes (Schwenkler & Weksler, 2019). However, philosophers have not really been involved in empirical research that tackles these kinds of problems in the philosophy of perception head on.
What about vision scientists? Naturally, they investigate the mechanisms and computations responsible for transforming the retinal images with which visual processing begins into the full-blown 3D representations that characterize our visual perception. In fact, one of the most foundational principles in all of vision science is that vision goes beyond the retinal image. This view, which runs from Helmholtz and Marr to modern textbooks, assumes that “perhaps the most fundamental and important fact about our conscious experience of object properties is that they are more closely correlated with the intrinsic properties of the distal stimulus (objects in the environment) than they are with the properties of the proximal stimulus (the image on the retina). This is perhaps so obvious that it is easily overlooked.” (Palmer, 1999, p. 312)
Despite the popularity of this textbook explanation, as is well known in the philosophy of perception, it is far from an established fact that conscious experiences are necessarily closer to the distal stimulus. That is what the whole philosophical debate is about!
This textbook presentation of what our phenomenology is like not only seems to ignore a large philosophical debate, but also reveals a gap between the questions psychologists seem most interested in asking (e.g. about the mechanisms of shape constancy) and the philosophical questions about what objects look like to us.
In a recent project (Morales et al., 2020), my colleagues and I tried to help bridge this gap and directly addressed this philosophical question about appearances using well-established methods in vision science. Importantly, a crucial point of our experiments is that (a) we had this specific philosophical debate in mind when we designed our studies and (b) we made sure we could bypass introspection. Our experiments used “visual search”, a well-established paradigm in vision science that takes advantage of the difficulty of finding an object when it’s next to other object(s) that look similar. For example, think how easy it is to find a red book on a shelf full of green books, and compare it to how hard it is to find that same red book on a shelf full of orange books. So, the logic of our experiments was simple: if rotated circular objects look oval (in some sense), they should make finding true oval objects harder. This means that people should take longer to find the oval in trials like B in the image below, compared to trials like A. And this is true even though the distractor object (on top of pedestal “2” in A and B) is in both cases the exact same circular “coin”. By leveraging visual search, we could link reaction-time differences to the way objects look to subjects. In other words, we could directly investigate a question that has puzzled philosophers since Locke.
Subjects were indeed slower selecting the oval coin when it was next to a rotated circular coin (which hence shared the oval target’s perspectival shape) than when the exact same circular object was seen head on. It is important to note that subjects were very accurate, so they were not slower because they misperceived the shapes. To make sure this result could not be explained by some alternative factor, in eight subsequent experiments we controlled for all sorts of confounds, and every single time we found the same kind of interference. For example, we controlled for low-level properties such as size and rotation; we also used moving objects that provided extra visual cues about the shapes of the objects, and we allotted extra time for viewing the stimuli, ensuring subjects had all the depth cues they needed to process the shapes correctly and efficiently. We also showed that the effect generalized to other shapes such as trapezoids and squares. Importantly, we found the same effects when we used real 3D objects rather than computer-generated stimuli. After all, the realistic-looking computer-generated images like the ones above only fake 3D properties through clever computer algorithms; in the end, they are flat 2D images presented on a screen. But when selecting a laser-cut 3D oval object like the one below, subjects were slower when the distractor was a rotated circular “coin” than when the distractor was a circular “coin” seen head on. And all this in the real world with real objects!
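The reaction-time logic of this design can be captured in a toy simulation. The sketch below is purely illustrative: every number in it (baseline search time, interference size, noise level, trial counts) is invented for the example and does not come from the actual study.

```python
import random
import statistics

random.seed(1)

def simulate_rts(n_trials, base_ms=600.0, interference_ms=0.0, noise_ms=50.0):
    """Draw simulated reaction times (ms) for one visual-search condition.

    base_ms is a hypothetical baseline search time; interference_ms is the
    hypothetical slowdown incurred when the distractor shares the target's
    perspectival shape; noise_ms is trial-to-trial variability.
    """
    return [random.gauss(base_ms + interference_ms, noise_ms)
            for _ in range(n_trials)]

# Condition A: the circular "coin" distractor is seen head-on, so it does
# not share the oval target's perspectival shape.
rts_head_on = simulate_rts(200)

# Condition B: the very same circular "coin" is rotated in depth, so its
# perspectival (projected) shape now matches the oval target.
rts_rotated = simulate_rts(200, interference_ms=80.0)

slowdown = statistics.mean(rts_rotated) - statistics.mean(rts_head_on)
print(f"Simulated mean slowdown: {slowdown:.0f} ms")
```

The inferential step the experiments rely on is the one built into the simulation: if perspectival shape were not represented at all after constancy is achieved, the two conditions should yield indistinguishable reaction times, so a reliable slowdown in condition B licenses the inference to a shared perspectival representation.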
The results from our experiments suggest that subjects do experience both the true distal shape of objects and their subjective, perspectival shapes. Recall that subjects barely made mistakes selecting the oval, which indicates that mechanisms of shape constancy were effective and allowed subjects to clearly distinguish between true ovals and rotated circles. The reaction time slowdown that we observed over and over again strongly suggests that, despite some philosophical intuitions and cognitive science’s assumptions, our conscious experiences continue to represent perspectival properties even after our visual system resolves the true shape of objects.
Naturally, the results from these experiments cannot settle the debate once and for all. But they do highlight the value of testing empirical questions with philosophical import, and they open an opportunity for conducting further studies of the same kind. In this case study, we used visual search and reaction times with both computer-generated and real 3D stimuli. But vision science has a vast repertoire of paradigms and measurements that can be used to probe how the visual system works and, importantly, what our experiences are like. The details would depend on the question, but philosophers and psychologists could work with paradigms that manipulate attention, numerosity, color perception, and so on, all while measuring accuracy, false alarms, response trajectories, accuracy/RT trade-offs, and confidence ratings, among many others. When paired with a rich philosophical backdrop, the potential is huge.
III.
For a philosopher, running experiments can be quite fruitful, but it can be daunting too. After all, it takes time, training, and resources to run experiments like the ones vision scientists conduct. However, while this approach does require meaningful interdisciplinary collaborations to be in place, it does not necessarily require that philosophers themselves run the experiments. (Running them oneself is the route I’ve taken, which is fun and rewarding but does require time and resources.) Rather, this approach can rely on interdisciplinary teams where philosophical and scientific expertise blend together. Philosophers can become collaborators in psychology teams and help design philosophy-specific experiments that psychologists can then run. Building meaningful, long-term interdisciplinary collaborations also takes time and effort, but the potential benefits for philosophers of perception (and for vision scientists!) willing to build these bridges are worth it.
* * * * * * * * * * * *
Commentary: On the Representation of Proximal Shape
Jonathan Cohen
_________
Consider this philosophical chestnut: when you look at a tilted dinner plate, what shape do you see? Answers offered in the literature to this or neighboring questions include:
- (i) ellipticality exclusively (Locke, 1975, II, ix, 8);
- (ii) roundness exclusively (Smith 2002, Schwitzgebel 2006, Hopp 2013); and
- (iii) both ellipticality and circularity (Peacocke 1983, Noë 2004, Schellenberg 2008, Cohen 2010, 2012).
Morales reports a set of experiments (Morales, Bax, & Firestone, 2020) bearing on this issue. The headline finding here is that visual search for a target oval shape seen head on was impaired by the presence of circular objects rotated in the depth dimension (relative to visual search for ovals seen head on in the presence of circles seen head on). That is (adopting Morales’s terminology), visual search for a target was impaired by the presence of distractors that shared a proximal (/perspectival) shape property, even when target and distractors differed (and participants successfully judged that they differed) in distal shape.
This result shows that our performance on at least one kind of perceptual task is predicted/explained by representations of proximal, rather than distal properties. As such, it provides compelling reason for believing that human psychologies do in fact compute over representations of proximal properties, that such representations are implicated in guiding action and belief, and a fortiori that proximal properties are indeed psychologically represented.
(For an analogous demonstration with respect to color perception–viz., that there is an aspect of perceptual performance predicted/explained by representations of proximal rather than distal color, see Arend & Reeves, 1986, and the experimental literature stemming therefrom. As in the case of Morales et al.’s experiments on shape perception, it’s hard to see how to make sense of these findings without accepting representations of the relevant proximal properties.)
This is an important result. That said, it leaves some issues unresolved as well. (This is only a clarification; I don’t mean to suggest that Morales thinks otherwise.)
First, if the reported increase in reaction time on visual search for targets matching distractors in proximal shape gives us reason for believing that proximal shape is (/other proximal properties are) psychologically represented, this finding alone says nothing about whether there are, additionally, representations of distal shape (/other distal properties). Now, in the present case, the further observation that participants successfully judged whether targets and distractors matched in distal shape suggests that they are indeed representing distal shape as well. The crucial point, however, is that the evidence for the reality of proximal representation, considered by itself, is agnostic on the distinct question of the reality of distal representation.
Second, while Morales et al.’s experiments give us reason for believing that proximal shape is psychologically represented, distance remains between the latter conclusion and related but distinct claims that philosophical dispute has often centered on–e.g., that proximal shape is represented perceptually rather than post-perceptually, that it is a property we see rather than make inferences about, that it is represented at a personal rather than subpersonal level, or that it figures in the phenomenal character of our experiences.
I lack space to discuss all of these concerns, but want to remark very briefly on two of the most important.
1. Take the worry about whether the representation of proximal shape is perceptual or post-perceptual. Morales et al. report a reaction time cost to visual search for a target shape when presented with a distractor matching the target in proximal, but not distal, shape. But because reaction time is a relatively gross measure of all the psychological steps leading up to report (in this case, a keypress), this finding is limited in what it can tell us about the fine-grained structure of the psychological processing. In particular, this result doesn’t tell us whether the reaction time cost (which presumably reflects computation over representations of the target’s and distractor’s proximal shapes) is incurred at a stage of visual processing or some post-perceptual stage on the way to report (cf. classical Stroop interference). Resolving this issue will require not only further empirical investigation, but also a way of thinking about the nature of the perceptual/post-perceptual distinction itself–which is of course hotly contested.
2. A similar point applies to the question about whether proximal shape figures in our phenomenal experience. Morales suggests this is indeed part of his quarry in characterizing his aim as “capturing what experiences are like” (p1). However, once again, though the results offered give us powerful reasons for believing that proximal shape is represented somewhere in our psychologies, they are not by themselves capable of settling whether such representations are accessible to consciousness. This seems important, because philosophers and psychologists who have defended positions like (ii) above (cf. Thouless 1931 on the “phenomenal regression to the real object”) are reasonably read as repudiating proximal shape qua accessible feature of phenomenal experience, rather than qua feature represented within our psychological makeup full stop.
In sum, Morales et al.’s result strikes me as a significant contribution to our understanding of the psychology of shape perception, though one that invites further inquiry (and funding). As supporters of the cause of full employment for cognitive scientists, surely this is something we can all applaud.
* * * * * * * * * * * *
References
_________
Arend, L., & Reeves, A. (1986). Simultaneous color constancy. Journal of the Optical Society of America, A, Optics, Image & Science, 3(10), 1743–1751.
Cohen, J. (2010). Perception and computation. Philosophical Issues, 20, 96–124.
Cohen, J. (2012). Computation and the ambiguity of perception. In G. Hatfield & S. Allred (Eds.), Visual Experience: Sensation, Cognition and Constancy (pp. 160–176). New York: Oxford University Press.
Hopp, W. (2013). No such look: Problems with the dual content theory. Phenomenology and the Cognitive Sciences, 12(4), 813–833.
Laplane, L., Mantovani, P., Adolphs, R., Chang, H., Mantovani, A., McFall-Ngai, M., Rovelli, C., Sober, E., & Pradeu, T. (2019). Why science needs philosophy. Proceedings of the National Academy of Sciences, 116(10), 3948–3952.
Locke, J. (1975). An Essay Concerning Human Understanding (P. H. Nidditch, Ed.). Clarendon Press.
Morales, J. (Forthcoming). Introspection Is Signal Detection. The British Journal for the Philosophy of Science.
Morales, J., Bax, A., & Firestone, C. (2020). Sustained representation of perspectival shape. Proceedings of the National Academy of Sciences, 117(26), 14873–14882.
Noë, A. (2004). Action in Perception. MIT Press.
Palmer, S. E. (1999). Vision Science. MIT Press.
Peacocke, C. (1983). Sense And Content: Experience, Thought, And Their Relations. Oxford: Clarendon Press.
Rose, D., & Danks, D. (2013). In defense of a broad conception of experimental philosophy. Metaphilosophy, 44(4), 512–532.
Schwenkler, J., & Weksler, A. (2019). Are perspectival shapes seen or imagined? An experimental approach. Phenomenology and the Cognitive Sciences, 18, 855–877.
Schwitzgebel, E. (2006). Do Things Look Flat? Philosophy and Phenomenological Research, 72(3), 589–599.
Smith, A. D. (2002). The Problem of Perception. Harvard University Press.
Thagard, P. (2009). Why Cognitive Science Needs Philosophy and Vice Versa. Topics in Cognitive Science, 1(2), 237–254.
Thouless, R. H. (1931). Phenomenal regression to the real object. II. British Journal of Psychology, 22(1), 1–30.
Hi Jorge,
Your experiment is fascinating and inspiring. It’s way more elegant than the experimental design Schwenkler and I have proposed (which you mention in the post). As you know, I have tried to challenge it in the past, but failed. Recently I have been discussing this with John Schwenkler and Ben Henke, and one of the things that came up in the discussion is the following potential worry. It is in a way an elaboration of Cohen’s claim that “though the results offered give us powerful reasons for believing that proximal shape is represented somewhere in our psychologies, they are not by themselves capable of settling whether such representations are accessible to consciousness.”
Consider the following two claims that seem to be well-supported by scientific findings: (a) Early vision (e.g., V1) represents proximal shape (or at least it represents certain proximal properties such as proximal orientation); (b) Early vision (e.g., V1) participates in visual search (see e.g. Li, Z. (1999). Contextual influences in V1 as a basis for pop out and asymmetry in visual search. Proceedings of the National Academy of Sciences, 96(18), 10530-10535).
Taken together, it seems that (a) and (b) could (at least in principle) account for the slowing down effect that you have found. The basic idea would be that since early vision has difficulty distinguishing between the oval and the rotated circle, its contribution to visual search (of the oval) would be impaired by the presence of the rotated circle.
I take it that (a) and (b) by themselves do not entail that we consciously represent proximal shape (if they did, then they would settle the philosophical debate by themselves). The reason for this is that it is not clear whether representations in early vision are conscious (for example Prinz thinks that only midlevel representations are conscious). So my question is: in what way do your findings go beyond what (a) and (b) entail?
Great idea to use stimulus-similarity as measured by reaction time in a visual search as a way of determining the content of visual appearance. This is really clever, and it must be agreed that it tests a familiar set of philosophical issues that are not always the focus of psychological experiments in this area. So: wonderful work here. Congratulations.
I’m still a bit resistant to the conclusion, though. Proximal shape, I take it, is a pattern of retinal stimulation. I want to resist the idea that retinal stimulation is represented anywhere in the system, or that visual appearance has anything to do with it. So let me offer an alternative view of this experiment.
In 2010, I wrote an article entitled “How Things Look. And What Things Look That Way.” It was mostly about colour, though it also discussed shape. In that article, I argued that in everyday environmental colour vision, we are usually aware not only of the colour of reflecting surfaces but also of the illumination in which they stand. For example, when I am looking at a piece of paper at sunset, what visually appears to me is not only its whiteness, but also the redness of the light that falls on it. Assuming ideal conditions, then, the paper looks as white things do when red light falls on them. No mention of a proximal property here.
Extending this to shape, I would suggest that one might apprehend angle of view separately from distal shape. (The appearance of the rim of the coin in the above displays is one indication of this; often you’d have shadows too.) Now, in a real-life three-dimensional scene, A2 above would look as a circle does when it is viewed head-on, while B2 looks as a circle does when it is viewed obliquely. Again, the difference in visual appearance is captured without mentioning proximal properties.
Now, what about the experiment? Well, remember that it is based on similarity. Two objects that look alike take longer to discriminate than two objects that look less alike. Now, one can talk about similarity relations not just among shapes-as-such, but among the appearances of shapes-viewed-from-angles. From this perspective, what Morales shows is that a circular disc-viewed-obliquely looks more like an oval-viewed-head-on than it looks like a circular-disc-viewed-head-on. No need to invoke proximal properties.
I suspect much of the disagreement here turns on how one understands proximal features.
I’m not sure whether Morales would agree, but in my view, the presence of retinal features — curvature, the blindspot, etc. — that don’t correspond to anything in our proximal representations of shape, color, and the rest gives us reason for not thinking about proximal shape (/other proximal visual representations) as anything so directly understood in terms of what’s going on on the retina (or, for that matter, anything characterized physiologically).
Rather, we can think about proximal representations as parameters over which certain psychological computations are defined. While there are many different ways to think about such parameters, one traditional view is that the value of the parameter for shape is something like the geometrically defined 2D projection of the distal shape onto a plane approximately parallel to the front of the viewer’s face (obviously computed somehow from retinal projections, but in a way that abstracts from details of the actual retinal projection).*
In any case, the idea is that the circle viewed at an angle and the oval viewed head on take the same values with respect to this computational parameter, which would form the core of the prediction/explanation of the results Morales reports.
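This projection parameter can be made concrete with a bit of geometry. The sketch below is only an illustration under a simplifying assumption (orthographic rather than perspective projection, with invented sizes): a circle tilted in depth by an angle θ projects to an ellipse whose foreshortened axis shrinks by cos θ, so a tilted circle and a suitably proportioned head-on oval take the very same value on the parameter.

```python
import math

def projected_axes(radius, tilt_deg):
    """Axes of the ellipse obtained by orthographically projecting a circle
    of the given radius, tilted tilt_deg about a horizontal axis, onto a
    plane parallel to the viewer's face. The axis of rotation is unchanged;
    the other axis is foreshortened by cos(tilt)."""
    minor = radius * math.cos(math.radians(tilt_deg))
    return radius, minor

# A unit circle tilted 60 degrees in depth...
major, minor = projected_axes(1.0, 60.0)

# ...projects to the same ellipse as a 2:1 oval seen head-on, since
# cos(60 degrees) = 0.5.
oval_major, oval_minor = 1.0, 0.5
same_value = math.isclose(major, oval_major) and math.isclose(minor, oval_minor)
print(same_value)
```

On this reading, the matched parameter values (not any shared retinal state) are what predict the interference between the rotated circle and the head-on oval.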
On this way of thinking about proximal shape, there’s no longer any worry that the representation of proximal shape requires representing our own retinas.
Moreover, on this view, I’m not sure that there’s much of a disagreement between your view and Morales’s/mine. You say: what’s represented is that circle-viewed-obliquely and oval-viewed-head-on are similar. I say: right, and the dimension over which that similarity relation is defined just is proximal shape, so representing that similarity isn’t an alternative to representing proximal shape.
*For another, pretty different, version, see my 2010 “Perception and computation,” listed in the references above.
Thanks for your comment Mohan! Jonathan beat me to posting his comment, but I also think that our view (and Jonathan’s) is very close to yours. The notion of representing proximal properties is indeed how Jonathan describes one side of this debate, but it’s actually not our own preferred way of talking (notice that the word “proximal” doesn’t even appear in my initial post, except once in quoting a textbook). But, as Jonathan explains in his comments, I think we are basically in agreement beyond the difference in terminology. Here’s how we prefer to characterize the conclusion that we think is supported by our experiments (this is a quote from our paper):
“The explanation of these results seems clear and straightforward: An elliptical coin is harder to distinguish from a rotated circular coin (vs. a head-on circular coin) because the two objects appear to have something in common. More precisely, when subjects see the rotated circular coin and the head-on elliptical coin, it can be said that they bear a representational similarity to one another. To be even more precise, the results here indicate such representational similarity even without specifying the dimension of such similarity, or the specific features that ground this similarity. For many philosophical issues at stake here, it may be important to distinguish between interference caused by matching perspectival shapes vs. by persisting retinal images themselves vs. by independent representations of ellipticity (31). Our results here cannot adjudicate between these extremely subtle options; but all imply some notion of representational similarity, which is what we take our results to demonstrate.”
So, if anything, our take is neutral about the dimensions over which the representational similarity we found operates (but I’d be surprised if it is at the literal level of retinal stimulation which, as Jonathan and you correctly point out, is an unlikely candidate).
Perhaps now our view seems more in line with yours? In any case, we think we more or less accept your take here!
Thanks Jorge. This is indeed very helpful, and I think you are right that we are in close alignment. The only divergence is perhaps that I don’t want to say that the representational similarity is definable in terms of the geometrical optics of projection.
Thanks again for this lovely work.
Hi Assaf,
Thanks so much for your comment—and for your inspiring work on perspectival shape and our previous discussions as well! I think we can agree that something our experiments show, against mainstream views in psychology and part of philosophy, is that perspectival shapes are represented even after distal shapes are computed, and that these representations affect subjects’ behavior. Now, I think that you and Jonathan are highlighting an important further question: whether these representations are part of conscious experiences.
We really aren’t making any implementation claim, but I don’t think we can go from “X brain region is involved in task Y” and “activity in X brain region is unlikely to be conscious” to “task Y can be done unconsciously”. Certainly, activity in early visual cortex is important for representing perspectival properties (and for vision in general obviously), but I don’t think this entails early visual cortex is sufficient for producing the behavior we observe (in our experiments in particular but also just in general). So, although V1 might be important for representing perspectival shapes and for visual search (but see my comments below) it is far from clear that it is sufficient for driving behavior. But I reckon your point may survive if we bracket particular implementation claims. Could our behavioral results be explained by representational similarity that, however, does not reach consciousness? I suspect it would be very hard to produce irrefutable evidence of whether something is part of conscious experience or not (otherwise we wouldn’t be having these discussions!). However, I think some of our experiments do reinforce the idea that our results are driven by phenomenology and not merely by an unconscious representation of perspectival shapes. I didn’t mention this in the post, but in Experiment 6 in our PNAS paper we forced subjects to see the stimuli for one whole second before they were able to even start preparing a response. In the real world experiment, subjects saw the stimuli in front of them for minutes. This is far from definitive evidence that the representational similarity that drove the RT differences existed at the level of conscious experience, but it does make it rather plausible. Unconscious effects tend to be subtle and hard to measure in the lab. Our results are robust and anything but subtle. I would even dare say that when doing the task one can introspectively feel that one type of trial is harder than the other. 
I don’t think this is because subjects were able to introspect a few milliseconds’ reaction-time difference—rather, it might be because they experience the representational similarity between target and distractor when the latter is rotated, and with it the increased difficulty of those trials.
I also wanted to make a few (admittedly nitty-gritty) remarks about your two claims. I agree with (a): proximal properties are most likely represented in V1. (Interestingly, there is evidence that distal properties are also represented in V1, e.g., Murray et al., 2006, Nat Neuro; Sperandio et al., 2012, Nat Neuro; Zeng et al., 2020, JNeuro.) I’m not sure what to make of (b), though. On one hand, the Li paper you cite is a computational modeling paper whose model is taken to mimic V1 physiological characteristics, but it doesn’t really use neural data. On the other hand, the author admits that the impoverished inputs to the model mean that it “cannot yet generate spatially precise stimuli, such as ellipses and circles of exactly the same size, nor can it yet simulate many of the more complex stimuli used in psychophysical experiments”. Moreover, “the search ultimately requires decision making and often visual attention or top-down control (especially when the subject knows the target identity), and many attentive and quantitative aspects, e.g., conjunction detections and search times, cannot be modeled in my model without assumptions about these additional, probably extrastriate, mechanisms.” So, even if V1 is relevant for creating pre-attentive saliency maps supporting visual search and pop-out, I don’t think that activity in early cortex is sufficient for explaining visual search in general (even when perspectival shapes are involved) or for explaining the specific RT effects that we observed using realistic-looking, fully 3D stimuli. Rather, visual search (and its full behavioral profile) is likely to operate at a conscious level (e.g., Rauschenberger & Yantis, Nature, 2001).
So I think that you (and Jonathan and Ben) make an interesting point insofar as our results don’t prove once and for all that perspectival shapes are part of our conscious experiences. But, besides showing a representational similarity of perspectival shapes that affects attention and behavior, I do think that our experimental design and the stimuli we used make it a likely conclusion that this representational similarity operates at the level of conscious experience.
Thanks, Jorge, for this nice and detailed response.
There is a lot for me to think about here. Let me try to make a few quick comments. This is super interesting!
It now looks like an issue of inference to the best explanation of your RT effect in visual search. One hypothesis is that conscious representations of proximal shape explain it; another is that V1 representations of proximal shape explain it (and I’m assuming, for the sake of discussion, that V1 representations are unconscious). The challenge for me is to show that the V1 explanation is not worse than the “consciousness” explanation you propose. In such a case, both explanations would get roughly equal credence, which would block your argument. Does this sound right?
(This is why I’m intentionally focusing on issues of implementation. It’s more difficult for me to think, in the abstract, of whether your effect can be driven by an unconscious representational similarity. Having a concrete hypothesis in mind makes it easier for me.)
So let me add something to strengthen the V1 hypothesis (based on a point Ben Henke raised in conversation). You seem to be correct to claim that merely appealing to pre-attentive saliency maps is not sufficient to explain your RT effect. However, V1 is also modulated by feature-based attention (e.g., Liu, T., Larsson, J., & Carrasco, M. (2007). Feature-based attention modulates orientation-selective responses in human visual cortex. Neuron, 55(2), 313-323). And feature-based attention seems to be very important for visual search. If you are looking, say, for a certain orientation, then feature-based attention would (ideally) amplify representations of stimuli with that orientation. This will facilitate visual search in a way that depends on the task requirements (i.e., on what you intend to find).
So now, consider the difference between trial A and trial B from your post. Since the task is to find the oval, feature-based attention will amplify representations of oval-ish proximal features in V1. Thus, in trial A, certain proximal features of the oval will be amplified but not those of the head-on circle (since the two significantly differ in proximal features). In contrast, in trial B, proximal features of the oval as well as of the rotated circle will be amplified (since they share lots of proximal features). In other words, in trial B, attention will “select” both target and distractor (for amplification, for resource allocation, etc.), but in trial A it will select only the target. (Of course, I’m describing an overly “clean” picture just for illustrative purposes. It might be that in trial A certain proximal features of the distractor are also selected, but not to the same extent as in trial B.) This predicts that in trial B search time should be longer than in trial A. It thus seems that the V1 hypothesis has the resources needed to explain the RT effect.
But maybe the heart of the matter lies in deeper details:
You write that “visual search (and its full behavioral profile) is likely to operate at a conscious level.” I grant that many aspects of visual search operate on a conscious level, but since V1 participates in it, and since (let us assume) representations in V1 are not conscious, it follows that visual search does not operate entirely at a conscious level. The V1 component of visual search operates (we are assuming) at an unconscious level. I take it that you don’t wish to deny that. What you wish to deny is that the V1 component of visual search is significant enough to explain the 50ms RT difference between trials B and A (in Experiment 1). Is there any reason to think that V1 activity (via feature-based attention) cannot by itself explain a 50ms RT difference? Is there evidence that the effects of V1 on behavior (when comparing two conditions) tend to be smaller than 50ms?
You write that “unconscious effects tend to be subtle and hard to measure in the lab.” Which unconscious effects do you have in mind? Do they involve visual search and/or feature-based attention? I’m basically wondering what the evidential basis for this general claim is. It could be that paradigmatic unconscious effects are typically measured by a certain kind of experiment, say using flash suppression or backward masking, and maybe (I take your word for it) such effects are subtle and hard to measure, but the V1 hypothesis is apparently different (since it doesn’t involve something like continuous flash suppression or backward masking), so it is not clear that it is kosher to generalize from the former to the latter. (Also: what does “subtle” mean here? Is it a claim about how statistically significant the result is? Or about the size of the RT difference?)
A final point concerns Experiment 6. Can the V1 hypothesis (with feature-based attention) explain its results? I haven’t thought this through carefully yet, but maybe an explanation might go as follows: feature-based attention is active throughout the trial (let us suppose), and so in V1 it continuously selects both target and distractor in trial B, but not in trial A. So even after a full second of watching the stimuli, your attention is drawn to the distractor, in V1, in trial B but not in trial A (some resources are allocated to the distractor instead of only to the target, etc.). This might explain the difference in RT (which is smaller in Experiment 6: only a 16ms difference). But this is just a quick proposal, off the top of my head. (Also note that Experiment 6 is somewhat different from the others: it is not, strictly speaking, measuring the time it takes to find the target, since the target is already found (in both trials A and B) well before the 1000ms is over. So maybe the V1 hypothesis can get away with not explaining it, since it is not a “classic” visual search task.)
Thanks again for your response and for this great work!
Assaf, thanks again for a wonderful reply. I’ll just focus on a brief point about the crux of our conversation. Our experiments show that representations of perspectival shapes are part of our psychology and that there is a representational similarity between objects with different distal shapes that share a perspectival shape. When I write that this representational similarity is something that we experience, I’m just relying on the standard understanding of visual search: the slowdown we observed is likely due to the similar appearance of target and distractor in subjects’ experiences. This is, of course, not definitive proof, which is why I appreciate your pushback but also why I appealed to an inference to the best explanation. I am using visual search and similarity of appearance in a very “normal”, straightforward sense. Just as finding a red book on your shelf is easier when the other books are green than when they are orange (because red and orange appear similar to each other), an elliptical target is harder to find next to a rotated circular object because they have a similar appearance.
Does finding a red target require activity in V1 and pre-conscious feature-based attention to generate the characteristic pop-out effect of visual search? Most likely. But I don’t think it follows that, because activity in V1 is not accessible to consciousness, the pop-out effect that takes place when looking for a red target next to green distractors is not experienced. Moreover, red and orange may produce similar feature maps and may attract feature-based attention similarly, but my answer would be the same: that these cognitive operations start at an unconscious level, or that their effects are observed in early visual cortex, does not entail that the difficulty of finding a red object among orange objects isn’t driven by how similar they appear to us in our conscious experiences of them. After all, all sorts of unconscious/subpersonal processes are involved in all vision, but that doesn’t immediately show that it is those processes (alone) that most directly drive complex behavior and attention. So even though there is evidence that attention has an effect on early visual cortex, I don’t think it’s so easy to argue that it is that unconscious part of a complex process that explains the behavior we observe. I think we’d need direct evidence to support that claim. More importantly, I don’t immediately see why the case at hand should be significantly different from other cases of visual search: finding an object that looks similar to others is hard because we experience them as similar. This is true of colors, and I suspect it’s true of perspectival shapes as well!
Very briefly, Jonathan: if the similarity space of shapes-at-viewing-angles were defined by 2D projections, then an oval-viewed-head-on would be indistinguishable from a circular-disc-viewed-obliquely (at the appropriate angle). But in ideal circumstances, i.e., when you have sensory information about the viewing angle, this is not true. In fact, this is the original fact about shape constancy that Smith, Schwitzgebel, and others (including me) wanted to highlight. Am I missing the point?
hey Mohan:
the claim is that an oval-viewed-head-on and circle-viewed-at-an-angle match on the parameter of proximal shape (and that this explains the interference Morales et al. bring out). it doesn’t follow that the two are indistinguishable unless the parameter of proximal shape/the similarity measure defined over proximal shape is the only parameter the visual system has access to. but Morales and I both pointed out that there are reasons to think the visual system additionally has access to the parameter of distal shape, on which the oval-viewed-head-on and circle-viewed-at-an-angle do not match.
indeed, I agree with your claim, defended in your “How things look,” and also in my “Perception and computation,” that the visual system also represents viewing angle (at least in many such cases) as a third parameter. (not sure if Morales would agree here.)
but if there’s more than one visually represented parameter/similarity space at work here, the claim that there’s a match on one of them — and that that match explains one of the attested behavioral reactions — doesn’t commit one to the claim that there’s indistinguishability/matching on all of them. no?
That’s really interesting, Jonathan. I know that Will Davis at Oxford is working on the idea that the similarity space of distal colour is 9-dimensional — 3 for reflectance times 3 for illumination. This accommodates my colour-as-viewed-in-illumination format: C-reflectance-in-C′-illumination allows for three dimensions of variation in both the C and C′ slots. I think you are suggesting an idea somewhat like his.