Dustin Stokes: Reply to Commentators

Dustin Stokes

Again I want to thank Becko, Jonna, and John. I truly respect their research and it’s flattering that they each took the time to read the book and offer such insightful commentary. It has given me a lot to think about. The comments are distinctive, each taking up rather different points of emphasis. One consequence is that it is difficult to thread together a unified response. Instead I will address the comments in the order they were posted. Each commentary is rich and extensive and so I won’t be able to cover every point. I focus on a central point or two for each.

The short of my initial reply to Becko’s instructive comments is, “You’re right! We agree! On so much!” And upon further reflection I’m not sure we disagree about much. Becko usefully lays out a three-part distinction between perceptual development (PD), perceptual learning (PL), and cognitive permeation (CP). I agree entirely with Becko’s characterization of each phenomenon and moreover that they all occur. And I agree that some cases of PL are cases of expertise. But Becko is correct that I could be clearer about why and how I think this. Part of the problem, or at least of the presentation, is that although I want to move away from talk of modularity and cognitive penetration, I am especially interested in cases of genuine change in experience that require explanatory appeal to cognition. So my emphasis tends to be on cases of Becko’s CP, at the cost of equal discussion of PL. I also much prefer Becko’s second characterization of PL, in particular as being active and involving practice. I may have been less than perfectly clear on this point, but when I mention passivity, it is to suggest that expertise effects of the kind I discuss “…would not occur if vision were a strictly passive process of sensory reception” (165: emphasis added). What I should add here, given Becko’s pressure, is that many of those effects are instances of PL (and they also require perception to be active).

I agree that some cases of PL are, or can manifest, instances of epistemic virtue. Becko rightly identifies part of my analysis that suggests otherwise, since it places emphasis on how the agent is herself responsible, at least in part, for the cognitive etiology of her perceptual improvement. In that discussion I think I probably draw too tight a connection between agency and cognition. But I think there are at least two ways, along distinct dimensions, to maintain that some cases of PL are cases of epistemic virtue. First, if one is a faculty virtue theorist then perception is already amongst the virtuous mental faculties. What experts (of both the PL and CP variety) have done is improve that faculty in domain-sensitive ways. Second, and related, even in cases of PL that do not involve cognition the agent is still very much responsible for the motivations, drives and domain-sensitivities operative in the improvement of her perceptual skills; it is the agent who practices after all. This is epistemic virtue qua (improved) cognitive or intellectual trait, and the agent still gets some of the credit. I’m inclined then to think (and I have a hunch that Becko will agree) that we can place the virtues of expertise on a continuum, running from the less cognitive/“demand-side” to the more cognitive/“supply-side”. In spite of differences along those dimensions, what puts an agent on the continuum is the active role she plays in generating changes in her perceptual skills. This dovetails with something I highlight in the book: theorizing the mind as broadly malleable reveals just how deeply we, qua agents, can change our perceptual contact with the world.

Finally, I too want to generalize beyond empirically studied perceptual experts. I agree that many of us (maybe all…) are actual perceptual experts. I do try to indicate this in the book, identifying how much of our lives and habits are deeply sensory, how we can change and improve our bodies and minds, how the body and brain are “plastic”. And I do suggest there that perceptual expertise is pervasive and “not an isolated or rare phenomenon (like, say, the Müller-Lyer illusion)” (210). But the cheeky jab there may have overshadowed the substantive point.[1]

This brings me to Jonna’s useful comments, which highlight the possibility of epistemically vicious perceptual expertise. Before addressing her plausible case/s of vice, I want to make a few clarifications about how I am thinking about expertise and virtue.

Jonna worries that to attribute epistemic virtue to perceptual expertise itself is a category mistake, since the former is more general than the domain-specific instances of the latter. I’m willing to grant this point, and indeed I later put things in a way more sensitive to the distinction (as Jonna notes), writing that “the [intellectual] virtue resides in the expert’s cognitively sensitive perceptual skill” (217). This was part of the motivation to ground the virtue theoretic account in a teleological analysis of perception (Ch. 7). Experts improve their perceptual systems and skills in line with the natural norms of perception. A perceptual faculty like vision has the biological function of providing useful representations. One, but not the only, mode of utility or success is accuracy. Also included are sensitivity to behaviorally relevant features and patterns, speed and efficiency, and reduced distraction. To improve in these ways is to improve the faculty virtue of vision. But Jonna is right that this does not mean that one acquires, say, a bird-specific epistemic virtue any more than it amounts to acquiring novel biological functions for accurately seeing birds. Nonetheless, the improvements made are causally dependent on the specialized domain and what’s interesting is that some such improvements are sensitive to the concept-rich cognitive learning within that domain.

Jonna then provides a clear and plausible case of vicious perceptual expertise. My emphasis on implicit bias and facial recognition is supposed to highlight these possible risks, but Jonna is right to substantiate a case that is both expert-involving and epistemically problematic. My view can’t rule out these cases; with perceptual malleability comes possible epistemic vice, trade-offs and opportunity costs. But I also suggest that with malleability comes the potential to offset or correct some of those risks, and there are epistemic standards and norms specific to the domain of specialization. On that front, I think the final questions Jonna raises are exactly the ones to address in future work (mine or others’). My hope is that the case made for perceptual improvements that manifest virtue (and thus, potentially vice) will open up conceptual space for the pursuit of those very answers and analyses.

Finally, I turn to John and his penetr…permeating commentary. Centrally, John observes that my account of malleability, and its emphasis on perceptual recognition, requires modular early visual processing. As I understand it, this implies an adequate defence of modularity. I agree with the observation; I resist the implication.

Early vision can be demarcated neurophysiologically or functionally-computationally, or it may be a temporal notion (say, <100ms after stimulus onset). In any case, I think John is right that it is probably strongly modular and that this stability is important for the overall visual system. Further, it is not implausible that feature detecting components like groups of simple and complex cells in the primary visual cortex are informationally encapsulated, as well as many other neural circuits and low-level components in the overall visual system. This is compatible with the malleable architecture that I advocate. And this is because a defence of the modularity of sub-components of visual processing, say of early vision, does not amount to a defence of the modularity of perception. Indeed, I think some of the “classic” rhetoric used by modularists on this front (e.g. Pylyshyn 1999) is misleading at best. Put another way, this defence keeps the strength of modularity (informational encapsulation) at the cost of the scope. It does not secure a thoroughgoing modular architecture of perception because, and theorists like Pylyshyn admit as much, outputs of components like early vision do not alone determine perceptual experience.[2]

Can the modularist accept this, maintaining that the goal is not more broadly scoped modular architecture? Can modularity be restricted and eschew concerns about perception at the level of experience? I don’t think so, and that is because I don’t think the modularist’s motivations are just ones that concern computational processing. Projects of mental architecture are not just ones of psychological modelling. Indeed, modularity of the sort in question is partly motivated on epistemic grounds: encapsulated perceptual systems are supposed to afford a preferable epistemology, where the representations provided by such a system are argued to be more objective and reliable. The “function of perception is to deliver to thought a representation of the world.”, “[n]ot the distant past, not the distant future and not…what is very far away…it is understandable that perception should be performed by fast, mandatory, encapsulated, etc. systems…” (Fodor 1985: 5; emphasis added). That’s the argument. Making good on this epistemic promise thus requires a defence of more broadly scoped encapsulated perceptual processing—whatever physical processing that experience is identified with, constituted by, supervenient upon, or the output of. And early vision is narrower than this.

The modularist may then retort that cases of perceptual expertise are more properly identified as some kind of “recognition”. And one might then categorize recognition as late vision or, perhaps stronger, as perceptual judgment or seeming or belief. If that is right, the opponent contends, then I’ve just provided evidence for cognitive influence on something cognitive or post-perceptual. Generally, I’m not compelled by such a proliferation of distinctions. Here in brief is my line of thinking. (And to foreshadow, this may just come to a theoretical impasse between John and myself. And that isn’t meant to be dismissive: John’s views and criticism on these issues have regularly challenged me, and continue to challenge me, to re-evaluate my own views.)

As I developed my thinking in the book, I converged more and more on thinking about perception as a process. By contrast, I find it more and more alien to think about perceptual experience as a static state or as an output of a computational process that can be “fixed” and inspected for content. And that is most simply because I just don’t think that is how we experience the world, how we act on the world, how we know about the world. Perception is a process that takes time, and to some degree parts of that process can be delineated neurally or computationally. But when it comes to experience, as we have it, I think things are much more fluid, active, and integrated. Expert “recognition” of patterns, gestalts, holistic features and the like is (sometimes) experienced, perceptually and with phenomenal character. In these cases it doesn’t present as something late or post-perceptual or somehow separable from perception. And the convergence of empirical study and evidence corroborates this “ordinary” observation. So, finally, while I can grant that some components of our perceptual systems are modular, my view remains that perception is generally malleable and cognition can play an important role in shaping it.     


Fodor, J. (1985). Précis of The Modularity of Mind. Behavioral and Brain Sciences 8: 1-5.

Pylyshyn, Z. (1999). Is vision continuous with cognition? The case for cognitive impenetrability of visual perception. Behavioral and Brain Sciences 22 (3): 341-365.

Stokes, D. & Bergeron, V. (2015). Modular architectures and informational encapsulation: A dilemma. European Journal for Philosophy of Science 5 (3): 315-338.

[1] And I agree further with Becko about the possibility of perceptual expertise in non-human animals. A footnote to the discussion reads, “it is plausible that there are analogues throughout the biological world. Although in less diverse ways, many animals are experts in detecting behaviourally relevant objects and features in their environment. And the ones that do this exceptionally well are, all else being equal, fitter. It is entirely plausible that some of this “expert” performance occurs at the level of the animal’s senses, whether it is seeing a ripe fruit, smelling an appropriate mate, or hearing a dangerous predator” (210, fn. 3).

[2] One can trade the opposite way, as the massive modularist does, and give up strength for scope. For what it’s worth, Vince Bergeron and I have argued (2015) that this abandons a key motivation for modularity, namely functional independence, since it gives up encapsulation in favor of abundant cross-talk across modules. The dilemma for modularity that we propose hinges on the choice between strength and scope of encapsulation.


  1. Jonna Vance

    Hi Dustin! Thanks for these generous replies to the commentaries. I have a question about the relativity in perceptual expertise. Perceptual expertise is an enhancement or increase in various sensitivities (recognitional, discriminatory, organizational, etc.). One could think of the enhancement as being relative to a “normal” perceiver. But I worry that risks counting the mundane forms of expertise that Becko highlights as non-expertise. Or one could think of the enhancement as being relative to a fully untrained perceptual system. A worry for that option is that it would count virtually every aspect of current human perceptual systems as expert, since virtually every aspect has been trained during evolution and lifespan. I’m interested in whether you have a preferred way of thinking about the baseline (or baselines) against which expertise is an enhancement. And, if you’re a pluralist about the relevant baselines (which sounds plausible enough to me), whether you have a way to spell out some of the most useful baselines in terms of which to think of expertise.

  2. Jonna Vance

    For some dimensions of expertise, it might be best to think of it relative to a ceiling. For example, an expert capacity for perceptual recognition with respect to some feature could be conceptualized in terms of 100% reliability in recognition of that feature. For that dimension, expertise could be approximation to an ideal, rather than enhancement above a baseline. But with, say, expert discriminability, there doesn’t seem to be an ideal. There’s indefinite potential for finer and finer discrimination (e.g. between shades of red). So with that dimension of expertise, capacity above a baseline seems more promising as a way to get at what the expertise consists in, and what it’s relative to.

    • dustinstokes

      Hi Jonna. Many thanks for all of these additional comments. This is something that I really need to think about. I really don’t have a singular threshold worked out for perceptual enhancement (whether reaching the expert level or not). I do, though, think that this is very likely a continuum, and perhaps along multiple dimensions (so, not “all-or-nothing” as Becko suggests). I think your ceiling proposal sounds plausible but, by the same token, I’m very much with Becko in thinking that this is unlikely to be a matter we determine just on conceptual grounds. I think it will be highly context-sensitive, not least because the activities within domains of expertise are rather varied (e.g. identifying a novel instance of a stimulus type vs. performing a visual search; categorizing an instance of a stimulus type vs. acting on an object or event in the environment), and accordingly the norms for performance will vary. So I think attempts to determine a threshold or ceiling from the armchair will likely draw arbitrary lines and potentially introduce worrisome biases in the vicinity of some of Becko’s concerns. But I really do think there are interesting questions here…. Y’all wanna write a paper?

  3. John Zeimbekis

    Dustin, when you write that recognition is not “post-perceptual or somehow separable from perception”, who are you objecting to? Sorry to press this point but it’s important to get the modularist’s claims right. I don’t know any modularists who deny that visual recognition is a perceptual phenomenon. Instead they hold that recognition is a part of perception that draws on conceptual information.

  4. Becko Copenhaver

    Jonna, forgive me for weighing in on my own view relative to your question. I am more than happy to think of virtually every aspect of current human adult perceptual systems as expert, since virtually every aspect has been trained during evolution and lifespan. Indeed, part of my problem with the orthodox view is that it takes what I take to be a mere theoretical abstraction (call it “Humean” perceptual experience) to be an accurate empirical claim about how actual adult humans experience the world.

    This will predict significant cross-cultural and cross-temporal differences in experience. And it will entail some kind of ethics-of-perception. On my view, the ethics of perception will concern mainly the built environment and the material conditions of unjust hierarchies: our perceptions ought to change by changing the ways we curate our environment to shield dominant groups from perceiving the contingent conditions that maintain their domination; and our perceptions ought to change by changing the unjust hierarchies that result in pathological perception.

    As for a ceiling, I think it’s really important not to approach this as an a priori question (not that I am suggesting you are). In the old way of thinking about the cognition-perception divide (that I resist) there’s a lot of back-and-forth about just what kinds of properties could be represented via PL or CP. “Yes” to tomatoes, but “no” to Churchlandian properties like “being the aperiodic atmospheric compression waves produced as the coherent energy of the ocean waves is audibly redistributed in the chaotic turbulence of the shallows…” I think this is not going to be a neat, all-or-nothing, a priori fact but a more complicated, context-sensitive set of empirical facts about humans and non-human animals in environments.

  5. dustinstokes

    Hi John and thanks for your comment. (I attempted to embed this comment in reply to yours…but something is amiss with WordPress.) I think this may be a place where we disagree about where the “goalposts” are, or should be. I don’t think that a defence of the modularity of early vision (however one characterizes it, which I think comes with its own controversy) is a sufficient defence of modularity. And that’s for the reasons I gave in my replies to commentators: I think any interesting version of modularity is a broadly scoped architecture, and one that concerns perceptual experience. For similar reasons, I do not think that the modularist can defend the kinds of cases of expertise I discuss in the book as a perceptual phenomenon (whether called ‘recognition’ or something else) and, in the same breath, say that perception is modular. So, if the modularist wants to maintain that recognition is conceptually influenced, as you suggest, then I think they must also maintain that it is somehow not part of perception.

    I propose we have another night out in Crete…ahhh the good ol’ pre-pandemic days…and work this all out!
