Welcome to the Brains Blog’s Symposium series on the Cognitive Science of Philosophy! The aim of the series is to examine the use of methods from the cognitive sciences to generate philosophical insight. Each symposium comprises two parts. In the target post, a practitioner describes their use of the method under discussion and explains why they find it philosophically fruitful. A commentator then responds to the target post and discusses the strengths and limitations of the method.
In this symposium, Pascale Willemsen discusses the experimental study of thick ethical terms, with Bianca Cepollaro providing commentary.
* * * * * * * * * * * *
Metaethics and Experimental Philosophy: 
A Journey Through Thick and Thin
Pascale Willemsen
_________
I.
Experimental philosophy is an interdisciplinary approach to philosophical questions and problems that uses empirical methods from various cognitive-scientific disciplines such as psychology, experimental linguistics, and neurosciences. Even though experimental philosophy is a relatively recent movement and has only been around for 25 years, its practitioners have shown remarkable productivity. In January 2021, roughly 2000 papers were listed as ‘Experimental Philosophy’ on PhilPapers.org. Roughly one-fourth of these papers were categorised as ‘Ethics’. In this article, I would like to focus on one specific sub-discipline of moral philosophy that, I argue, can benefit greatly from engaging with experimental philosophy, namely metaethics.
As Sayre-McCord (2014) said, ‘Metaethics is the attempt to understand the metaphysical, epistemological, semantic, and psychological presuppositions and commitments of moral thought, talk, and practice’. Central to metaethics are, among others, questions about the meaning of ethical terms such as ‘good’ and ‘bad’, whether moral statements containing these terms can be true or false, what it is that people express or do by using moral language, and how moral language relates to motivation and behaviour. It is clear that these questions demand at least partly empirical answers. It seems absurd to claim that we can properly understand the meaning of ethical terms and what people express and do with them without looking at the way people actually talk. Additionally, it would be highly questionable to make any claims about moral language and its relationship to motivation and subsequent behaviour without consulting or conducting empirical studies.
Within the scope of this article, it is impossible to cover all there is to say about the relevance of experimental methods for metaethics. What I wish to do instead is draw attention to a recent project in experimental metaethics that aims to understand thick ethical terms.
II.
Concepts such as ‘rude’, ‘cruel’, ‘friendly’, and ‘compassionate’ are what philosophers call thick ethical concepts. They are characterised by their provision of both evaluative and descriptive content. They communicate that an action, behaviour, custom, person, or character trait is viewed with approval or disapproval, and they further communicate in virtue of what descriptive features they are evaluated in this way. In contrast, thin ethical concepts, such as ‘good’ and ‘bad’, are said to be merely evaluative. With such a rough-and-ready notion of thick concepts in mind, philosophers have sought to provide a proper definition for these concepts and spell out more clearly how thick concepts differ from thin concepts, as well as from descriptive concepts, such as ‘green’ or ‘round’.
As many philosophers in the debate readily admit, the attempt to define thick concepts is a long way from being accomplished. Discussions and significant disagreement have revolved around five central questions:
1. Separability Question
2. Location Question
3. Centrality Question
4. Variability Question
5. Action-Guidingness Question
It is beyond the scope of this article to do justice to the complexity of the entire debate. For now, let us focus on two questions in more detail, determine what empirically testable predictions they make, and analyse how experimental philosophers have started examining them.
III.
The Location Question asks where exactly we can find the evaluative dimension of a thick term or concept. The two options discussed are (a) that the evaluation is part of the semantic content of a thick term or (b) that it belongs to what is pragmatically conveyed beyond what is literally said. The philosophical literature often relies on intuitions about whether statements such as ‘Tom is cruel, but he is not bad’ sound contradictory. If such a statement does sound non-contradictory, we can conclude that a negative evaluation of Tom is not intrinsic to saying that he is cruel and, thus, not semantically conveyed. Despite the widely accepted relevance of such ‘linguistic data’ (that is, the authors’ own intuitions about what linguistic intuitions most people have), no systematic empirical studies have been conducted on thick concepts and their evaluative dimension. This is all the more surprising, given that in experimental linguistics, empirical means to test whether a statement sounds contradictory are readily available. Perhaps the most widely used test is the Cancellability Test: take a bit of information for which it is unclear whether it is merely conversationally implicated or semantically connected to a concept or sentence, then explicitly cancel this bit of information and see if the resulting phrase sounds contradictory.
In two recent papers (2020, 2021), Kevin Reuter and I aimed to determine whether the evaluation of thick concepts is communicated by semantic or pragmatic means by using the cancellability paradigm. We reasoned that if pragmatists are correct and the evaluative aspect is only conversationally implicated, cancelling the evaluation should not lead to a contradiction. Take, for instance, the sentence: ‘There is the door’. This statement not only communicates the location of a door but in some contexts carries the particularised conversational implicature that the addressee is asked to leave the room. Still, saying ‘There is the door, but by that, I am not saying you should leave’ does not yield a contradiction. Generalised conversational implicatures work similarly but depend less on the specific context. If the evaluation of a thick term is conversationally implicated, cancelling the evaluation should be equally non-contradictory. If such empirical evidence were to be found, this would count as direct support for the pragmatist position, which treats the evaluation as a conversational implicature. In contrast, semantic separabilists claim that the evaluative component cannot be cancelled. For example, a person who says, ‘What Tom did was rude, but by that, I am not saying something negative about Tom’ makes an infelicitous statement.
Our study yielded surprising results. First, neither the prediction of the pragmatist view nor that of the semanticist view was met. Against the pragmatists’ prediction, the evaluation of a thick concept was significantly harder to cancel than the conversationally implicated content and resulted in higher contradiction ratings. This effect persisted for two different embeddings of thick terms (Behaviour and Character). Challenging the semanticist, the evaluation of thick concepts was significantly easier to cancel than the semantically entailed content.
Second, going beyond the philosophical dispute and each side’s respective predictions, we suspected that polarity (positive vs. negative) might play a role in how simple it would be to cancel the evaluation of a thick concept. We have not seen any suspicion along these lines in the metaethical literature, but given what we know from the experimental philosophy of morality, this was a possibility worth exploring. Our study revealed a strong polarity effect on contradiction ratings. For positive thick terms, contradiction ratings were significantly lower than those of negative thick terms as well as semantic entailments. This polarity effect is hitherto unknown and has not been predicted by any of the various accounts of thick concepts. In fact, the effect challenges the tacit assumption that thick terms and concepts form a homogeneous group of which we can ask broad questions about separability and how evaluation and description are connected.
Kevin Reuter and I happily admit that for the time being, we can only speculate as to why the polarity effect occurs. We outline several possibilities in our paper. Together with Lucien Baumgartner, we are in the process of exploring these explanations more systematically and we are also developing new experimental approaches to evaluative language, including other experimental paradigms and corpus linguistic approaches (see Reuter, Baumgartner & Willemsen, ms). The investigation of thick concepts is very much in its infancy. However, by applying a very simple experimental paradigm to sentences that so far have only been used as ‘thought experiments’ and philosophical ‘intuition pumps’, we have already empirically challenged two of the most prominent views on thick concepts, as well as the shared assumption that positive and negative thick concepts communicate evaluation in the same way. This seems to be enough to motivate a much larger empirical investigation of thick concepts and normative language more generally.
IV.
Independent of the linguistically-driven debate in metaethics, it has been argued that thick concepts possess an important connection to actions. This is what I have called the Action-Guidingness Question. Take as an example a friend telling you that your behaviour at the party last night was rude. In addition to simply communicating her disapproval, you might infer an even more far-reaching communicative goal: your friend does not want you to behave in the same way at the next party. What she tries to do is make you change your behaviour. Bernard Williams (1985) offered one of the earliest and most influential attempts to define thick concepts in terms of their potential to guide actions:
The way these notions are applied is determined by what the world is like (for instance, by how someone has behaved), and yet, at the same time, their application usually involves a certain valuation of the situation, of a person or actions. Moreover, they usually (though not necessarily directly) provide reasons for actions.
(Williams, 1985, p. 143 f.; own emphasis)
Many have adopted this suggestion. In addition to being endorsed by many scholars, it seems that the idea of thick concepts being connected to actions adequately captures what people mean to communicate by calling someone else ‘rude’ or ‘cruel’.
After speaking at length about how metaethics could profit from experimental results, it might not come as a surprise that I believe that even an idea this plausible requires empirical support. Ultimately, whether thick concepts have the disposition to guide actions is a matter of their psychological effects on people. Judith Martens and I have started to investigate the action-guidingness of thick concepts. We believe that to develop a proper metaethical theory of thick concepts and their relationship to actions, we need to understand a) whether there are circumstances in which thick concepts provide reasons for action, b) whether there are circumstances in which thick concepts do not provide reasons for action, and c) how these two classes of circumstances differ from one another.
As a first study (Willemsen & Martens, ms), we tested the idea that thick concepts have the disposition to provide reasons for action, especially when a person is reasoning about what would be best to do. Participants were presented with the following prompt:
“Sally is struggling with a decision on how she should act. The situation is tricky, and Sally has several alternative options. To decide which option she should choose, Sally makes a list of things that count against and in favour of each of these options, and also of things that speak neither against nor in favour of these options. Sally thinks about Option A and writes down ‘Doing this would be [term]’.”
The results suggest that in contexts of self-reflection when an individual is attempting to determine the best course of action, thick terms strongly count in favour of or against an option. Descriptive terms do not share this disposition in this context. It seems that philosophers have been right all along in their assumption that thick concepts have the potential to guide actions.
Building on Willemsen and Reuter (2020, 2021), Judith Martens and I also wondered how exactly reasons for actions are communicated. Again, there are at least two obvious options on the table. First, reasons for actions are communicated as a matter of the semantic meaning of thick concepts. Alternatively, uttering a thick term might simply conversationally implicate a reason for action. Judith Martens and I used a variation of the cancellability test to determine how reasons for actions are communicated. We asked participants to ‘Please imagine that Sally said the following sentence: “What Jim did last week was cruel, but by that, I am not saying that Jim should have acted in a different way.”’
To our surprise, and potentially that of many philosophers, we made two findings. First, the contradiction ratings for every single thick concept we tested were significantly below the neutral midpoint and even lower than the cancellability ratings of conversational implicatures. The action-related, reason-giving component of a thick concept seems to be only very loosely connected to a thick concept. Second, mirroring the findings Kevin Reuter and I reported, Judith Martens and I also found a significant polarity effect, such that negative terms received higher contradiction ratings.
Again, I do not pretend to have a sufficient grasp on what is going on here, and I believe that the journey into the exciting world of evaluative concepts has just begun. The reason I believe that this journey is worth travelling is that even in the first few steps, we have made exciting, surprising discoveries that suggest more questions and more answers.
* * * * * * * * * * * *
Commentary: Testing the Loaded Side of Language
Bianca Cepollaro
_________
I share Pascale’s enthusiasm for the new insights that an empirical approach to the philosophical study of language can offer, especially in the domain of what we may call loaded language, i.e. speech that does not only describe the world, but evaluates it. Within this broad field, there are entire areas of investigation that have been explored only relatively recently on theoretical grounds, let alone on experimental ones. As a matter of fact, the domain of loaded language encompasses not only moral discourse – on which Pascale’s post focuses – but also expressive speech, ranging from insults (jerk, bastard), interjections (shit!, fuck!), intensifiers (damn, fucking), to slurring terms, that is, derogatory words that target groups and individuals on the basis of their belonging to a certain category (think of racial and homophobic epithets, for instance). Many crucial questions arise around expressive discourse: what expressive speech is, how it works, how expressive content is encoded in language, what functions it fulfils with respect to the speaker and to their audience, in what relation it stands to morality (if any), when it should be censored (if ever).
In the very last few years, philosophers and linguists have started experimenting on expressive language, while until recently only psychologists and cognitive scientists have done so (see among others Bowers and Pleydell-Pearce 2011, Fasoli et al. 2015). This has been a pivotal turn in the study of expressives, just as much as for the investigation of moral terms discussed by Pascale: so far, scholars have solely relied on their own intuitions, assuming they are a reliable source of information. But of course philosophers’ linguistic intuitions can diverge very much, they are imbued with theoretical assumptions, and are ultimately a very biased source of information. In contrast, testing hundreds of untrained participants seems to be a more promising strategy to get a better picture of loaded language. Resorting to these empirical approaches has proved very interesting and fruitful for my own research on expressives, leading to surprising findings as to how the context affects the way in which slurs and insults are perceived (Cepollaro et al. 2019, Cepollaro et al. 2020).
However, having voiced enthusiasm for testing the loaded side of language, I feel like sounding a note of caution. When we experimentally investigate loaded language, we typically try to squeeze precise empirical predictions out of theoretical accounts. What we then find often doesn’t match any of those simplified predictions. Pascale mentioned how in two recent papers with Kevin Reuter (2020, 2021), they employed the cancellability paradigm in order to test semantic and pragmatic theories of thick terms. They found that the evaluative content of these expressions is – roughly speaking – harder to cancel than conversational implicatures (which goes against pragmatic views) and easier to cancel than semantically entailed contents (which speaks against semantic accounts). These findings are of great interest, but they leave room for some hermeneutical frustration nevertheless. In fact, an advocate of the semantic view can take these results as showing that the evaluative content of thick terms can’t be suspended like pragmatically implicated contents; they will then add that the reason why these evaluations are nevertheless easier to cancel than semantic entailments is that, in the absence of suitable non-evaluative counterparts, we are willing to force a non-literal reading of thick terms in utterances which would otherwise sound totally contradictory. A supporter of the pragmatic approach, on the other hand, can interpret these data as suggesting that the evaluative content of thick terms can be cancelled (it is much easier to suspend than semantic entailments), but since evaluations are so routinely associated with these expressions, it can be hard to get rid of them altogether. In other words, each theorist can stress the importance of a certain finding, by taking it as the primary and reliable result and then come up with a secondary mechanism that explains those results that fail to align with one’s favourite theoretical approach.
This is to say that these (and similar) studies should be seen as a preliminary exploration into a relatively unknown domain and – because of the subtleties of the matter – we should expect to often run into similar hermeneutical aporias. I’ve found myself in a similar situation too. In How Bad Is It to Report a Slur? (2019), together with psychologist Simone Sulpizio and philosopher Claudia Bianchi, we looked at how slurs are perceived in reported speech. Scholars disagree on whether a speaker who reports a slurring utterance is herself engaging in slurring (take an utterance like “My boss said that they aren’t going to hire a S” – where S is a slur, for instance a racial or homophobic epithet). This question is interesting for a bunch of reasons. First, it is a clue for understanding how slurs encode their pejorative content: is it semantically encoded or rather pragmatically conveyed? Second, whether slurs can be reported without being derogatory affects our online and offline language policies. Now, when armchair philosophers and linguists have examined their own intuitions, they came to diverging conclusions. According to some, when a speaker reports a slurring utterance, they – and not necessarily the reported speaker – are perceived as slurring. For these scholars, slurs should be banned not only from direct but also from reported speech (let’s call them prohibitionists; see Anderson and Lepore, 2013; Anderson, 2016). According to others, only the reported and not the reporting speaker is taken to be slurring; slurs don’t need to be banned from reported, but only from direct speech (let’s call them non-prohibitionists; see Schlenker 2007).
In order to shed some light on this debate between prohibitionists and non-prohibitionists, we asked participants to rate the offensiveness of utterances featuring slurs in two conditions: (i) direct speech of the form Y: ‘X is a S’, and (ii) reported speech of the form Z: ‘Y said that X is a S’, where X, Y, Z are proper names and S is a slur. We found that reported speech decreases the offensiveness of utterances featuring slurs, without entirely deleting it. Once again, these results do not exactly fit within any of the theories on the market. The prohibitionist could read these findings as showing that the pejorative content of slurs is ascribed to the reporting speaker and then appeal to a supplementary mechanism to explain why the derogatory power is nevertheless diminished: participants grant the possibility that the reporting speaker didn’t fully mean to slur and thus perceive slurs in reported speech as less derogatory than in direct speech. The non-prohibitionist, on the other hand, can take these very results as showing that the derogatory content of slurs is significantly diminished by report, and then appeal to a secondary mechanism to explain why it is not entirely cancelled: since competent speakers are supposed to know that slurs are tabooed to a certain extent, they are expected to avoid them in reported speech, or they will run the risk of sounding derogatory. Here comes the same pattern again: each theorist could in principle take one result as the primary and reliable finding and then come up with a secondary mechanism that explains those results that fail to align with one’s favourite approach.
Of course, conflicting interpretations are not entirely unconstrained: diverging readings of empirical results are typically not on a par, and any supplementary hypothesis concerning secondary mechanisms needs to be tested further and experimentally supported. However, these two cases of open-ended interpretation show that we should always keep in mind that, when we look into loaded language on empirical grounds, the matter at stake is so complex and full of subtleties that one study will neither prove nor disprove a theory by itself, but can at best suggest promising lines of investigation and further thought.
With this caveat (and many others) in mind, long live experimental philosophy of (loaded) language!
* * * * * * * * * * * *
References
_________
Anderson, L. 2016. “When Reporting Others Backfires.” In Indirect Reports and Pragmatics, edited by Capone, Alessandro, Kiefer Ferenc, and Franco Lo Piparo, 253–64. Cham: Springer International Publishing.
Anderson, L., and E. Lepore. 2013. “Slurring words.” Noûs 47, no. 1: 25–48.
Bowers, J.S., and C.W. Pleydell-Pearce. 2011. “Swearing, euphemisms, and linguistic relativity.” PLoS ONE 6, no. 7: e22341.
Cepollaro, B., S. Sulpizio, and C. Bianchi. 2019. “How bad is it to report a slur? An empirical investigation.” Journal of Pragmatics 146: 32–42.
Cepollaro, B., F. Domaneschi, and I. Stojanovic. 2020. “When is it ok to call someone a jerk? An experimental investigation of expressives.” Synthese.
Fasoli, F., A. Maass, and A. Carnaghi. 2015. “Labelling and discrimination: Do homophobic epithets undermine fair distribution of resources?” British Journal of Social Psychology 54, no. 2: 383–93.
Reuter, K., Baumgartner, L. & Willemsen, P. (ms). Tracing Thick Concepts Through Corpora.
Schlenker, P. 2007. “Expressive presuppositions.” Theoretical Linguistics 33, no. 2: 237–45.
Willemsen, P., Martens, J. (ms). Do Thick Concepts Provide Reasons for Action?
Willemsen, P., Reuter, K. (2020). Separability and the Effect of Valence. In Denison, Mack, Xu, Armstrong (Eds.), Proceedings of the 42nd Annual Conference of the Cognitive Science Society 2020, pp. 794–800.
Willemsen, P., Reuter, K. (2021). Separating the Evaluative from the Descriptive: An Empirical Study of Thick Concepts. Thought: A Journal of Philosophy.
Williams, B. (1985). Ethics and the Limits of Philosophy, Cambridge, MA: Harvard University Press.
Thanks for these really interesting thoughts, Pascale and Bianca!
Pascale, I was interested in your application of a cancellability test to the question about the action-guidingness of thick terms — whether such terms give reasons in virtue of their semantics or via pragmatic implicature. You explain that you and Martens found that contradiction ratings were very low, suggesting that the “action-related, reason-giving component of a thick concept seems to be only very loosely connected to a thick concept.” I wanted to offer a potential reply on behalf of the theorist who thinks that thick terms do communicate reasons for action through their semantics. Someone could argue that the reasons communicated by the meaning of a thick term are pro tanto reasons, not all-things-considered reasons. If that’s the case, it’s no surprise that it’s not hard to cancel the action-guidingness of a thick term. By saying something like, “What Jim did was cruel, but I’m not saying Jim should have acted differently,” you’re suggesting that Jim had other competing (and ultimately stronger) pro tanto reasons to do what he did. So, low contradiction ratings don’t provide evidence against the semantic hypothesis.
Maybe this is another example of what Bianca points out in her insightful commentary: there will often be conflicting interpretations of results in this area! The reason it struck me as interesting, though, is that it suggests that cancellability tests might be limited in what they can tell us about the semantics of thick terms (assuming that the evaluative component has a contributory or pro tanto character). I’d be interested to hear either of your thoughts about this.
And second, Pascale, a self-serving question (given my own interests): I’m curious about (4) The Variability Question, which you didn’t have time to discuss. Could you give a quick gloss on what that’s about?
Dear Zina,
thanks for your comments and questions. Very good points!
I believe you’re absolutely right to worry about a risk of misinterpretation of our test sentences, and I think that your suggestion is extremely plausible. Judith had very similar concerns, and we are working on ways of testing pro-tanto and all-things-considered goodness and badness more directly. To be honest, this is quite a challenge if the sentence is still supposed to sound natural and be understandable without a lot of introduction. Judith and I would be very grateful for your suggestions on how to discriminate these two kinds of reasons from each other.
Here is at least a bit of evidence that you might find helpful. In our study, Judith and I did not only test formulations like “What Jim did was cruel, but I’m not saying Jim should have acted differently”. This formulation, we believe, leaves a lot of room for the kind of interpretation you suggest. But now suppose that we change the embedding of the thick term so that it no longer qualifies the agent’s behaviour but their character: “Jim is a cruel person”. We believe that this is a very natural way to use thick terms, and it is at least one that philosophers have been interested in. It might be true that cruelty provides a pro-tanto reason not to perform a certain action, but the situation can sometimes call for a cruel action anyway – possibly because all other options are even worse. This interpretation should not be available if we say that a person is cruel in general. As a consequence, “Jim is a cruel person, but by that I am not saying that Jim should change in this respect” should sound highly contradictory if a semantic interpretation of reasons for action were correct.
Our results do not support this prediction (you can find the results in the slides of a talk we recently gave, https://www.pascalewillemsen.com/wp-content/uploads/2021/03/Empirically-Investigating-thick-concepts.pdf, slides 40 and 41). Contradiction ratings are even lower for the character condition for negative terms (Behaviour: 3.57; Character: 2.54), and they are roughly the same for positive terms (Behaviour: 2.16; Character: 2.26). I believe this makes the pro-tanto vs. all-things-considered worry a bit less likely to explain the results. Or at least I don’t see how this explanation can account for the results.
We also asked participants to explain their contradiction ratings. My impression might be biased, as we did not ask an independent coder to take a look at the responses, but I believe it is fair to say that we did not get any explanations that suggest a shift of attention from a pro-tanto to an all-things-considered evaluation. Personally, I would expect explanations like “Sometimes you just have to be cruel to help others” or “The truth might be considered rude, but that’s not a reason to lie”. We didn’t find responses like that.
I believe that this can hardly be the end of the story. I think that your comment, Zina, is indeed an extension of Bianca’s very insightful warning about the reliability and testability of linguistic intuitions. Knowing what’s going on in people’s heads is difficult. Also, it seems that we can often make sense of extremely weird and even contradictory utterances. If I say “Tom is a bachelor, but by that I’m not saying that he is unmarried”, I’m contradicting myself – at least at the level of what is said. However, you might interpret this statement as being ironic or sarcastic. You might also believe that what I really try to communicate is that even though Tom is married and not a bachelor in the literal sense of the word, he still behaves as if he were a bachelor. So far we still lack evidence on how our test sentences are interpreted and, possibly, contextually enriched. In addition to our efforts to shed more light on these and related questions, I would like to suggest the application of more diverse methods, such as vignette-based experiments in which more context is provided or corpus-linguistic approaches. I strongly believe that philosophers who have already committed themselves to any theoretical position should not use our data and interpret them in whatever way they like. The fact that no clear evidence can be provided in either direction is just all the more reason to run additional experiments and to dig deeper. The data being messy and confusing can never be a reason to just take from it whatever suits you and then stop investigating.
Thanks a lot for your question about Variability. Happy to elaborate!
The Variability Question is very closely connected to the Location Question. Although so much is unclear about how to define thick terms and concepts, the field generally agrees that they somehow communicate evaluative content. However, philosophers disagree about whether thick concepts are variable concerning the polarity – positive or negative – of the evaluations they communicate. In order to be a thick concept, is it necessary to communicate a specific evaluation with a specific polarity or simply any evaluation with indeterminate polarity?
According to the standard view, a thick concept such as ‘cruel’ necessarily and exclusively communicates a negative evaluation as part of its semantic meaning. ‘Cruel’, so the standard view holds, is a single-attitude concept (Väyrynen, 2019). ‘Cruel’ communicates a negative evaluation, and never a neutral or a positive one. The standard view holds that thick concepts are invariable in this sense. On the other hand, a minority position holds that thick concepts are variable-attitude concepts and can communicate different evaluations in different contexts (Blackburn, 1992; Hare, 1982). Take as an example what has been called an objectionable concept, such as ‘lewd’. A thick concept is objectionable if it carries an evaluation with a polarity that a speaker finds unfit. For instance, the statement that ‘Rihanna’s show is lewd’ carries a negative evaluation among people with a certain sexual morality prohibiting overt sexual displays. However, another group of people who do not take issue with overt sexual displays might believe that ‘lewd’ should not communicate a negative evaluation – they consider lewdness something neutral or even worthy of approval. Examples like these suggest that at least some terms do not have lexically fixed, invariable evaluations. Even advocates of the standard view recognise the existence of objectionable thick concepts which seem to have some variability (Väyrynen, 2011, 2019), although this variability is considered an exception that can be explained (away). However, philosophers such as Blackburn doubt that variability is just an exception. He suggests that we find this sort of variability not only with objectionable thick concepts. Instead, all thick concepts, including the most prototypical ones such as ‘cruel’, ‘rude’, ‘honest’, and ‘friendly’, are evaluatively variable (Blackburn, 1992).
So far, we lack empirical evidence for whether thick terms can carry different evaluations depending on context. The examples discussed in the literature make an intuitively convincing case that there are at least some, but the extent to which thick concepts are variable is unclear. In addition, based on this lack of evidence, we do not know yet how many concepts of ‘cruelty’ there actually are – perhaps it is just one concept that always carries a negative evaluation, perhaps there are two, cruelty(negative) and cruelty(positive). Such evidence would not only directly answer the Variability Question, but would also inform the Centrality, Location, and Separability Questions in important ways.
Thanks, Pascale, really interesting! I think your shift to character evaluations rather than action evaluations to avoid the problem is very clever. I guess I’m not sure if that ends up changing the target too much, since adjectives that describe character might have a different relationship to (reasons for) action than adjectives applied to action. But I would have expected much higher contradiction ratings, so I’m intrigued and surprised by your results! I wonder if people are telling themselves some sort of story to make the utterance make sense – e.g., for the example on p. 40, “Well, maybe Sally loves Jim, and so even though she recognizes his flaws, she doesn’t want him to change – so it need not be contradictory.” But maybe that’s grasping at straws on behalf of the proponent of the semantic view… I totally agree about the importance of applying more diverse methods, like vignettes with extra context, to these sorts of issues.
Thanks for elaborating on the variability question! Presumably there also might be terms that communicate evaluative content (either negative or positive) for some, but no such content (evaluatively neutral) for others.
Thanks for your thoughts, Zina. I think you are absolutely right that people will enrich these very isolated and contextually poor sentences with some kind of background story. This often makes the interpretation of results tricky. However, I would like to stress that we are just at the beginning of our research. As a starting point, we decided to follow the strategy that most philosophers have relied on in their theoretical, intuition-based work, namely to use as little contextual information as possible. Why would that be promising?
Philosophers who discuss the characteristics of thick concepts are usually interested in the stable, context-independent features of thick concepts. Of course, there might be cases in which we evaluate a person’s character as selfish or rude or egoistic, but we do not want that person to change. Sometimes, as you say, we can accept a person’s flaws because we love this person or because this little bit of rudeness is what makes this person so interesting and fun to be around. Alternatively, you might not want to communicate that a rude person ought to change, because you are not in the position to make demands like that. We usually accept criticism from family members and close friends but might find it very weird to be told how to behave by a total stranger. So many situational factors might affect whether a statement is understood as reason-giving.
However, it might be argued that these factors merely overwrite reasons for action that would otherwise be communicated. The more context we provide, a critic might worry, the more people will respond to the contingent situational factors, and the less we can learn about the stable, context-independent features of thick concepts. Personally, I believe that this is a risk we should take. The next step in our research is to provide more contextual information, systematically manipulate factors that might affect people’s intuitions, and see what happens.
Yes, I totally agree that experimental work is essential here, despite interpretive challenges. Thanks for your replies — I’m very much looking forward to seeing what else comes out of this research!