Recently there has been a lot of discussion in our profession (e.g., on philosophy blogs) about the underrepresentation of women in philosophy. Most of the proposed solutions have focused on the underrepresentation of female graduate students and professors. Hiring practices are being revised, conferences with more female speakers are being advocated, climate surveys are being given to faculty and graduate students, and sexism and sexual harassment are being called out. (Needless to say, we think these are important developments.) However, less attention has been paid to the underrepresentation of women at the undergraduate level, especially before students choose a major. This lack of attention is problematic, since a recent study by Paxton, Figdor, and Tiberius (2012) shows that the most significant drop-off for women in philosophy is between Intro courses and majoring. Given that fewer than one third of philosophy majors are women, we must address this underrepresentation in order to increase the proportion of female grad students and professors in philosophy.
In “Gender and Philosophical Intuition,” Wesley Buckwalter and Stephen Stich (2010) offered a partial explanation of the underrepresentation of women in philosophy that focused on the undergraduate level drop-off. They presented evidence that women have different intuitions than men about thought experiments typically used in Intro courses. And they proposed that instructors may treat female students’ intuitions as “incorrect” because they differ from mainstream accepted philosophical views. Buckwalter and Stich explain: “[t]he more courses a woman takes, the more likely it is that she will be exposed to thought experiments on which her intuitions and those of her instructor diverge – and the more likely it is that she will decide not to take another course” (2010: 29). So, on Buckwalter and Stich’s hypothesis, the underrepresentation of women in philosophy is partially explained by the weeding out of those students with “incorrect” intuitions. Their view has received a lot of attention (and some responses). We had our doubts about their hypothesis, decided to test it more fully, and found that our doubts had merit.
In their paper, Buckwalter and Stich provide numerous examples of gender differences on thought experiments, though they do not propose that these gender differences are due to biological differences. Their evidence for gender differences in philosophical intuitions was garnered from philosophers and psychologists contacted by Buckwalter and Stich as well as from thought experiments they tested themselves.
Although Buckwalter and Stich’s hypothesis seems well supported by the evidence they provide, we question their methodology. First, for the data solicited from other philosophers, Buckwalter and Stich do not report the total number of measures checked (including those without gender differences), or whether they obtained information about studies that showed no gender differences. So, their results carry a high risk of Type I errors. Suppose there were 100 measures total in the sets of surveys among the experimenters solicited, and these experimenters reported to Buckwalter and Stich that 5 of those measures showed gender differences. With a standard Type I error rate of 0.05, we would expect about 5 of the 100 measures to show significant gender differences by chance alone, even if no real differences exist. Buckwalter and Stich did not use any statistical technique for adjusting the significance level in order to account for multiple comparisons. So it is hard to know whether the gender differences in the solicited data are due to actual gender differences or to chance. In addition to their presentation of the solicited data, Buckwalter and Stich present four gender differences from their own studies. However, even on these four thought experiments there appears to be no statistical correction for multiple comparisons. Further, Buckwalter and Stich seem to have tested more than four thought experiments while looking for gender differences. Their own series of thought experiments on Amazon’s M-Turk had a total of 1,836 subjects, but they report only four cases with gender differences, which account for only 384 of the subjects. If each case used around 95 participants, it is likely that Buckwalter and Stich ran 15 to 20 studies, of which only four were reported as showing evidence of gender differences. It is unclear how many of the putative gender differences would be statistically significant had they accounted for multiple comparisons, assuming they checked for gender differences on all 15 to 20 measures.
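To make the multiple-comparisons worry concrete, here is a minimal Python sketch of the arithmetic. It assumes, purely for illustration, that the tests are independent and that no real gender differences exist. The function names are hypothetical and are not taken from Buckwalter and Stich's analysis.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of tests that come out 'significant' by chance alone,
    assuming every null hypothesis is true."""
    return n_tests * alpha


def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive across n_tests,
    assuming the tests are independent and every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests


if __name__ == "__main__":
    # The hypothetical 100-measure example from the paragraph above.
    print(expected_false_positives(100))
    # The estimated range of 15 to 20 M-Turk studies.
    for n in (15, 20):
        print(n, round(familywise_error_rate(n), 3))
```

Under these assumptions, 100 measures tested at the 0.05 level yield 5 expected false positives, and a battery of 15 to 20 uncorrected tests has roughly a 54% to 64% chance of producing at least one.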
Moreover, even if there are some gender differences on thought experiments, there are at least three reasons to doubt that women drop out of philosophy because of these differences. First, even in cases where Buckwalter and Stich report a small but significant gender difference in responses, it is unclear that the difference implies that men and women actually make different judgments. For example, they report a gender difference on the Plank of Carneades thought experiment, in which one shipwrecked sailor, Ricki, pushes another sailor, Jamie, off a plank that could not support both sailors. On a seven-item scale, women attributed a greater degree of blameworthiness to Ricki than men did. Yet both women and men agreed that Ricki is morally blameworthy. Second, in many cases, there is no accepted philosophical intuition. Buckwalter and Stich present a gender difference in intuitions about Compatibilism regarding free will and determinism. However, there is heated debate between Compatibilists and Incompatibilists, and it is unlikely that most instructors present one side of the debate as correct. Third and finally, when there is an intuition accepted by the philosophy profession, the gender difference reported sometimes suggests that it was the women who reported the accepted intuition. Take the thought experiment on Putnam’s Twin Earth: women are reported to be less likely than men to agree that Oscar and Twin-Oscar mean the same thing when they say ‘water’, and the accepted view, following Putnam, is that they do not mean the same thing.
But, in any case, we wanted to test whether any of the just-described gender differences were genuine. In an attempt to replicate Buckwalter and Stich’s findings, we gave a survey to over 300 critical thinking students at Georgia State University (see here for summary of findings). Note that our sample is more representative of the population of undergraduates taking philosophy for the first time than Buckwalter and Stich’s M-Turk sample. We re-ran nearly all of the thought experiments discussed in their paper, using the same wording, but we found a gender difference in only one case (women were more likely than men to agree that George knew he was not a virtual reality brain, which is consistent with Buckwalter and Stich’s report). Yet, when we performed a Sidak correction for multiple comparisons, that gender difference was no longer significant. Our colleague and statistics aide, Sam Sims, has conducted a power analysis for our replication. Our study has an 80% chance of detecting the effect of gender on responses to any given thought experiment, provided that gender explains at least 9% of the variance in responses. Given the concerns expressed above, we doubt that any smaller gender effects, even if they exist, would be enough to contribute to women leaving philosophy.
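For readers unfamiliar with the Sidak correction, the following Python sketch shows how it works in general. The number of comparisons (10) and the p-value (0.03) in the usage lines are hypothetical placeholders chosen for illustration; they are not figures from the replication described above, and the function names are likewise assumptions.

```python
def sidak_alpha(n_comparisons, alpha=0.05):
    """Per-test significance threshold that keeps the familywise error
    rate at alpha across n_comparisons independent tests."""
    return 1 - (1 - alpha) ** (1 / n_comparisons)


def sidak_adjusted_p(p_value, n_comparisons):
    """Sidak-adjusted p-value, to be compared against the original alpha."""
    return 1 - (1 - p_value) ** n_comparisons


def is_significant(p_value, n_comparisons, alpha=0.05):
    """True if the raw p-value survives the Sidak correction."""
    return p_value < sidak_alpha(n_comparisons, alpha)


if __name__ == "__main__":
    hypothetical_p = 0.03   # placeholder, not a reported result
    hypothetical_m = 10     # placeholder number of thought experiments
    print(sidak_alpha(hypothetical_m))
    print(is_significant(hypothetical_p, 1))               # uncorrected
    print(is_significant(hypothetical_p, hypothetical_m))  # corrected
```

The point of the sketch is only that a result which clears the 0.05 threshold on its own can fail to clear the stricter per-test threshold once the number of comparisons is taken into account.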
Because we greatly appreciate Buckwalter and Stich’s attempt to find explanations for the early drop-off of women in philosophy, we decided to look for others. We developed a climate survey for undergraduates (over 700 in Intro to Philosophy at Georgia State) to look for other explanations for why women say goodbye to philosophy so early in the game. We found many differences between genders (and also between white and black students, another issue that needs to be addressed) in their perceptions of their Intro class, some of which provide clues for where to look further (e.g., the number of women on the syllabi, as discussed here and here). One interesting finding from our results is that students’ perception of themselves as having different opinions from their classmates is a “partial mediator” between gender and the intent to persist in philosophy. That is, students’ intentions to persist in philosophy are partly driven by whether they perceive themselves as having different opinions from other students, and whether they perceive themselves as having different opinions from other students varies by gender. Buckwalter and Stich’s hypothesis seems to suggest that women are less likely to persist in philosophy in part because they are less likely to perceive themselves as having similar opinions to their classmates. However, our data suggest that women are less likely to persist in philosophy in part because they are actually more likely to perceive themselves as having similar opinions to their classmates. This finding seems to provide evidence against Buckwalter and Stich’s (2010, 34) claim that “differences in intuition tout court” make one less likely to continue in philosophy.
We have some emerging hypotheses for why women, and black students, don’t go into philosophy right from the start. But it’s likely going to be a complicated story with a lot of contributing causes. We hope that more effort will be made to understand these causes and, where appropriate, to counteract them. We suspect that most of these efforts will make undergraduate philosophy courses more relevant, useful, and enjoyable for all students.
-Toni Adleberg,* Morgan Thompson,* and Eddy Nahmias, Georgia State University
* Primary (and equal) authors of this work (including this blog post), both GSU MA 2013