We’re not new to studying women in science. Nor are we insensitive to the prejudice some women experience in academic science.
For the past decade, we have researched some of the challenges female scientists face that their male counterparts often don’t, such as balancing work and life demands like child care (see our 2012 article in American Scientist). Our guiding principle has been to follow the data wherever they lead. We have found, for example, that women and men have comparable rates of success with grant and article submissions, that affirmative action doesn’t lead to a preference for less-competent women (forthcoming), and that women have a harder time getting tenure in biology and psychology and are less satisfied with their jobs than men in the social sciences.
However, none of those findings have aroused the passions that our most recent research has. In a recent issue of the Proceedings of the National Academy of Sciences, we published an article on data from five national studies that took us to an unexpected destination. The data showed that, in tenure-track hiring, faculty prefer female job candidates over identically qualified male ones.
Because that finding runs counter to claims of sexist hiring, it was met in the news media and in academe with incredulity and often panic. We have responded to those criticisms in five pieces in the Huffington Post (parts one, two, three, four, and five), as well as another essay in American Scientist and one on the website of the Society for Personality and Social Psychology.
Some critics saw in our findings a disavowal of their own experiences with academic sexism. Even though our study examined only entry-level hiring, they viewed it as invalidating biases they faced outside the hiring context and as an attack on their advocacy for women. But data from multiple studies using different methods kept revealing the same striking preference for hiring women. So we reported the empirical data, hoping to generate an honest, productive dialogue about modern discrimination in the academy. Since hiring is no longer a roadblock, where else might we need to direct efforts and advocacy to help more women succeed?
In the latest critique of our results, Joan Williams, a law professor, and Jessi Smith, a psychology professor, claimed that our hiring study was "plagued by five serious methodological flaws" that negated our conclusions. None of their claims are valid. Let’s examine them individually.
Claim: We ignore sexism before and after hiring. Their first complaint is that sexism in science is not limited to the point of hiring. Williams and Smith argue that women face stereotypes related to motherhood and must walk a tightrope between being seen as too feminine to be competent and too masculine to be likable. We agree, and said explicitly that these recent findings focused only on entry-level hiring in STEM fields at the rank of assistant professor.
Our study did not examine, nor did it claim to examine, bias in early gender socialization, high school, college, graduate school, mentoring, tenure, promotion, persistence, job satisfaction, or remuneration. However, last year we (along with two colleagues) conducted more than 200 analyses of those possible types of bias, finding that gender differences in them were, with notable exceptions such as the ones we mentioned above, small or nonexistent (see our lengthy report in Psychological Science in the Public Interest). Thus, Williams and Smith criticize our current study for something it was not designed to test, and ignore hundreds of analyses we have published elsewhere in peer-reviewed journals and books on the very issues they say we ignore.
Claim: How could we do valid work if we aren’t social psychologists? Williams and Smith claimed we were unqualified to conduct this research because we are not card-carrying experimental social psychologists, and that, they argued, led us to make critical mistakes in the design of our experiments. Williams and Smith also criticized the way we portrayed our hypothetical finalists for tenure-track jobs as outstanding, which they said "conveniently eliminated" the bias that confronts more ordinary people.
They are wrong on both counts. We have both taught and conducted research in social psychology for decades, published experimental social-psychology studies — and back in the Pleistocene, one of us (Williams) got a Ph.D. in social psychology from Yale University, advised by three eminent social psychologists. But the attempt to discredit our work because of presumed guild membership is just as inappropriate as it would be for us to impugn the first author of the commentary because her training is in law rather than behavioral science. The social sciences are interconnected, often sharing statistical methods and concepts to examine gender discrimination.
Throughout our research, we have sought advice from numerous social psychologists, several acknowledged in our article. Our study was reviewed by seven experts — six of whom appear to be social psychologists (the seventh is a statistician). The editor is one of the most eminent living social psychologists (and a member of the NAS), and he provided an eighth full-length review. That many reviews of a scholarly article is exceptional for a journal in which the average number of reviewers is two.
Thus, our study was subjected to an unusually high level of scrutiny by social psychologists, some of whom praised its methods as "heroic" and trendsetting. Our methods included the national randomized sampling of 872 tenure-track faculty members from 371 colleges and universities across all 50 states, and the use of sample weights to control for nonresponse — two things that are rare in experimental social psychology, where the modal practice is nonscientific college-student convenience samples from, at best, a few institutions.
So, in making this complaint, Williams and Smith must acknowledge the burden they have taken on: They are calling into question not only our competence, but also that of every consultant, reviewer, and statistician who scrutinized our methods and replicated our results. Do they wish to extend the charges leveled at us to these scholarly colleagues? Are they implying those experts missed flawed concepts and methods in our research?
What about the claim that we described the hypothetical finalists for tenure-track positions in our study as outstanding? We did that because shortlisted candidates usually are outstanding — having risen to the top of a pool of 50 to 300 applicants, successfully completed doctorates, published multiple articles, and garnered strong letters. Indeed, in one of our experiments we used the CVs of actual recent finalists for assistant professorships. We did not alter those documents to make people look like superstars; we used the CVs as they were. And still, in our study, faculty preferred the CV with a woman’s name on it.
We also wrote that evidence shows that when bias occurs it is typically when candidates are not stellar, such as in hiring of undergraduates whose competence was depicted ambiguously: "Thus, one possible reason the present study did not find anti-female bias among faculty raters of tenure-track applicants is because the applicants for the tenure-track positions are, in fact, unambiguously excellent: The three shortlisted applicants were described as possessing the sort of competence that shortlisted candidates for tenure-track positions usually possess according to faculty respondents." Respondents told us there weren’t enough positions to hire all of the outstanding candidates in their searches, and that the top 20 to 30 (out of hundreds of applicants) were excellent. In short, to describe finalists as less than outstanding would have been unrealistic.
Claim: Study participants responded in a socially desirable manner. The next criticism Williams and Smith made was that people in our study faked their responses to mask their sexism. That is an issue from Experimental Design 101, and we conducted a number of tests aimed at detecting it (we detailed those tests in the online supplement the commentators apparently missed). As one example, 30 faculty members were asked to guess the purpose of the study, and none guessed the real purpose. Also, in one experiment, participants were sent only one applicant to rate (with no explicit comparison), yet they still preferred women to identical male candidates. Although there was an overall preference to hire women, there were circumstances in which the preference switched, or did not arise at all. If the design were transparent, how could critics explain those reversals? All of these empirical findings refute the claim that faculty participants altered their responses to hide their sexism.
Furthermore, as we wrote, in large-scale, real-world audit studies involving actual hires, women are more likely to be offered jobs than men. Apparently, if faculty in our experiments were attempting to fake their gender-egalitarianism, then faculty making real-world hiring decisions are faking theirs as well.
Claim: It’s the message, not the method. Fourth, these critics claimed we fatally mixed up too many variables to separate out the impact of each: "Williams and Ceci do something no social psychologist would do. They compare scientists in a design that varies across five axes. … It is impossible to tell which of the factors they included in producing their result. The study design means we just don’t know. Social psychologists call that the problem of ‘confounding variables.’ What we have here is not clear findings; it’s mud."
We are well aware of the statistical requirements for assigning causal power to specific variables. We know which variables are responsible for the results — and so did the eight reviewers, one of whom is a senior statistician; as did our group of advisers who include five statisticians, all of whom we acknowledged for their statistical advice; as did a statistician who requested our raw data and replicated our findings.
Although our critics appear unaware of it, the experimental strategy we used is quite common, is taught in many research-methods courses, and is uncontroversial across the behavioral sciences. There was ample studywide statistical power to support the analyses described in our paper.
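The logic behind that strategy is easy to illustrate. When each attribute of a hypothetical candidate is randomized independently of the others — a fully crossed design — the factors are uncorrelated by construction, so a simple comparison of means recovers each factor’s own effect; varying several axes at once is not "confounding." The following sketch is a hypothetical simulation for illustration only (the effect sizes and the second attribute are invented, not our actual data or design):

```python
import random

random.seed(0)

# Hypothetical effects, chosen only for illustration.
TRUE_GENDER_EFFECT = 0.5   # assumed boost for female candidates
TRUE_OTHER_EFFECT = 0.3    # assumed effect of some second attribute

# Simulate ratings in a fully crossed design: each factor is
# randomized independently, so the factors are uncorrelated.
ratings = []
for _ in range(20000):
    female = random.randint(0, 1)   # factor 1, randomized
    other = random.randint(0, 1)    # factor 2, randomized independently
    score = (5.0 + TRUE_GENDER_EFFECT * female
                 + TRUE_OTHER_EFFECT * other
                 + random.gauss(0, 1))
    ratings.append((female, other, score))

def mean_diff(rows, idx):
    """Mean rating when factor idx == 1 minus when it == 0."""
    hi = [r[2] for r in rows if r[idx] == 1]
    lo = [r[2] for r in rows if r[idx] == 0]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

# Each simple mean difference recovers its own factor's effect,
# uncontaminated by the other factor.
gender_est = mean_diff(ratings, 0)
other_est = mean_diff(ratings, 1)
print(round(gender_est, 2), round(other_est, 2))
```

Because assignment is random and the factors are crossed, each estimated difference lands near its true value regardless of how many other attributes vary across the vignettes — which is why reviewers and statisticians found no confounding in the design.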
If our study had revealed anti-woman hiring attitudes, would all of those methodological critiques and questions have been raised? We suspect the real objection of Williams and Smith was to our message: that tenure-track hiring favors women.
Unfortunately, we cannot control the empirical results of our studies. We share the concerns about women’s opportunities in STEM fields. However, we believe it is more productive to create an accurate dialogue about academe today, to understand where women may be flourishing, where they may face roadblocks, which aspects of women’s careers are subject to discriminatory practices, and the ways we — as teachers, mentors, and scholars — can contribute to a continuing culture of success for both genders.
Finally, Williams and Smith accuse us of offering "comfort food" rather than serious scholarship: "We know it is comforting to believe that sexism in science is over, and … women are now the preferred item on the menu. Fine, whatever: Enjoy your comfort food. Just don’t call it scholarship." As noted, we have written numerous articles and books, and more often than not concluded that barriers for women in the sciences still exist — but not in entry-level hiring.
In their zeal to impugn our methods and analyses, these commentators invoked the specter of methodological flaws to dismiss a message, ratified by real-world hiring, that they seem to find personally threatening. Until there is full gender-fairness, we cannot enjoy any "comfort food." In the interim, we hope our critics realize there is plenty of crow to eat.