Judging the quality of a research paper by the rating of the journal that published it is a very poor measure of scientific merit. And yet, a study of more than 6,500 published papers has concluded, the leading alternatives are even worse.
The analysis, published on Tuesday in PLoS Biology, considered whether it’s better to judge an article by the number of citations it receives or by the average number of citations for articles in the journal in which it appears. The comparisons also included two sets of postpublication reviews by subject experts.
All three of those methods showed serious deficiencies, wrote the two authors, Adam Eyre-Walker, a professor of biology at Britain’s University of Sussex, and one of his former postdoctoral students, Nina Stoletzki.
“None of the measures of scientific merit that we have investigated are reliable,” they wrote.
Their analysis was accompanied by a commentary by Jonathan A. Eisen, a professor of evolution and ecology and of medical microbiology at the University of California at Davis, and two co-authors, who call for more attention to how scientific merit is evaluated.
Mr. Eyre-Walker and Ms. Stoletzki have produced one of the most comprehensive assessments yet of how reliably research can be evaluated, said Mr. Eisen and his co-authors, Cameron Neylon and Catriona J. MacCallum, who both work in advocacy roles at PLoS, the Public Library of Science. But the Sussex researchers gave in too easily when they suggested that a journal’s average citation rating—known as its impact factor—is the best of the available options, Mr. Eisen and his colleagues said.
“We have sequenced the human genome, cloned sheep, sent rovers to Mars, and identified the Higgs boson (at least tentatively); it is surely not beyond our reach to make assessment useful, to recognize that different factors are important to different people and depend on research context,” they wrote.
Influenced by Reputation
For their project, Mr. Eyre-Walker and Ms. Stoletzki analyzed 716 papers on research financed, at least in part, by the Wellcome Trust, and 5,811 papers from the database of Faculty of 1000, a company that uses leading scientists and clinicians to rate the quality of published journal articles. All of the papers date from 2005, and both databases included postpublication reviews of the articles by panels of experts. The authors also reviewed citation data for each paper and the impact factors for the journals in which they appeared.
None of the measures—postpublication assessment, citation data, or impact factor—gave consistent information about the quality of the papers, Mr. Eyre-Walker and Ms. Stoletzki wrote.
Citation rates appeared to have little correlation with expert assessment; expert assessment appeared to be heavily influenced by impact factor; and impact factor itself appeared to be based largely on subjective measures, the researchers said.
It’s not immediately obvious what might be a better tool for scientists hoping to identify the best work by colleagues, for universities making hiring and promotion decisions, and for governments trying to allocate research dollars, Mr. Eyre-Walker said in an interview.
“It’s difficult,” he said. “We currently don’t have any very good ways to measure merit, and it’s not very clear what we should do in the future.”
Mr. Eisen was less pessimistic, saying Mr. Eyre-Walker and Ms. Stoletzki “greatly underestimate the potential for postpublication review.” And for that, he said, the growth of open-access journals could play a critical role, as freely available articles could be subject to reviews by an unlimited range of experts without any publisher-imposed time constraints.