Scientific-misconduct accusations are leading to retractions of high-profile papers, forcing reckonings within fields and ending professorships, even presidencies. But there’s no telling how widespread errors are in research: As it is, they’re largely brought to light by unpaid volunteers.
A program launching this month is hoping to shake up that incentive structure. Backed by 250,000 Swiss francs, or roughly $285,000, in funding from the University of Bern, in Switzerland, it will pay reviewers to root out mistakes in influential papers, beginning with a handful in psychology. The more errors found, and the more severe they are, the more the sleuths stand to make.
The tech industry has long paid bounty hunters to unearth bugs in code, but the scientific enterprise has not had an equivalent — to its detriment, many say.
“When I build my research on top of something that’s erroneous and I don’t know about it, that’s a cost because my research is built on false assumptions,” said Malte Elson, a psychologist at the University of Bern who is leading the new program with Ruben C. Arslan, a postdoctoral researcher at the University of Leipzig, in Germany.
About 20 percent of genetics papers that contain Microsoft Excel lists of genes are thought to have errors introduced by the software, and an estimated one in four papers in general science journals contains incorrect citations. Errors can be unintentional, but 2 percent of surveyed scientists admit to the more serious charges of fabricating or falsifying data. In just the last year, researchers at the Dana-Farber Cancer Institute, Harvard Medical School, Stanford University, and the University of Rochester, to name a few, have faced scrutiny over their work.
Peer reviewers for journals are primarily tasked with evaluating how original and important a finding is, not how accurate it is. So once a paper is out, mistakes tend to be surfaced by scientists scouring the literature on their own time — and at their own risk. The behavioral scientist Francesca Gino has filed a $25-million defamation lawsuit against a trio of professors who reported finding data fabrication in four of her papers, concerns that led to the retraction of those papers and prompted Harvard Business School to place her on unpaid administrative leave. (Gino has denied ever falsifying data.)
Over the next four years, the ERROR program — short for Estimating the Reliability and Robustness of Research — aims to pay experts to scrutinize 100 widely cited papers in their areas of technical or subject expertise. Psychology will be first up, but the organizers hope to branch out to other subjects, like economics, political science, and medicine.
Errors can take many forms, from discrepancies between how experiments were conducted and how they were reported to mismatches between analyses and conclusions. Some errors may be clear miscalculations, and others more subjective and context-dependent, the organizers acknowledge, so reviewers will be allowed to decide how to look for them. They’ll also be allowed to ask the authors for help in fact-checking. Each reviewer will produce a report of any errors found, which will eventually be posted publicly.
An ERROR staffer overseeing the process, known as the “recommender,” will review the report before it is sent to the authors, who can respond. The recommender will then write a summary of the alleged concerns and suggest a course of action, which could include correcting or retracting articles with major errors.
A crucial caveat: A paper will be reviewed only if its authors agree. That’s because without full access to the underlying data, code, and other materials, there will always be questions the reviewer cannot answer, Elson said. “At this point, many people will be skeptical, and they will maybe rightfully think they can only lose if they say yes to this — all they do is put their paper at risk,” he said.
On the other hand, the prospect of a reputational boost may attract participants. “People can then point to the error report that will be public and say, ‘They checked my work and it’s fine,’” Elson said.
Research That Pays Off
Cold, hard cash is another incentive. Participating authors will get a small fee of about 250 francs, or roughly $285, plus a bonus if no errors (or only minor ones) are found. Reviewers can earn a base fee of up to the equivalent of $1,135, plus more depending on what they find. If their work results in a recommended retraction, they’ll net an additional $2,835.
ERROR will start with three papers, including a 2020 article that identified a strategy to discourage online sharing of Covid-19 misinformation. Gordon Pennycook, the lead author and an associate professor of psychology at Cornell University, said he was happy to have it selected. Having started his Ph.D. in 2011, Pennycook trained during an era in which a “replication crisis” unraveled some of his field’s buzziest findings and highlighted the importance of reproducible scientific practices.
“If someone’s replicating your work, they’re basically putting in their own work, adding data and information to something you obviously care about,” he said. “You should actually be excited someone’s going to replicate it.”
Not every scientist approached will be as willing and open-minded. So far, Elson said, the authors of two nominated papers have turned down invitations, and two others are undecided.
The ERROR website acknowledges that the dynamic between reviewers and authors may get “adversarial.” But it insists that the process should ultimately be “a collaborative one in the service of improving our collective scientific knowledge and fostering a culture of error checking and error acceptance.”
Elisabeth Bik, a scientific-integrity consultant who specializes in detecting manipulated images, said she welcomed what ERROR was trying to do. “There are enormous amounts of money funding novel research, and almost nothing going towards reproducibility or quality control,” said Bik, who is not involved with the program, by email.
At the same time, she can see potential problems with the setup — for instance, if ERROR reviewers were direct competitors of the researchers whose work they’re critiquing.
Elson said that if authors have reason to believe a reviewer can’t be impartial, they can raise that concern with the organizers at any point in the process. In addition, while the reviewer may find problems, the recommender decides on their severity, which determines the size of the payout. “We will take the utmost care to monitor this,” he said by email.
Lawsuits are another possible concern, noted Bik, who has faced legal threats over her own sleuthing in the past. Elson said ERROR has no legal insurance, citing the complications of insuring an international project. All participants are doing this at their own risk, he said.
ERROR is, in other words, a big science experiment. Can it make error-finding less stigmatized, more standard? Will it inspire the rest of the enterprise to take a look under the hood?
“My goal,” Elson said, “is to take funding organizations, like the National Science Foundation in Switzerland, and tell them, ‘Look, if you take a tiny portion of your funding and pour this into it and do a random review of the projects you funded, that would go a long way.’”