The Review

Seeing Into the Future

Does Philip Tetlock hold the key to accurate predictions?

By Angela Chen | October 5, 2015
Stuart Bradford for The Chronicle Review

Every night for nearly a month last year, Reed Roberts woke up at 3 a.m. to check the newest real-time data on something not very many people think about: the melting of Arctic sea ice.

This unusual interest is part of what makes Roberts — then a Ph.D. candidate in organic chemistry at the University of Cambridge and now an analyst with the Economist Intelligence Unit — a superforecaster, or a nonexpert who can make extraordinarily accurate predictions about future political events. His existence and methodology have been a curiosity of sorts for researchers like Philip E. Tetlock, the psychologist who ran experiments proving that people like Roberts not only exist but can be trained to become even more accurate.


At first glance, it might seem surprising that Tetlock, a professor at the University of Pennsylvania, is the one to champion such ideas. He has been best known for telling the world, in his 2005 book Expert Political Judgment: How Good Is It? How Can We Know?, that the average expert is roughly as accurate as a dart-throwing chimp.

But for the past five years, he has been involved with and studying the Good Judgment Project, a team in an annual forecasting tournament funded by a government agency called Iarpa, for Intelligence Advanced Research Projects Activity. Led by Barbara Mellers, also a psychology professor at Penn, who is Tetlock’s wife, the researchers found that people are not quite as hopeless at prediction as initially thought. Tetlock himself has become an “optimistic skeptic.”

“The earlier work focused on cursing the darkness, and the Iarpa tournament focused on lighting candles,” says Tetlock, whose book on the tournament research, Superforecasting: The Art and Science of Prediction (Crown; written with a journalist, Dan Gardner), came out last month.

Tetlock has been interested in forecasting since the 1980s, he says during an interview at his home in Philadelphia. He’s soft-spoken, gestures frequently with his hands, and often talks in terms of trade-offs: rigor versus relevance when generating questions, or seeing belief in a “true/false” framework versus seeing it as a continuum that always needs to be updated. He’s also careful to separate what we do know from what we can only speculate.

Though Tetlock is cautious about overgeneralizing, his research has shown that, in certain conditions, people with no specific background knowledge can outperform specialists with access to classified information. People like Roberts, who says, “For me and for most ‘supers,’ the method of making predictions always starts the same way: The primary resource is Google.”

In 2010, Jason Matheny, director of Iarpa, invited Mellers and Tetlock to participate in the forecasting tournament, which began in 2011 and ran until earlier this year. It pitted five research teams, including Tetlock’s, against one another and against a control team. Tetlock’s team comprised members crowdsourced from around the world, willing to work hours each week for little more than an Amazon gift card.

All the teams made predictions about key events in the so-called Goldilocks zone of questions: both precise enough to answer and relevant to national security. For example, What will be the highest reported monthly average of Mexican oil exports to the United States between February 5, 2014, and April 1, 2014? Will Angela Merkel win the next election for chancellor of Germany?


The hope was that the teams could beat the combined “wisdom of the crowd” forecast of the control group by 20 percent in the first year and 50 percent in the fourth year. Tetlock’s team emerged the clear winner, beating Iarpa’s 50-percent goal in the first and subsequent years.
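How that benchmark is measured deserves a word. The tournament graded probability forecasts with an accuracy score (Superforecasting describes Brier scoring, a squared-error measure of how far a probability landed from what actually happened), and "beating the crowd by 50 percent" means improving on the crowd's average score by roughly half. The sketch below, with made-up numbers, treats the crowd forecast as a simple unweighted mean of individual probabilities; the names and data are illustrative, not the project's own code.

```python
# Minimal sketch: scoring probability forecasts with a Brier-style squared-error
# score and measuring improvement over a "wisdom of the crowd" baseline.
# Assumes binary questions; the crowd forecast here is the unweighted mean.

def brier(prob: float, outcome: int) -> float:
    """Squared error between a probability forecast and a 0/1 outcome."""
    return (prob - outcome) ** 2

# Hypothetical data: (individual probabilities, team probability, outcome).
questions = [
    ([0.6, 0.7, 0.4, 0.8], 0.85, 1),
    ([0.3, 0.2, 0.5, 0.4], 0.10, 0),
    ([0.5, 0.6, 0.5, 0.7], 0.75, 1),
]

crowd_score = sum(brier(sum(p) / len(p), y) for p, _, y in questions) / len(questions)
team_score = sum(brier(q, y) for _, q, y in questions) / len(questions)

# Lower scores are better; "beating the crowd by 50 percent" means roughly
# halving the crowd's average score.
improvement = (crowd_score - team_score) / crowd_score
print(f"crowd={crowd_score:.3f} team={team_score:.3f} improvement={improvement:.0%}")
```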


Within all the teams, researchers ran experiments — for example, pitting individuals against groups — to see which methods improved accuracy. The essential insight? Prediction accuracy is possible when people participate in a setup that rewards only accuracy — and not the novelty of the explanation, or loyalty to the party line, or the importance of keeping up your reputation. It is within this condition that the “supers,” the top 2 percent of each group, emerged.

Every prominent pundit who was invited to participate in the tournament declined. Put it this way, Tetlock says: “If you have an influential column and a Nobel Prize and big speaking engagements and you’re in Davos all the time — if you have all the status cards — why in God’s name would you ever agree to play in a tournament where the best possible outcome is to break even?”

Over the past decade, an unfortunate game of telephone has mangled the message of Expert Political Judgment, in which Tetlock argued that experts were, to put it mildly, not very accurate. He did not claim that the public and the experts were equally knowledgeable, says Bryan Caplan, an economist at George Mason University who studies expertise and has been influenced by Tetlock. “He very clearly said that he asked the experts harder questions, and ones that he wasn’t sure they were going to get right. So what he shows is that experts are overconfident and could do a lot better, but not that they’re useless.”

Accordingly, the optimism of the newer research doesn’t contradict the assertion that experts can’t see what will happen in 2025. In the Iarpa tournament, nearly all of the questions asked forecasters to look ahead one year or less. Previous work had asked experts to look three to five years into the future.


“The biggest limitation of the earlier work was that it really wasn’t designed to let people shine,” Tetlock says. “The Iarpa work gives people an opportunity to see how good they can become when they’re allowed to make short-term probability judgments and update those judgments in response to news, and when they’re not given hopelessly difficult tasks like what the state of the global economy is five years from now.”

Scholars have long viewed forecasting with skepticism. Since even the best theory can be thrown off by outlier events in the real world, Tetlock notes, theorists are understandably reluctant to put their intellectual reputations at risk in forecasting exercises. Forecasting tournaments, however, are not about testing particular social-science theories. They are about “testing the ingenuity of individual human beings in cobbling together good explanations as opportunistically and effectively as possible.”

Compared with other major figures in the field, Tetlock is more sanguine about the powers of prediction. He is less convinced than Daniel Kahneman — the psychologist known for his work with cognitive biases, who has collaborated with Tetlock — that training people to let go of faulty beliefs is a losing game. And while he agrees with risk analyst Nassim Nicholas Taleb that outliers, or “black swans,” can throw off a prediction, he still thinks that the pursuit of prediction, as opposed to “anti-fragilizing,” or making everything immune to collapse, is a worthy goal.

Just as not everyone can be a Mozart or an Einstein, probably not everyone can be a superforecaster. Superforecasters think in very granular, or detailed, ways. While a normal person making a forecast will adjust the likelihood of her prediction’s coming true up or down 20 percentage points with each new development, a superforecaster tends to work in the realm of single digits and decimal points. The intelligence community suggests that the average person can distinguish between three and seven levels of uncertainty (such as “not going to happen,” “maybe,” and “probably going to happen”). Superforecasters can distinguish far more.
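To make that granularity concrete, here is a toy illustration, not drawn from Tetlock's research: updating an estimate with Bayes' rule, where a weak piece of evidence shifts the probability by a few percentage points rather than a 20-point jump. The numbers and likelihood ratios are invented.

```python
# Toy illustration (invented numbers): Bayes'-rule updating. Modest evidence
# moves a granular forecaster's estimate by a few percentage points, not by
# the coarse 20-point jumps described above.

def bayes_update(prior: float, likelihood_ratio: float) -> float:
    """Posterior probability from prior odds multiplied by a likelihood ratio."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

p = 0.70                      # current estimate that an event will occur
p = bayes_update(p, 1.2)      # a weakly supportive news item -> about 0.74
p = bayes_update(p, 0.9)      # a mildly contrary report      -> about 0.72
print(round(p, 2))
```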


They are actively open-minded and score higher than average, though still far from genius territory, on measures of fluid intelligence: the ability to think logically and match patterns. They are highly numerate and enjoy wrestling with hard intellectual problems. It’s a group with specific skills, yes, and also a highly self-selecting one.

Take Ryan Adler, a budget analyst for the city government of Arvada, Colo. “At one point, I saw a question on parliamentary elections in Guinea-Bissau, and I realized that this is great, this will be the first opportunity I get to have reason to learn more about Guinea-Bissau,” he says. “It’s like my dream version of fantasy football.”

Adler achieved superforecaster status after his first season in the tournament. Others, like Jennifer L. Erickson, an assistant professor of political science at Boston College, became supers after multiple seasons, suggesting that accuracy really can be improved by learning the ropes.

One year, a 60-minute tutorial outlining basic concepts about cognitive biases improved participants’ accuracy by about 10 percent. One module reminded them to be aware of “duration neglect,” or the tendency not to take duration into account when making judgments. It suggested breaking down a three-month question into one-month increments.
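A rough sketch of that decomposition, under the simplifying assumption that the event is about equally likely, and independent, in each month; the figures are illustrative, not the module's own example.

```python
# Sketch of the decomposition the training module suggests: estimate the chance
# per month, then combine the months, instead of guessing the three-month
# probability directly. Assumes a roughly constant, independent monthly
# probability (an illustrative simplification).

def prob_within_months(monthly_prob: float, months: int) -> float:
    """Probability the event happens at least once within the given months."""
    return 1 - (1 - monthly_prob) ** months

# If you judge the event about 10 percent likely in any given month,
# the three-month probability is about 27 percent, not 10 or 30.
print(round(prob_within_months(0.10, 3), 2))   # 0.27
```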


Over the four years that Erickson participated, she learned that her best strategy included forecasting a little more aggressively than felt comfortable, since she tended to be underconfident, and staying away from questions when she “knew enough to know that the question was too messy to deal with.”

The setup of the tournament, which rewards only accuracy, was invaluable in teaching supers like Erickson to pick the best strategy. It also led to a finding that teams are more successful than individuals. Researchers divided the Good Judgment Project team into individuals competing alone and groups competing together. Teams were, on average, 23 percent more accurate in their predictions.

Tetlock attributes team success to “life skills” training on how to give constructive criticism, and to teams’ ability to become resources without producing groupthink. “To get those nasty groupthink effects, you have to have an opinionated leader, and here you didn’t have opinionated leaders because the only thing that mattered was the accuracy score,” he says. There was little else to prove, and no need to come up with detailed explanations, which often do more harm than good.

Thanks to the Good Judgment Project, the forecasting tournament has been solidified as one of the best modes of prediction, says Matheny, the Iarpa director. “Tetlock’s research really did help to inform how we should run forecasting tournaments,” he says. “Not just that we should run them, but that there’s best practices in how to run them, and especially in picking questions that are neither too easy nor too hard.”


Though Iarpa’s funding has ended, the team has formed a for-profit entity, Good Judgment Incorporated, which is recruiting members for a public tournament to begin later this fall. Corporate, nonprofit, government, and media clients can sponsor forecasting “challenges” on the public site, and the company will offer custom forecasts and training. It is also studying the potential of machine-human hybrids — like having IBM’s Deep Blue collaborate with Garry Kasparov in chess — that could prove more accurate than either one alone.

Now that we know some limitations, and strengths, of forecasters, Tetlock wants to focus on asking the right questions. He hopes to create what Kahneman has called “adversarial collaboration tournaments” — for instance, bringing together two politically opposed groups to discuss the Iran nuclear deal. One group thinks it’s great, one group thinks it’s terrible, and each must generate 10 questions that everyone will answer.

The idea is that each side will generate questions with answers that favor their position, and that, with everyone forced to consider all questions, a greater level of understanding will emerge. Maybe, in time, this will become the new norm for punditry, public debate, and policymaking.

The ultimate goals? Intellectual honesty. Better predictions. And, says Tetlock, “I hope we can avoid mistakes of the Iraq-war magnitude.”

A version of this article appeared in the October 9, 2015, issue.