The dissertation prospectus began by quoting a statistic -- a “grabber” meant to capture the reader’s attention. The graduate student who wrote this prospectus undoubtedly wanted to seem scholarly to the professors who would read it; they would be supervising the proposed research. And what could be more scholarly than a nice, authoritative statistic, quoted from a professional journal in the student’s field?
So the prospectus began with this (carefully footnoted) quotation: “Every year since 1950, the number of American children gunned down has doubled.” I had been invited to serve on the student’s dissertation committee. When I read the quotation, I assumed the student had made an error in copying it. I went to the library and looked up the article the student had cited. There, in the journal’s 1995 volume, was exactly the same sentence: “Every year since 1950, the number of American children gunned down has doubled.”
This quotation is my nomination for a dubious distinction: I think it may be the worst -- that is, the most inaccurate -- social statistic ever.
What makes this statistic so bad? Just for the sake of argument, let’s assume that “the number of American children gunned down” in 1950 was one. If the number doubled each year, there must have been two children gunned down in 1951, four in 1952, eight in 1953, and so on. By 1960, the number would have been 1,024. By 1965, it would have been 32,768 (in 1965, the F.B.I. identified only 9,960 criminal homicides in the entire country, including adult as well as child victims). By 1970, the number would have passed one million; by 1980, one billion (more than four times the total U.S. population in that year). Only three years later, in 1983, the number of American children gunned down would have been 8.6 billion (nearly twice the earth’s population at the time). Another milestone would have been passed in 1987, when the number of gunned-down American children (137 billion) would have surpassed the best estimates for the total human population throughout history (110 billion). By 1995, when the article was published, the annual number of victims would have been over 35 trillion -- a really big number, of a magnitude you rarely encounter outside economics or astronomy.
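The arithmetic in this thought experiment is easy to check. Here is a minimal Python sketch that keeps the same for-the-sake-of-argument assumption of one victim in 1950; the function name `victims_if_doubling` is made up for illustration and comes from no source.

```python
def victims_if_doubling(year, base_year=1950, base_count=1):
    """Annual count implied by doubling every year from base_year.

    base_count=1 is the essay's hypothetical starting point, not real data.
    """
    return base_count * 2 ** (year - base_year)


# The milestone years mentioned in the paragraph above.
for year in (1960, 1965, 1970, 1980, 1983, 1987, 1995):
    print(year, f"{victims_if_doubling(year):,}")
```

Run as written, the loop prints 1,024 for 1960, 32,768 for 1965, and 35,184,372,088,832 for 1995, which is the "over 35 trillion" figure.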
Thus my nomination: estimating the number of American child gunshot victims in 1995 at 35 trillion must be as far off -- as hilariously, wildly wrong -- as a social statistic can be. (If anyone spots a more inaccurate social statistic, I’d love to hear about it.)
Where did the article’s author get this statistic? I wrote the author, who responded that the statistic came from the Children’s Defense Fund, a well-known advocacy group for children. The C.D.F.’s The State of America’s Children Yearbook 1994 does state: “The number of American children killed each year by guns has doubled since 1950.” Note the difference in the wording -- the C.D.F. claimed there were twice as many deaths in 1994 as in 1950; the article’s author reworded that claim and created a very different meaning.
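The gap between the two wordings can be put in numbers. The sketch below is illustrative only: it assumes, purely for comparison, that the C.D.F.'s doubling was spread evenly over the 44 years as steady compound growth, which the yearbook does not claim. The helper `annual_growth_rate` is a hypothetical name.

```python
def annual_growth_rate(total_ratio, years):
    """Constant yearly growth rate that compounds to total_ratio over `years`."""
    return total_ratio ** (1 / years) - 1


years = 1994 - 1950

# C.D.F. wording: the count doubled once over the whole period.
cdf_rate = annual_growth_rate(2, years)

# Reworded claim: the count doubled every single year.
mutant_total = 2 ** years

print(f"C.D.F. wording: about {cdf_rate:.1%} growth per year")
print(f"Reworded claim: a {mutant_total:,}-fold increase by 1994")
```

Under that even-growth assumption, the C.D.F.'s claim works out to roughly 1.6 percent a year, while the reworded claim implies 100 percent a year, every year.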
It is worth examining the history of this statistic. It began with the C.D.F. noting that child gunshot deaths had doubled from 1950 to 1994. This is not quite as dramatic an increase as it might seem. Remember that the U.S. population also rose throughout this period; in fact, it grew about 73 percent -- not doubling, but a large share of the way there. Therefore, we might expect all sorts of things -- including the number of child gunshot deaths -- to increase substantially, by something like three-quarters, just because the population grew. Before we can decide whether twice as many deaths indicates that things are getting worse, we’d have to know more. The C.D.F. statistic raises other issues as well: Where did the statistic come from? Who counts child gunshot deaths, and how? What is meant by a “child” (some C.D.F. statistics about violence include everyone under age 25)? What is meant by “killed by guns” (gunshot-death statistics often include suicides and accidents, as well as homicides)? But people rarely ask questions of this sort when they encounter statistics. Most of the time, most people simply accept statistics without question.
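To see how much of the doubling population growth alone might account for, one can divide the growth in deaths by the growth in population. This is a rough sketch, not a demographic analysis: it assumes the number of children grew at the same 73 percent pace as the total population, which the figures above do not establish, and `per_capita_change` is a made-up helper name.

```python
def per_capita_change(count_ratio, population_growth):
    """Ratio of the later per-capita rate to the earlier one.

    count_ratio: later count divided by earlier count (2.0 for a doubling).
    population_growth: fractional population increase (0.73 for 73 percent).
    """
    return count_ratio / (1 + population_growth)


ratio = per_capita_change(count_ratio=2.0, population_growth=0.73)
print(f"per-capita rate multiplied by {ratio:.2f}")
```

On those assumptions the per-capita rate rises by about 16 percent rather than 100 percent. That is still an increase, but a far less dramatic one, and it leaves every question about definitions and counting untouched.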
Certainly, the article’s author didn’t ask many probing, critical questions about the C.D.F.’s claim. Impressed by the statistic, the author repeated it -- well, meant to repeat it. Instead, by rewording the C.D.F.’s claim, the author created a mutant statistic, one garbled almost beyond recognition.
But people treat mutant statistics just as they do other statistics -- that is, they usually accept even the most implausible claims without question. For example, the journal editor who accepted the author’s article for publication did not bother to consider the implications of child victims doubling each year. And people repeat bad statistics: The graduate student copied the garbled statistic and inserted it into the dissertation prospectus. Who knows whether still other readers were impressed by the author’s statistic and remembered it or repeated it? The article remains on the shelf in hundreds of libraries, available to anyone who needs a dramatic quote. The lesson should be clear: Bad statistics live on; they take on lives of their own.
Some statistics are born bad -- they aren’t much good from the start, because they are based on nothing more than guesses or dubious data. Other statistics mutate; they become bad after being mangled (as in the case of the author’s creative rewording). Either way, bad statistics are potentially important: They can be used to stir up public outrage or fear; they can distort our understanding of our world; and they can lead us to make poor policy choices.
The notion that we need to watch out for bad statistics isn’t new. We’ve all heard people say, “You can prove anything with statistics.” The title of my book, Damned Lies and Statistics, comes from a famous aphorism (usually attributed to Mark Twain or Benjamin Disraeli): “There are three kinds of lies: lies, damned lies, and statistics.” There is even a useful little book, still in print after more than 40 years, called How to Lie With Statistics.
Statistics, then, have a bad reputation. We suspect that statistics may be wrong, that people who use statistics may be “lying” -- trying to manipulate us by using numbers to somehow distort the truth. Yet, at the same time, we need statistics; we depend upon them to summarize and clarify the nature of our complex society. This is particularly true when we talk about social problems. Debates about social problems routinely raise questions that demand statistical answers: Is the problem widespread? How many people -- and which people -- does it affect? Is it getting worse? What does it cost society? What will it cost to deal with it? Convincing answers to such questions demand evidence, and that usually means numbers, measurements, statistics.
But can’t you prove anything with statistics? It depends on what “prove” means. If we want to know, say, how many children are “gunned down” each year, we can’t simply guess -- pluck a number from thin air: 100, 1,000, 10,000, 35 trillion, whatever. Obviously, there’s no reason to consider an arbitrary guess “proof” of anything. However, it might be possible for someone -- using records kept by police departments or hospital emergency rooms or coroners -- to keep track of children who have been shot; compiling careful, complete records might give us a fairly accurate idea of the number of gunned-down children. If that number seems accurate enough, we might consider it very strong evidence -- or proof.
The solution to the problem of bad statistics is not to ignore all statistics, or to assume that every number is false. Some statistics are bad, but others are pretty good, and we need statistics -- good statistics -- to talk sensibly about social problems. The solution, then, is not to give up on statistics, but to become better judges of the numbers we encounter. We need to think critically about statistics -- at least critically enough to suspect that the number of children gunned down hasn’t been doubling each year since 1950.
A few years ago, the mathematician John Allen Paulos wrote Innumeracy, a short, readable book about “mathematical illiteracy.” Too few people, he argued, are comfortable with basic mathematical principles, and this makes them poor judges of the numbers they encounter. No doubt this is one reason we have so many bad statistics. But there are other reasons, as well.
Social statistics describe society, but they are also products of our social arrangements. The people who bring social statistics to our attention have reasons for doing so; they inevitably want something, just as reporters and the other media figures who repeat and publicize statistics have their own goals. Statistics are tools, used for particular purposes. Thinking critically about statistics requires understanding their place in society.
While we may be more suspicious of statistics presented by people with whom we disagree -- people who favor different political parties or have different beliefs -- bad statistics are used to promote all sorts of causes. Bad statistics come from conservatives on the political right and liberals on the left, from wealthy corporations and powerful government agencies, and from advocates of the poor and the powerless.
In order to interpret statistics, we need more than a checklist of common errors. We need a general approach, an orientation, a mind-set that we can use to think about new statistics that we encounter. We ought to approach statistics thoughtfully. This can be hard to do, precisely because so many people in our society treat statistics as fetishes. We might call this the mind-set of the Awestruck -- the people who don’t think critically, who act as though statistics have magical powers. The awestruck know they don’t always understand the statistics they hear, but this doesn’t bother them. After all, who can expect to understand magical numbers? The reverential fatalism of the awestruck is not thoughtful -- it is a way of avoiding thought. We need a different approach.
One choice is to approach statistics critically. Being critical does not mean being negative or hostile -- it is not cynicism. The critical approach statistics thoughtfully; they avoid the extremes of both naive acceptance and cynical rejection of the numbers they encounter. Instead, the critical attempt to evaluate numbers, to distinguish between good statistics and bad statistics.
The critical understand that, while some social statistics may be pretty good, they are never perfect. Every statistic is a way of summarizing complex information into relatively simple numbers. Inevitably, some information, some of the complexity, is lost whenever we use statistics. The critical recognize that this is an inevitable limitation of statistics. Moreover, they realize that every statistic is the product of choices -- the choice between defining a category broadly or narrowly, the choice of one measurement over another, the choice of a sample. People choose definitions, measurements, and samples for all sorts of reasons: Perhaps they want to emphasize some aspect of a problem; perhaps it is easier or cheaper to gather data in a particular way -- many considerations can come into play. Every statistic is a compromise among choices. This means that every definition -- and every measurement and every sample -- probably has limitations and can be criticized.
Being critical means more than simply pointing to the flaws in a statistic. Again, every statistic has flaws. The issue is whether a particular statistic’s flaws are severe enough to damage its usefulness. Is the definition so broad that it encompasses too many false positives (or so narrow that it produces too many false negatives by excluding genuine cases)? How would changing the definition alter the statistic? Similarly, how do the choices of measurements and samples affect the statistic? What would happen if different measures or samples were chosen? And how is the statistic used? Is it being interpreted appropriately, or has its meaning been mangled to create a mutant statistic? Are the comparisons that are being made appropriate, or are apples being confused with oranges? How do different choices produce the conflicting numbers found in stat wars? These are the sorts of questions the critical ask.
As a practical matter, it is virtually impossible for citizens in contemporary society to avoid statistics about social problems. Statistics arise in all sorts of ways, and in almost every case the people promoting statistics want to persuade us. Activists use statistics to convince us that social problems are serious and deserve our attention and concern. Charities use statistics to encourage donations. Politicians use statistics to persuade us that they understand society’s problems and that they deserve our support. The media use statistics to make their reporting more dramatic, more convincing, more compelling. Corporations use statistics to promote and improve their products. Researchers use statistics to document their findings and support their conclusions. Those with whom we agree use statistics to reassure us that we’re on the right side, while our opponents use statistics to try to convince us that we are wrong. Statistics are one of the standard types of evidence used by people in our society.
It is not possible simply to ignore statistics, to pretend they don’t exist. That sort of head-in-the-sand approach would be too costly. Without statistics, we limit our ability to think carefully about our society; without statistics, we have no accurate ways of judging how big a problem may be, whether it is getting worse, or how well the policies designed to address that problem actually work. And awestruck or naive attitudes toward statistics are no better than ignoring statistics; statistics have no magical properties, and it is foolish to assume that all statistics are equally valid. Nor is a cynical approach the answer; statistics are too widespread and too useful to be automatically discounted.
It would be nice to have a checklist, a set of items we could consider in evaluating any statistic. The list might detail potential problems with definitions, measurements, sampling, mutation, and so on. These are, in fact, common sorts of flaws found in many statistics, but they should not be considered a formal, complete checklist. It is probably impossible to produce a complete list of statistical flaws -- no matter how long the list, there will be other possible problems that could affect statistics.
The goal is not to memorize a list, but to develop a thoughtful approach. Becoming critical about statistics requires being prepared to ask questions about numbers. When encountering a new statistic in, say, a news report, the critical try to assess it. What might be the sources for this number? How could one go about producing the figure? Who produced the number, and what interests might they have? What are the different ways key terms might have been defined, and which definitions have been chosen? How might the phenomena be measured, and which measurement choices have been made? What sort of sample was gathered, and how might that sample affect the result? Is the statistic being properly interpreted? Are comparisons being made, and if so, are the comparisons appropriate? Are there competing statistics? If so, what stakes do the opponents have in the issue, and how are those stakes likely to affect their use of statistics? And is it possible to figure out why the statistics seem to disagree, what the differences are in the ways the competing sides are using figures?
At first, this list of questions may seem overwhelming. How can an ordinary person -- someone who reads a statistic in a magazine article or hears it on a news broadcast -- determine the answers to such questions? Certainly news reports rarely give detailed information on the processes by which statistics are created. And few of us have time to drop everything and investigate the background of some new number we encounter. Being critical, it seems, involves an impossible amount of work.
In practice, however, the critical need not investigate the origin of every statistic. Rather, being critical means appreciating the inevitable limitations that affect all statistics, rather than being awestruck in the presence of numbers. It means not being too credulous, not accepting every statistic at face value. But it also means appreciating that statistics, while always imperfect, can be useful. Instead of automatically discounting every statistic, the critical reserve judgment. When confronted with an interesting number, they may try to learn more, to evaluate, to weigh the figure’s strengths and weaknesses.
Of course, this critical approach need not -- and should not -- be limited to statistics. It ought to apply to all the evidence we encounter when we scan a news report, or listen to a speech -- whenever we learn about social problems. Claims about social problems often feature dramatic, compelling examples; the critical might ask whether an example is likely to be a typical case or an extreme, exceptional instance. Claims about social problems often include quotations from different sources, and the critical might wonder why those sources have spoken and why they have been quoted: Do they have particular expertise? Do they stand to benefit if they influence others? Claims about social problems usually involve arguments about the problem’s causes and potential solutions. The critical might ask whether these arguments are convincing. Are they logical? Does the proposed solution seem feasible and appropriate? And so on. Being critical -- adopting a skeptical, analytical stance when confronted with claims -- is an approach that goes far beyond simply dealing with statistics.
Statistics are not magical. Nor are they always true -- or always false. Nor need they be incomprehensible. Adopting a critical approach offers an effective way of responding to the numbers we are sure to encounter. Being critical requires more thought, but failing to adopt a critical mind-set makes us powerless to evaluate what others tell us. When we fail to think critically, the statistics we hear might just as well be magical.
Joel Best is a professor of sociology and criminal justice at the University of Delaware. This essay is excerpted from Damned Lies and Statistics: Untangling Numbers From the Media, Politicians, and Activists, just published by the University of California Press and reprinted by permission. Copyright © 2001 by the Regents of the University of California.
http://chronicle.com Section: The Chronicle Review Page: B7