Remaking the Grade, From A to D

James Yang for the Chronicle

September 14, 2009

Try the following experiment at your next faculty meeting. First ask, "What is the difference between those students who earn A's and B's and those students who earn D's and F's?"

You will hear a litany of responses including work ethic, organization, high-school preparation, and class attendance.

Next ask your colleagues to calculate the final grade for a student whose 10 assignments during the semester had received the following marks: C, C, MA (missing assignment), D, C, B, MA, MA, B, A. Then tally the distribution of the final grades your colleagues report.

I've done that experiment with more than 10,000 faculty members around the world and, every time, bar none, the final grades they report span F, D, C, B, and A. It turns out that the difference between the student who earns A's and B's and the one who earns D's and F's is not necessarily a matter of work ethic, organization, high-school preparation, or class attendance. The difference is the professor's grading policy.
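To see how one set of marks can yield every letter grade, here is a minimal Python sketch of the experiment. The point value assigned to each letter, the 90/80/70/60 cutoffs, and the five policies themselves are illustrative assumptions, not policies reported by any of the faculty members surveyed; other equally defensible choices would give other results.

```python
from statistics import mean

# Hypothetical point values for each letter mark (an assumption for illustration).
POINTS = {"A": 95, "B": 85, "C": 75, "D": 65}

# The ten marks from the experiment; "MA" is a missing assignment.
MARKS = ["C", "C", "MA", "D", "C", "B", "MA", "MA", "B", "A"]


def to_letter(score):
    """Map a 0-100 score to a letter using assumed 90/80/70/60 cutoffs."""
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if score >= cutoff:
            return letter
    return "F"


def final_grades(marks):
    """Return the final letter grade under five hypothetical grading policies."""
    with_zeros = [POINTS.get(m, 0) for m in marks]    # missing work scored 0
    with_floor = [POINTS.get(m, 50) for m in marks]   # missing work scored 50
    submitted = [POINTS[m] for m in marks if m != "MA"]
    policies = {
        "mean, missing = 0": mean(with_zeros),
        "mean, missing = 0, lowest dropped": mean(sorted(with_zeros)[1:]),
        "mean, missing = 50": mean(with_floor),
        "mean of last three submitted": mean(submitted[-3:]),
        "most recent submitted mark": submitted[-1],
    }
    return {name: to_letter(score) for name, score in policies.items()}


for policy, grade in final_grades(MARKS).items():
    print(f"{policy}: {grade}")
```

Under these assumptions the same ten marks produce an F, a D, a C, a B, and an A, in that order. Nothing about the student changes between the first line of output and the last; only the policy does.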

Now change the scene from the faculty meeting to a crisp fall day in the football stadium. As the afternoon shadows fall on the goal posts, a pass is thrown to a receiver who lunges for the ball and tumbles into the end zone. One official signals a touchdown, the second official signals an incomplete pass, and the third official scratches his head in bewilderment. Faculty members, students, alumni, and trustees rise as one, complaining bitterly of the unfairness and incompetence of athletics officials who seem unable to view the same student performance and make a consistent judgment.

Professorial prerogatives notwithstanding, we ought to have a standard for grading policies that at least rises to the basics we expect of officials on the athletics field: accuracy, fairness, and effectiveness. Professors are typically granted wide latitude to establish and enforce grading policies within certain boundaries. It is not acceptable for faculty members to make mathematical errors in grading or routinely award grades that reflect gender or racial bias. But many grading policies fall short of the three basic standards:

Accuracy. The first great assault on accuracy is the use of the zero on a 100-point scale. If the grade of A represents a score of 90-100, B is 80-89, C is 70-79, and D is 60-69, then the interval between each letter grade, A to B to C to D, is 10 points. But if a student fails to submit an assignment and receives a zero, then the interval from D to zero is 60 points, a sixfold penalty compared with the other grading intervals.
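The interval arithmetic can be checked in a few lines of Python. This sketch uses the lower bound of each grade band on the scale just described; the dictionary and function names are hypothetical, chosen only for illustration.

```python
# Lower bound of each grade band on the 100-point scale described above.
# "zero" stands for a missing assignment scored as 0.
LOWER_BOUNDS = {"A": 90, "B": 80, "C": 70, "D": 60, "zero": 0}


def intervals(bounds):
    """Return the point gap between each grade and the next one down."""
    names = list(bounds)
    return {
        f"{upper} to {lower}": bounds[upper] - bounds[lower]
        for upper, lower in zip(names, names[1:])
    }


gaps = intervals(LOWER_BOUNDS)
penalty_ratio = gaps["D to zero"] / gaps["C to D"]

for step, size in gaps.items():
    print(f"{step}: {size} points")
print(f"D-to-zero gap is {penalty_ratio:.0f} times the C-to-D gap")
```

Each step from A down to D costs 10 points, while the step from D to zero costs 60, which is the sixfold penalty.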

Let us stipulate that work receiving a D is wretched, and that the failure of a student to submit work at all is abysmal. The use of the zero, however, requires us to defend the proposition that abysmal is six times as bad as wretched. Students who fail to turn in work deserve a punishment that fits the crime; perhaps they should be required to do the work, suffer constraints on their free time, or be denied Facebook and Frisbee privileges. But should they lose an entire semester of credit, which can be the ultimate impact of receiving zeros for missing assignments, because of an irrational and mathematically incorrect grading policy? Even Dante's worst offenders were consigned to the ninth—not the 54th—circle of hell. Poets, it seems, understand interval data better than professors in the hard sciences do.

The use of the arithmetic mean, or average, to calculate final grades—often the consequence of computerized grading technologies—is another offense against accuracy in grading. I have reviewed math standards in more than 100 countries and noticed that most students understand early that the average is not necessarily the best way to represent a data set. They understand alternative representations, including the mode, median, and weighted averages, to name a few. They learn that politicians and marketers, among others, will use averages in taxes, employment, and income to mislead voters and consumers. But a decade later, as 19- and 20-year-olds, they are sitting in college classes in which grading policies worship at the altar of mathematical accuracy—engineering, statistics, French literature, educational psychology (the similarities in grading policies can be eerie)—and the use of the average is pervasive.

I've taught graduate statistics courses in which mathematicians are seated next to nurses, teachers, marketers, and biologists. My task was not to evaluate where they started but where they finished. For some of them, multivariate analysis was a recent memory, while for others, high-school algebra was a distant and painful one. The mathematicians soared at the start of class but were challenged a month or two later; their colleagues struggled to remember the basics of algebra at first, but reached their "Now I get it" moments during the final days of the class. They argued quite persuasively that the professor should not use the average of their scores to calculate their final grade, but rather should consider their proficiency at the end of the term. I worried, however, that the same graduate students and instructors who argued against the average in that class would return to their own students and, within a few hours, casually apply the average to calculate final grades.
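The students' argument is easy to illustrate. The sketch below, in Python, summarizes one invented set of scores four different ways. The eight scores are hypothetical, standing in for a student who struggles with algebra early and reaches the "Now I get it" moment late, and the recency weighting of 1, 2, ..., n is just one assumed way to count later work more heavily.

```python
from statistics import mean, median

# Hypothetical scores (0-100) on eight successive assessments for a student
# who struggles early and masters the material by the end of the term.
LATE_BLOOMER = [40, 45, 55, 60, 70, 80, 90, 95]


def summarize(scores):
    """Return several single-number summaries of one student's scores."""
    # Recency weights 1, 2, ..., n: an assumed scheme that counts later work more.
    weights = range(1, len(scores) + 1)
    weighted = sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    return {
        "mean": mean(scores),
        "median": median(scores),
        "recency-weighted mean": weighted,
        "final score": scores[-1],
    }


summary = summarize(LATE_BLOOMER)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

For these invented scores the mean is about 67 and the median is 65, while the recency-weighted mean is about 77 and the final score is 95. Which number best represents the student is a judgment about what a grade is for, not a question the arithmetic can settle.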

Fairness. While I would not automatically extrapolate my research findings to other settings, the results are sufficiently alarming to invite introspection. I have found that faculty members sometimes conflate quiet compliance with proficiency. That sends the message to students—female students in particular—that the path to success is acquiescence rather than achievement.

My observation comes first from a simple analysis of the membership of the National Honor Society. In the high school where I volunteer, the gender balance of the student body is equal. Yet the ratio of women to men in the National Honor Society in this high school is eight to one. I have checked hundreds of coeducational institutions since that observation and found that all of them show a female advantage in membership. A gender imbalance is also found in the college-matriculation rate of women to men: 58 percent to 42 percent, respectively. I've lost enough debates with women to stipulate that it may be true that they are smarter than men, but I doubt that they are eight times as smart. Some other factor is at work here, and it may be the societal value that elevates behavioral submission over academic performance.

I also analyzed the results of students who received A and B grades but failed external examinations in literacy and math. Those students were disproportionately female and self-identified as ethnic minorities. A cynic might label this the "bless her heart" effect, as in, "She really isn't very proficient, but bless her heart, she showed up every day, participated in class, and didn't give me any trouble."

That may not apply to your student body, but I would ask only that you find out if the dropout and failure rates of your students are equally distributed by gender and ethnicity. If not, it is at least possible that students were lured into the challenge of your institution by rewards for quiet compliance, and that they were then punished for not having the skills required for college-level work. Conversely, some minority male students may have never reached the front door of your institution because, as high-school students, they were not rewarded for academic proficiency but punished for abrasive behavior that was unrelated to academic performance.

The most perplexing part of unfair grading policies is that they are rarely intentional. I know of no college or school system that has an affirmative-action policy to secure more bigots and sexists on the faculty. On the contrary, the "bless her heart" effect (pronoun very deliberate) stems not from malice but from compassion.

Effectiveness. Finally, we should consider whether the impact of grading policies has led to improved student performance. A basic question that faculty members must ask is, "Were my students last semester more engaged, responsive, and successful than students in previous years?" If the answer is "yes," then present grading policies are fine. I am astonished, however, at the number of professors who complain loudly that students are disrespectful, inattentive, disengaged, and unresponsive—and yet who wish to pursue the same grading policies they have used for a decade or more.

Fortunately, the solution to the quandary of effective grading practices is close at hand. On the athletics field, I've never seen a coach with a grade book and red pencil, yet I have witnessed many a coach who provides feedback designed to improve performance. Similarly, I've noticed, while watching the conductor of the collegiate orchestra or chorale lead a rehearsal, how infrequently quizzes and tests are administered and how rarely grades are awarded. Instead, the conductor frequently provides feedback for the singular purpose of improving student performance.

The Class of 2013 grew up playing video games and received feedback that was immediate, specific, and brutal—they won or else died at the end of each game. For them, the purpose of feedback is not to calculate an average or score a final exam, but to inform them about how they can improve on their next attempt to rule the universe.

Imagine a class in any other subject, from science to classics, conducted in the same way. The students wail, "Does it count?" and the professor responds, "I'm just giving you feedback to improve your performance—try to do better next time." I have never heard students thank their Nintendo machine for its insightful feedback, but I have observed many of them respond more attentively to those machines than to their professors.

Now is the time to make modest but important improvements in grading policy. Without leave of administrators or permission from grading-system programmers, professors can stop the use of the zero. They can suspend the use of the average. They can override the deterministic mentality that drives so many grading systems and provide regular feedback designed to help students actually learn. They can, in brief, be accurate, fair, and effective. It is no more than our students demand of athletics contests and video games. As teachers, we should do no less.

Douglas B. Reeves is chairman of the Leadership and Learning Center, a consulting firm in Englewood, Colo., and author of more than 20 books on leadership and education.