Researchers Rate RateMyProfessors and Find It Useful, if Not Chili-Pepper Hot

November 18, 2011

The Web site RateMyProfessors evokes skepticism among faculty members. Some view the anonymous evaluation site as a haven for rants and odd remarks ("He will crush you like an academic ninja!"), or a place where students go to grade instructors based on easiness or attractiveness (a chili-pepper icon distinguishes professors who are "hot" from those who are "not").

But new research out of the University of Wisconsin at Eau Claire suggests the popular service is a more useful barometer of instructor quality than you might think, at least in the aggregate. And the study, the latest of several indicating RateMyProfessors should not be dismissed, raises questions about how universities should deal with a site whose ratings have been factored into Forbes magazine's college rankings and apparently even into some universities' personnel evaluations.

"There is the possibility that people may feel legitimized to use the information in potentially dangerous ways," says April Bleske-Rechek, an associate professor of psychology at Eau Claire, who is a co-author of the new study. They might, for example, give too much weight to comments on the site in deciding whether to hire someone or grant the person tenure.

RateMyProfessors, which debuted in 1999 and boasts over 10 million student-produced comments and ratings, calculates an instructor's quality by averaging how the site's users score the professor in two categories, "helpfulness" and "clarity."

Ms. Bleske-Rechek and her co-author, Amber Fritsch, a student at Eau Claire, described their study in "Student Consensus on RateMyProfessors.com," a paper published this month in the journal Practical Assessment, Research & Evaluation.

In their study, they probed the reliability of the site's ratings by focusing on the level of consensus among students for 366 instructors at their state university, each of whom had at least 10 evaluations.

The idea is that, if students rate professors based on idiosyncratic personal reactions—to a rude comment made in class, say—then it should take a lot of posts to reach a consensus. By contrast, if students are consistent in their ratings, then a consensus should emerge with a small number of evaluations.

Earlier studies of traditional paper-and-pencil evaluations have documented significant consensus with as few as 25 raters, Ms. Bleske-Rechek says. That's one rationale for using online evaluations: people argue that you will get the same distribution of responses even if not everyone fills out the form.

So how did RateMyProfessors stack up?

Ms. Bleske-Rechek found professors with 10 evaluations displayed "the same degree of consensus in their quality ratings" as those with 50.

"Degree of student consensus about an instructor occurs very early on, in terms of how many raters there are," she said in an interview. "This is similar to what you see on traditional student evaluations of instruction. In other words, it seems like students are homing in on the same experiences in the classroom, because otherwise they wouldn't be showing consensus."

That suggests faculty members with at least 10 ratings "may be able to extract crude judgments" of how students perceive their "clarity and helpfulness," Ms. Bleske-Rechek and Ms. Fritsch write in their paper. 

Ms. Bleske-Rechek got curious about trends on the site in part because of interactions with students. Initially, she figured RateMyProfessors attracted "spiteful" students or those who wanted to share stories about sliding by. But when she asked her own students whether they had posted, several said they had. What's more, those students were the ones she most respected.

In 2010, Ms. Bleske-Rechek published a paper that punctured some assumptions about the site's users. The article, co-written by Kelsey Michels, found that students who post do not differ in relevant ways from those who don't: they are the same in grades, attitudes toward their education, and learning goals.

"They don't seem to be heated people," Ms. Bleske-Rechek says. "Or the easy-A people—they aren't the only ones posting."

Ms. Bleske-Rechek isn't the first researcher to mine this territory. An earlier paper from New York's Marist College, "'He Will Crush You Like an Academic Ninja!': Exploring Teacher Ratings on RateMyProfessors.com," concluded that the site's evaluations "closely matched students' real-life concerns about the quality of instruction in the classroom."

The paper added, "While issues such as personality and appearance did enter into the postings, these were secondary motivators compared to more salient issues such as competence, knowledge, clarity, and helpfulness."

And another study, conducted by researchers at the University of Maine, found strong correlations between ratings on RateMyProfessors and formal in-class evaluations.

Other researchers have blasted RateMyProfessors, however. In a 2009 paper, Elizabeth Davison and Jammie Price, of Appalachian State University, in North Carolina, faulted the site's category system for fostering an "anti-intellectual tone that manifests itself in comments about instructors' personality, easiness of workload and entertainment value rather than knowledge attained."

"My biggest validity issue with the site is that Overall Score is being perceived as 'teaching effectiveness' and yet is only based on perceptions of helpfulness and clarity," Ms. Davison says in an e-mail. "I believe teaching effectiveness is more complex and should include more-robust measures such as how much did a student learn, preparedness of the instructor, or the challenging nature of the material."