Software Catches (and Also Helps) Young Plagiarists

Sarah Weeden for The Chronicle

Paulette Swartzfager reviews papers in her writing class at the Rochester Institute of Technology. Both she and her students use Turnitin software to screen their work.
November 06, 2011

The spread of technology designed to combat academic cheating has created a set of tricky challenges, and sometimes unexpected fallout, for faculty members determined to weed out plagiarism in their classrooms.

In the latest development, the company that sells colleges access to Turnitin, a popular plagiarism-detection program that checks uploaded papers against various databases to pinpoint unoriginal content, now also caters directly to students with a newer tool called WriteCheck, which lets users scan papers for plagiarism before handing them in.

Meanwhile, faculty members at some colleges are adopting a reverse image-search program called TinEye, which lets them investigate plagiarism in visual materials like photos and architectural designs.

Cheating is nothing new. But as the frontiers of academic policing continue to advance (some 2,500 colleges now use Turnitin), faculty members are being pushed to confront classroom conundrums: Should they scan all papers for plagiarism, and risk poisoning the classroom atmosphere? Should they check only suspicious texts, and preserve harmony at the risk of missing clever cheaters? Could Turnitin and technologies like it lead to more plagiarism, since professors might depend on their imperfect results rather than vigorously investigate suspicious material on their own?

One expert on plagiarism, Rebecca Moore Howard, worries that the widespread adoption of antiplagiarism programs is putting professors in the role of police officers. "When used as a default, they also set up a default climate of suspicion in the classroom," says Ms. Howard, a professor of writing and rhetoric at Syracuse University.

Gauging the extent of plagiarism is difficult. Roughly 62 percent of undergraduates and 40 percent of graduate students admit to having cheated on written work, according to the latest figures from a long-running national survey by Clemson University's International Center for Academic Integrity. The infractions range from cut-and-paste copying to buying a custom-written paper from an essay mill. Despite the perception that cheating has gotten out of hand with so much online content available to copy, the numbers have not changed much over the two decades that the survey has been conducted.

What has changed is how much easier it is to find plagiarism. Once, instructors who suspected cheating had to trek off to the library and hope they could track down the book a student had plagiarized from, recalls Susan D. Blum, a professor of anthropology at the University of Notre Dame. Now they can start the investigation with software that resembles a spell-checker. Some colleges even mandate that all written assignments be subjected to a digital pat-down. The growth of online education also favors such scanning, since papers must be submitted electronically anyway.

The results of all this digital sleuthing can be devastating, a fact that was driven home in a widely discussed blog post that drew fresh attention to the issue in recent months. In an essay headlined "Why I will never pursue cheating again," Panagiotis Ipeirotis described what happened after he started using Turnitin in his "Introduction to Information Technology" class last fall at New York University. By the end of the semester, 22 out of the 108 students had admitted cheating.

That statistic made headlines, but the post was just as compelling for what it showed about the painful consequences of Turnitin. Anxious students, contentious discussions, time-sucking investigations—all of it made for an unpleasant class, and, ultimately, lower course evaluations for Mr. Ipeirotis, a computer scientist who teaches in NYU's Stern School of Business.

Provoked by the professor's story, other faculty members have since engaged in a lively conversation about how to avoid his fate. Their suggestions run the gamut: Force students to write all drafts in class, one says. Adopt a zero-tolerance, flunk-'em attitude, says another. Hold oral exams. Shift to assignments that rely on libraries and printed texts rather than online materials. Build creative projects that require students to turn in interview recordings or blend personal narratives into their research. Remove the temptation to cheat at the last minute by breaking up assignments into multiple pieces with discrete due dates.

Student Use of Software

Another strategy is to let students use Turnitin on their own drafts. That's the approach taken by Paulette Swartzfager, a lecturer at the Rochester Institute of Technology who has taught freshman writing for 40 years. In her case, the idea came from an unlikely source: a student.

Ms. Swartzfager was discussing inadequate paraphrasing last year when the student asked if she had used Turnitin. She replied that she had not; like other Turnitin resisters, she wanted to teach, not police. But the student explained that Turnitin can be set to allow students to read its reports. The lecturer gave it a try, and now all of her students screen their work as a default.

"What's happened as a result of this has just been wonderful," Ms. Swartzfager says. "They use it as a tool. They keep resubmitting it and working on it until it gets appropriately in their own words, or in quotations, or cited." The students, she adds, are "not nearly so nervous."

Some students, like Hassan Alsaffar, even use the access that Ms. Swartzfager gives them to scan papers for classes taught by professors who don't use Turnitin. "Because English is my second language, I'm afraid that I may take some words or sentences" from sources and use them in a way that "could be plagiarism," says Mr. Alsaffar, a soft-spoken junior from Saudi Arabia.

He tries to paraphrase but sometimes doesn't do it well enough. Turnitin shows him where. "I would like to have it in almost every class," he says.

But many professors don't use the software, or don't allow students to screen drafts. So iParadigms, the company that makes Turnitin, rolled out a new product geared directly to students: WriteCheck. For a base price of $7 per paper, students can vet their work against the same databases professors use with Turnitin. (Those repositories comprise 14 billion Web pages, 150 million student papers, and about 110 million periodical articles.)

WriteCheck has attracted 250,000 student users since its debut, in 2008, according to the company. But recently it has aroused fear and outright hostility from some professors and experts on cheating.

For Teresa A. Fishman, director of Clemson's academic-integrity center, the software signals the escalation of a technology battle that has parallels in law enforcement.

As a police officer in Missouri in the 1980s, Ms. Fishman would watch her agency buy the latest speed gun. Then a new radar detector would come out to help drivers evade it. Sometimes the same company made both—just as iParadigms makes both WriteCheck and Turnitin.

"In that case, it turns out to work pretty well, because what the police want is for the people to slow down anyway," she says. "But in our case, we're trying to teach people something, and we don't want them to be able to avoid learning the lesson."

Ms. Fishman worries that students might simply scramble the words in spots that trip WriteCheck, rather than use it to flag where they neglected to cite something.

David E. Harrington, an economics professor at Kenyon College, blasted the product in a September blog post. He argues that WriteCheck undermines professors' ability to use Turnitin as a deterrent against plagiarists. That's especially true in large classes like introductory economics, he says, where an essay assignment is likely to involve a generic topic—consumer reaction to gas prices, say—that has been written about in papers available online.

Students can find an obscure article, steal it, submit it to WriteCheck, and see if it creates a high similarity score. If it doesn't, they can hand it in. "Without much effort, you can find out whether something that you're going to plagiarize heavily is in the Turnitin database," Mr. Harrington says.

The company defends its product by pointing to various features aimed at thwarting students who hope to game the system. WriteCheck limits the number of resubmissions, so you can't keep running a document through to figure out how to write a plagiarized paper that won't get picked up by Turnitin, says Chris Harrick, vice president for marketing at iParadigms. Also, while the software highlights matched content, it doesn't reveal the sources of that material.

"We built WriteCheck in a way that would make it more onerous for a student with improper intent to game Turnitin than it would be to just write an original paper," Mr. Harrick says.

By his account, WriteCheck users tend to be graduate students checking work for mistakes like accidental plagiarism, "Type-A personalities" who want to ensure papers are clean, and non-native users of English.

A Writing Problem

Some researchers point to better teaching, not technology, as the key to preventing plagiarism.

Ms. Howard is a leader of the Citation Project, a national study of 174 student papers from 16 colleges and universities. The project, conducted in collaboration with Sandra Jamieson, chair of the English department at Drew University, brings data to claims about student writing by studying how students use the texts they cite. What it has uncovered so far is "not happy news," she says.

Students are "dragging sentences out of random, simplistic sources and pasting them together in an often incomprehensible pastiche" of sentences, Ms. Howard said in presenting the data at a conference this year.

"How much plagiarism goes away if students actually know how to read and write from sources?" she asks The Chronicle. "My guess is: a lot."

At the conference, Ms. Howard elicited gasps from her audience when she showed one student paper from the project. The eight-page paper, about genetically modified foods, drew nearly half of its source material from the first page of a three-page WebMD article. Basically, the student took that text and broke it up with nine citations and a few original phrases.

The example illustrates broader trends. Students use books and journals, and they generally know how to cite them. But what they cite tends to come from the first page of a source, the project found. They pull "killer quotes" rather than engage with the overall argument. Almost half the time, they cite sources four pages or fewer in length.

On the basis of that research, Ms. Howard calls for a "fundamental shift" in how writing is taught. Professors should focus more on starting the research process collaboratively with students, she says. They should select a few complex sources and explore them with the whole class.

"What that means is not rushing students quite so quickly in their first semester in college into writing a 25-page research paper written from 15 sources," she says, "but rather taking them through the process of engaging with those sources first."

Others echo her emphasis on pedagogy and communication. Ryan Cordell, an assistant professor of English at St. Norbert College, in Wisconsin, explains citation in terms that students understand. You wouldn't steal somebody's post on Twitter, he tells them. Instead you mark it with "RT," for retweet. Same with Facebook: "If you get something cool from someone, you tag them."

"Most students get why you would do that," says Mr. Cordell, who directs his college's writing program and contributes to The Chronicle's ProfHacker blog. And, translating to the world of scholarship, Mr. Cordell tells his students that academics cite one another for the same reasons: "He had a cool idea. It's his idea. I'm reusing it. And so I need to let people know that I got it from him."

And while some professors might gripe about the bottomless well of ready-to-plagiarize online content, others see the Internet as a way to design assignments that mitigate plagiarism.

In a Twitter exchange following the NYU cheating episode, Ira Socol, a graduate student in teacher education at Michigan State University, talked about having students do all their work publicly on the Web. That way, he wrote, plagiarism "tends to get revealed 'naturally.'"

Michael Wesch, an associate professor of cultural anthropology at Kansas State University, calls the Internet the "largest collaboration machine ever created." His students have used that asset to analyze YouTube culture, probe the role of anonymity online, and study the culture of higher education.

"The key is to create a classroom environment where students feel like they are part of a real and relevant project that they care about," Mr. Wesch says in an e-mail to The Chronicle. "Students borrow ideas from each other and from the vast pool of scholarship all along the way, and they 'cite it,' not because it is required or appropriate, but because it helps us all find the resource and find out more. It is only through such an authentic exploration that students can really see why there are rules against plagiarism at all."

Students Fail to Read Sources Deeply

The Citation Project, a national study of 174 student papers from 16 colleges and universities, is examining how students use sources in their research papers. Here are some highlights of the preliminary findings, released this year:

Students rarely cite material located very far into sources:

46% of all of the citations that students made are to the first page of the source, and 23% are to the second page.

77% of all of the citations are to the first three pages of the source, regardless of whether the source is three pages or more than 400 pages long.

9% of the citations are to Page 8 of a source or beyond.

Sources are misused in one of five citations, and citations almost always draw on very short passages:

Of the 1,911 student uses of sources that the project coded, 4% are copied and cited but not marked as quotations from a source; 42% are copied and marked as quotations; 16% are "patchwritten," defined as "restating a phrase, clause, or one or more sentences while staying close to the language or syntax of the source"; 32% are paraphrased; and 6% are summarized.

20% of the source uses represent a misuse of materials, with students failing to mark them as quotations or patchwriting.

96% of the source uses show students working with two or fewer sentences from the text rather than engaging with a sustained passage in the source.
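
The 20% misuse figure above follows from adding two of the reported categories: uses copied without quotation marks (4%) and patchwritten uses (16%). A minimal sketch of that arithmetic, assuming the published percentages are rounded (so any counts derived from them are approximate) and using hypothetical names chosen only for illustration:

```python
# Shares of the 1,911 coded source uses, as reported in the sidebar.
# The dictionary keys and function names are illustrative, not the project's own.
TOTAL_USES = 1911

shares = {
    "copied_unmarked": 4,   # copied and cited, but not marked as a quotation
    "copied_quoted": 42,    # copied and marked as a quotation
    "patchwritten": 16,
    "paraphrased": 32,
    "summarized": 6,
}


def misuse_share(shares):
    """Misuse = unmarked copying plus patchwriting, in percentage points."""
    return shares["copied_unmarked"] + shares["patchwritten"]


def approx_count(percent, total=TOTAL_USES):
    """Approximate number of uses behind a rounded percentage."""
    return round(total * percent / 100)


if __name__ == "__main__":
    print("categories sum to", sum(shares.values()), "percent")
    print("misuse share:", misuse_share(shares), "percent")
    print("approximate misused uses:", approx_count(misuse_share(shares)))
```

Because the inputs are rounded percentages, the derived count is only an estimate of how many of the 1,911 uses were misused.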

More than half of the papers misuse sources:

Of the 174 papers the project reviewed, 19% included at least one instance of copied material that is cited but not marked as a quotation; 91% included at least one instance of copied and cited material marked as a quotation; 52% included at least one instance of patchwriting; 78% included at least one instance of paraphrasing; and 41% included at least one summary.

56% of papers misuse sources either by failing to mark copied words as a quotation or by patchwriting. Of those, 15% do both.

Source: The Citation Project