Developers say the technology saves time and improves the assessment of students.

To a professor facing a stack of ungraded essays, this may sound like the story of the shoemaker’s elves -- a computer program that can scan an essay and, in a few seconds, reliably identify what material the student has learned and what he has not. The program is real, however, created by researchers who have spent nearly a decade developing the technology that makes it possible. They call the result the Intelligent Essay Assessor. And when their program and human graders have been given the same essays to grade, the results have been remarkably consistent, they say -- except, of course, that grading each essay takes the program almost no time at all.
The assessor joins a host of other devices and software packages -- from bubble sheets to computer-based tests -- that professors, universities, and testing services have used for years to streamline student assessment. And while every announcement of a new grading technology prompts another round of alarm among those who worry that computers will start putting professors out of work, faculty members who have used the technologies say such fears are exaggerated. At most, they say, grading programs and various other high-tech offerings allow them to take a more “modular” approach to their jobs, concentrating on what they do best and letting machines handle some of the most tedious chores.

In fact, the same technologies that underlie automated grading are also making increasingly sophisticated automated tutors possible. In regular courses and especially in distance-education programs, computer-administered assignments can provide valuable feedback to students -- without taking up a lot of the instructor’s time.

Many academics remain skeptical of grading technologies. Andrew Feenberg, a professor of philosophy at San Diego State University, says companies in the educational-technology business “have been promising us a replacement for teachers for a long time.” He points to instructional television and computer-assisted learning as technologies that have promised much -- and disappointed many. “When you go to college, you’re looking for a much more complex experience” than what technological teachers can provide, he says. “Nobody knows how to write a program smart enough to actually do what even the most mediocre teacher does when he reads and grades a paper.”

But the creators of the essay assessor, which is among the most sophisticated grading programs, say there are clear advantages to having a computer grade essays. In a large lecture class, for example, a professor and several teaching assistants might have to split up a stack of essays and spend hours grading them by hand. Not only would each grader have a different grading style, but he or she might grade each paper on a slightly different basis. The computerized essay assessor, say its creators, can grade each essay using exactly the same standards in a matter of seconds. The time saved can be spent giving personal help to students.

“We think of it as giving the professors another tool,” says Thomas Landauer, a psychology professor at the University of Colorado at Boulder, who is leading the development of the essay grader.

To use the program, a professor must first teach it to recognize a good or bad essay by feeding it examples of both varieties, which have been manually graded. The program can also be “trained” with what Mr. Landauer calls a “gold standard” -- passages from textbooks or other materials written by experts on the same subject as the essays to be graded. In many cases, a professor who assigns similar topics each year could use essays from past semesters to train the program, although the material used to do so must be in electronic form.

Earlier digital essay graders -- most notably the Project Essay Grader, a technology that has been in development for more than 30 years -- also work by learning how humans grade and comparing new essays to those examples. But while that grader analyzes the sample essays mechanically -- looking at sentence structures and counting commas, periods, and word lengths -- Mr. Landauer says his program can actually “understand” student writing.
It does so using a sophisticated form of artificial intelligence that the researchers call “latent semantic analysis” to look at how words in the reference material and sample graded essays are used in relation to one another. By examining the words in essays it is asked to grade, the computer can tell what subject the writer is discussing even if the writer hasn’t used the very same words as the source material. That process, says Mr. Landauer, allows the computer, “to a good approximation, to understand the meanings of words and passages of text.”

The researchers get into the nuts and bolts of the process on their World-Wide Web site (http://lsa.colorado.edu), which also offers interactive demonstrations of the essay grader.

The program compares the patterns of word usage in ungraded essays with the usage patterns it has learned from the initial samples. If an essay appears to convey the same knowledge as verifiably good essays, the computer gives it a high score. If a student’s work looks similar to a poor essay, it gets a low score. The program can also point out what a student has omitted.

For each essay, the professor is presented with a report that gives a score and notes any irregularities that a human grader should check on. For example, if a student attempted to beat the computer by packing an essay with keywords relevant to the subject matter, the grading software would flag the essay for the professor to read. The grader can also tell if an essay has been plagiarized from material it has already learned, even if the student has attempted to paraphrase the source. “It’s looking for content,” says Mr. Landauer. “You can change the words, but the content’s still there.”
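The article does not publish Mr. Landauer’s implementation, but the general approach it describes -- reducing word-usage patterns to a low-dimensional “semantic” space and scoring a new essay by its similarity to hand-graded examples -- can be illustrated with a rough sketch. The sketch below uses Python and scikit-learn, whose truncated SVD over TF-IDF vectors is a standard approximation of latent semantic analysis; the toy essays, scores, and grade function are hypothetical stand-ins, not the Intelligent Essay Assessor’s actual code.

```python
# Illustrative sketch only -- not the Intelligent Essay Assessor's code.
# It approximates the approach described above: latent semantic analysis
# (here, TF-IDF vectors reduced with truncated SVD) plus similarity to
# essays a human has already graded.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

# Hand-graded training essays (a "gold standard" of expert-written
# passages could be used instead, as the article notes).
train_essays = [
    "Photosynthesis converts light energy into chemical energy, "
    "producing glucose and oxygen from carbon dioxide and water.",
    "Plants use sunlight to make their own food and give off oxygen.",
    "Plants grow in soil and need to be watered every day.",
]
train_scores = np.array([5.0, 3.0, 1.0])

# Represent the essays as TF-IDF vectors, then project them into a
# low-rank "semantic" space so essays can match on meaning without
# sharing the very same words.
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(train_essays)
svd = TruncatedSVD(n_components=2)  # real systems use hundreds of dimensions
semantic_train = svd.fit_transform(tfidf)

def grade(essay, k=2):
    """Score an ungraded essay by its similarity to graded examples."""
    vec = svd.transform(vectorizer.transform([essay]))
    sims = cosine_similarity(vec, semantic_train)[0]
    nearest = np.argsort(sims)[-k:]          # the k most similar essays
    weights = np.clip(sims[nearest], 0.0, None)
    if weights.sum() == 0:                   # nothing similar: flag for a human
        return None
    return float(np.average(train_scores[nearest], weights=weights))

print(grade("Using chlorophyll, plants turn carbon dioxide and water "
            "into glucose and release oxygen."))
```

Because similarity is measured in the reduced semantic space rather than by exact word overlap, a paraphrased essay can still match the graded examples -- the same property that, as Mr. Landauer notes, lets the program see through reworded plagiarism.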
But the program cannot recognize clever turns of phrase or creative approaches to an assignment. Mr. Landauer says it is not intended to be used for English-composition or creative-writing assignments, in which a student is being graded more on writing skill than on knowledge of a subject. The essay assessor works best on essays assigned to check students’ factual knowledge in such subjects as history, political science, economics, and the sciences. The program tests knowledge “by the student’s ability to put it into words,” he says.

So far, the program has been used in middle-school and high-school courses, at the University of Colorado and New Mexico State University, and on essay questions from the Graduate Management Admission Test.

Peter Foltz, one of the other developers of the essay assessor, used the program in his psycholinguistics class at New Mexico State University. For one essay assignment, he set up a Web site through which his students submitted their essays to the assessor, which gave them scores and pointed out information they had missed. Then, rather than take the first grade the computer assigned, Mr. Foltz let the students edit their work and resubmit it as many times as they liked. He told them he would do the final grading himself for those who didn’t like the computer’s evaluation, but none of the students took him up on the offer.

Mr. Foltz, an assistant professor of psychology, sees such tutoring as the best use for the essay grader. “Students don’t get enough writing in their classes, and this is a way to incorporate more writing where professors don’t have to evaluate every essay they see,” he says. “It frees the professor up to spend more time with the students, because the professor’s not spending all the time grading essays.”

Still, San Diego State’s Mr. Feenberg is concerned about “the tail wagging the dog.” If professors come to rely on computers for testing students, he says, the limits of the technology will control what the students are taught. “The students are being taught only how to pass the test,” he says. If an essay grader cannot measure creative problem-solving or deeper analysis, students will be required to do no more than remember facts to get good grades, he says.

While the creators of the essay grader have raised eyebrows by claiming that it can evaluate writing reliably, programs that make such claims are far less common than software that simply administers and scores quizzes and tests and tells students where they’ve made mistakes. Malcolm Duncan, an instructional-computing expert at Purdue University, says programs like these can be even more valuable when used to tutor students. When students leave the classroom, he says, “they may not have gotten it in the lecture today -- but if they can go to the computer lab and go through a tutorial session on-line, they may.”

Mr. Duncan says such on-line tutorials can be as useful -- and just as marketable -- as textbooks. That’s exactly what worries some professors who have incorporated lectures, class materials, quizzes, and exams into on-line courses. Once a professor stores his or her knowledge on a Web site or a CD-ROM -- whether in lecture form or in a test format -- the material could take on a life of its own as a stand-alone on-line course. And it would not necessarily be under the control of the professor.

“A lot of professors are not going to be all that excited about buying somebody else’s course,” says Mr. Duncan, but they might buy question banks to aid in teaching specific topics. Such an approach would mean that the professors “still have control over how their course is taught, but they can have some good assistance in presenting that to their students.”

Roy Rada, a professor of information systems and computer science at Pace University, already thinks of his job as being modular, at least to some degree. For example, he notes, while he has taught some classes using textbooks he has written himself, more often he uses the work of other authors. He still has to lecture, create and grade tests, and run the classroom, but instead of creating a textbook as well, he plugs in an outside source. He could just as easily plug in someone else’s tests, and he has already plugged in other people’s grading: He asks his students to grade one another’s papers on some essay assignments.

Mr. Rada, who says he has taught on-line since long before there was a World-Wide Web, says automated tools could have their greatest impact in distance education, in which it is hard for professors to communicate frequently with individual students. “You can put material on the Web and people can read it, but that’s not very helpful,” he says. Tools like self-grading quizzes and the essay grader, he says, can provide feedback to students who would otherwise feel isolated in the absence of a traditional classroom.

But in an automated course with automated feedback, does the professor still play a role? “The pedagogy is in the people who put it in the system,” says Mr. Rada. He says professors will always be necessary, although they may find themselves concentrating their skills more on some aspects of teaching and letting computers handle others.

And, he says, there will always be classrooms. “There is an enormous need for traditional education. I don’t think automated grading will have any impact on it.”

“We’re social animals,” he says. “We go to class every Wednesday at 6 p.m., and it suits us.”