Beginning in 2011, the 331 universities that participate in the Voluntary System of Accountability will be expected to publicly report their students’ performance on one of three national tests of college-level learning.
But at least one of those three tests—the Collegiate Learning Assessment, or CLA—isn’t quite ready to be used as a tool of public accountability, a scholar suggested here on Tuesday during the annual meeting of the Association for Institutional Research.
Braden J. Hosch is director of institutional research and assessment at Central Connecticut State University, one of four institutions in that state that participate in the accountability system. He scrutinized the performance of students on the CLA at his institution over a three-year period and discovered something that made him queasy: Students’ performance on the test was strongly correlated with how long they spent taking it.
To understand why that pattern might make an assessment specialist uncomfortable, here is a brief primer on the test. In the CLA, students write essays or memoranda in response to material that they haven’t seen previously. The goal is to measure students’ skills in critical thinking, problem solving, analytical reasoning, and writing.
The CLA is one of three tests endorsed by the Voluntary System of Accountability, a three-year-old effort by public four-year universities to publish basic, comparable information about the undergraduate student experience online. Besides the CLA, which is sponsored by the Council for Aid to Education, participants in the voluntary system may use the Collegiate Assessment of Academic Proficiency, from ACT Inc., or the Measure of Academic Proficiency and Progress, from the Educational Testing Service. The paper that Mr. Hosch presented here on Tuesday concerns only the CLA, not the other two tests.
Colleges that participate in the CLA typically administer the test to approximately 100 first-year students and to approximately 100 seniors each year. If the seniors’ scores are higher than those of the first-year students, that is taken as evidence that students at that college gain fundamental skills while they are there.
The test has sometimes been criticized for relying on a cross-sectional design rather than a longitudinal model, in which the same students would be tested in their first and fourth years of college. The test's creators say that the cross-sectional model is valid and that a longitudinal model would be prohibitively cumbersome for most colleges, because many students transfer or take longer than four years to graduate. Several papers defending the CLA's framework can be found at the project's Web site.
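For readers who want the mechanics spelled out, here is a minimal sketch of that cross-sectional comparison in Python. The score data are invented for illustration, and the CLA's actual scaling and value-added adjustments are considerably more elaborate than this.

    # A minimal sketch of the cross-sectional "gain" comparison, using
    # invented data. The CLA's actual scaling and value-added adjustments
    # are more elaborate; this only illustrates the basic design.
    import random
    import statistics

    random.seed(0)

    # Hypothetical scaled scores for roughly 100 freshmen and 100 seniors.
    freshman_scores = [random.gauss(1050, 120) for _ in range(100)]
    senior_scores = [random.gauss(1150, 120) for _ in range(100)]

    freshman_mean = statistics.mean(freshman_scores)
    senior_mean = statistics.mean(senior_scores)

    # Under the cross-sectional design, a higher senior mean is read as
    # evidence of learning gains, even though different students take
    # the test in each group.
    print(f"Freshman mean: {freshman_mean:.0f}")
    print(f"Senior mean:   {senior_mean:.0f}")
    print(f"Cross-sectional 'gain': {senior_mean - freshman_mean:.0f}")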
Mixed Motives
But beyond that basic question of design, there have long been concerns about just how motivated students are to perform well on the CLA. Why sit there and carefully craft an essay, after all, if there is no particular reward or punishment for your performance?
At Mr. Hosch’s university, freshmen are often recruited to take the test in conjunction with a “first-year experience” course that all students take. “But across the sections of that course, there’s a lot of variation in how instructors approach it,” Mr. Hosch said. “Some instructors really integrate the CLA into the course, and ask students to write reflective essays after they take the test. Others just say casually, Hey, here’s a test you can take to get five points of extra credit.”
Seniors at Central Connecticut State, meanwhile, are recruited to take the test through entirely different mechanisms. Mr. Hosch and his colleagues originally tried sending e-mail messages that appealed to students’ sense of institutional loyalty: Help us improve our curriculum and instruction, they said. But that approach yielded a grand total of zero students after six weeks. So the university instead turned to low-level bribery. Seniors who volunteer to take the test now have their $40 cap-and-gown fees waived.
With those very different motivations, will students actually take the test seriously as they sit there drafting their essays? Or, as Mr. Hosch put it, “If you’re a senior distracted by the end of the year, and your cellphone rings 20 minutes into the test, do you just pack up and walk away?”
Most cohorts of students at Central Connecticut State have apparently done well on the CLA. And one cohort—seniors who took the test in the spring of 2009—did remarkably well, with a mean score at the 98th percentile of all CLA test-takers nationwide. (That percentile figure is an “adjusted” score, taking into account the average SAT scores of Central Connecticut State students.)
Why did that cohort do so well? One answer appears to be that they spent an average of 63 minutes taking the test, 40 percent longer than the 45 minutes logged by the previous year's crop of seniors.
And why was that? Did the 2009 cohort happen to be a more motivated, conscientious bunch? Were the test items more engaging? Did the test proctor say something different that year at the beginning of the test?
The Time Factor
No one knows. But the pattern was consistent across all of the cohorts that Mr. Hosch studied: The longer the students spent on the task, the higher their average scores.
And that is what worries Mr. Hosch. The CLA is a worthy effort, he said, but it should not be used for high-stakes accountability until colleges can better ensure that the students who take the test are representative of the entire student body and that they devote roughly equal effort to the task.
“I’m not suggesting that we give up on the CLA,” Mr. Hosch said. “I’m not suggesting that we give up on measuring student learning. But I do think we should acknowledge that test scores are related to time spent on the test, and I think we should research that further.”
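The kind of analysis Mr. Hosch describes is straightforward for an institutional researcher to run. Below is a minimal sketch in Python, again with invented numbers standing in for real per-student records, of how one might check whether time on the test tracks scores.

    # A minimal sketch of the check Mr. Hosch describes: does time spent
    # on the test track scores? The numbers are invented; a researcher
    # would substitute real per-student records.
    import statistics

    # Hypothetical (minutes_on_test, scaled_score) pairs for one cohort.
    records = [(32, 980), (41, 1040), (45, 1060), (52, 1110),
               (58, 1150), (63, 1190), (70, 1210), (75, 1230)]

    minutes = [m for m, _ in records]
    scores = [s for _, s in records]

    # Pearson's r (statistics.correlation requires Python 3.10+).
    r = statistics.correlation(minutes, scores)
    print(f"Correlation between time on test and score: r = {r:.2f}")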
The simplest solution, Mr. Hosch said, would be to motivate students by making the CLA a truly high-stakes test, something that really mattered for their grade-point average or their graduation. But he called that approach a serious mistake. "A high-stakes assessment is not the way we want to go," he said.
Among other things, Mr. Hosch suggested that small groups of similar colleges should create consortia for measuring student learning. For example, five liberal-arts colleges might create a common pool of faculty members that would evaluate senior theses from all five colleges. “That wouldn’t be a national measure,” Mr. Hosch said, “but it would be much more authentic.”
‘Raising the Stakes’
In an e-mail message to The Chronicle on Tuesday, Richard Shavelson, a professor of education at Stanford University and one of the CLA’s creators, conceded that students’ motivation is related to their performance on the test.
But he added that at the institutional level, those variations in motivation tend to wash out, so that it is still valid to use the test to assess a college’s general level of learning. (Jeffrey T. Steedle, a graduate student at Stanford, presented evidence to that effect last month at the American Educational Research Association’s conference.)
“Braden is correct to point out that motivation is critical and a big concern in low-stakes testing and can affect individual students’ test scores,” Mr. Shavelson said. “The challenge confronting higher education is for institutions to address the recruitment and motivation issues if they are to get useful data. From my perspective, we need to integrate assessment into teaching and learning as part of students’ programs of study, thereby raising the stakes a bit while enhancing motivation of both students and faculty. (Incidentally, we find that some faculty do not support assessment programs and convey their feelings to students as well.)”
Richard B. Arum, a professor of sociology at New York University who has studied the CLA, said in an e-mail that he was not surprised by Mr. Hosch’s findings. And he said that he shared Mr. Hosch’s concerns about using the CLA in public accountability regimes.
“I do agree with his central point that it would not be prudent to move to an accountability system based on cross-sectional assessments of freshmen and seniors at an institution,” said Mr. Arum, who is an author, with Josipa Roksa, of Academically Adrift: Limited Learning on College Campuses, forthcoming from the University of Chicago Press.
Mr. Hosch’s paper and related materials are available at his Web site.