A recent report by SRI Education is one of the larger attempts to determine whether adaptive learning — the trendy idea that tutoring software can improve student outcomes by diagnosing and responding to their individual needs — is actually helping students learn better than traditional methods do. Call it a midterm for this kind of educational tool.
What’s the grade? While adaptive learning definitely hasn’t made the honor roll, it turns out there were some limitations in the assessment strategy. Let’s call it an incomplete.
The report looks at 14 colleges for a total of 23 courses, including multisection courses. The institutions collected data for three semesters, reaching a total of more than 19,500 unique students and 280 unique instructors. All of the projects received grants from the Bill & Melinda Gates Foundation (which, in full disclosure, has also given funds to my consulting company).
The results are underwhelming, with pockets of modest to moderate gains here and there. But more important, the report demonstrates how difficult it is to research the impact of an educational intervention at scale.
Here are a few of the findings highlighted in the report’s executive summary. Note the qualifications in each finding:
A majority of courses that used adaptive learning had “no discernible impact” on grades; of the 15 courses that could be assessed, just four showed “slightly higher” averages.
SRI found no evidence that adaptive learning had had an effect on course completion in the 16 grantee-provided data sets “appropriate” for estimating that impact.
The study found “minor enhancements of course grades,” on average, but few strong outliers.
Students and instructors in two-year colleges and developmental courses reported high levels of satisfaction with their adaptive-learning experiences.
However, only a third of students in four-year colleges expressed overall satisfaction. The researchers wrote: “It is not clear from the available data whether the qualities of the courseware they were using, the way in which the courseware was implemented, or a sense of loss of instructor personal attention was most responsible for their lack of enthusiasm.”
None of that should be especially surprising. In the 60 or so years of academic research on various forms of electronic tutoring systems, the literature shows strong benefits in a handful of relatively well-defined teaching situations, more modest and uneven gains across a wider range of courses, and very little evidence that adaptive learning is a cross-curricular panacea.
Further, as the SRI researchers are quick to point out in the report, their study provided only very limited analysis of crucial differences in the course designs in which the adaptive-learning products were used. Without that context, asking how much adaptive learning affects a course is a little bit like asking how much the use of a calculator affects a course.
Well, it depends. Are you teaching a math course? An English course? Are you using the calculator as a paperweight? Does the way you use the calculator change the way you teach? Or the way that students learn?
If one were to conduct a study of how much use of a calculator, or a computer, or a smartphone, or an autoclave affects learning across all classes, or even across all math or science classes, the results would probably be murky and unimpressive.
Those results would not prove that calculators and computers are useless for teaching anything. A screwdriver is extremely useful for turning screws but not so much for driving nails. Like any other tool, adaptive learning is an enabler. It has its uses, but those uses do not include magic.
That said, the murkiness of the results is not only, or even mainly, due to the limitations of adaptive learning. This study is plagued by a fundamental and pervasive methodological problem in educational research — namely, that it is often either impossible or unethical to control variables in a way that gets empirically solid, reproducible results.
Anyone who has had at least one high-school science class will probably remember that the key to doing science is to control the variables. To measure the impact of a variable, one must make sure that it is the only relevant detail that changes in the experiment. In educational research, the biggest variables are often the students.
So in order to get an empirically valid apples-to-apples comparison of the effects of adaptive learning, which is specifically intended to help students in the same class with different learning needs, you have to start by making sure that the students you are studying are the same as one another.
Setting aside the head-spinning contradiction, how often are the students in any two course sections, even within the same course at the same institution, similar enough for a valid controlled experiment? The answer in the SRI study is “rarely.”
The researchers had to remove “roughly 70 percent of the submitted data from the final student-outcomes analysis” because students in the adaptive-learning course sections had substantially different scores on beginning-of-the-course prior-learning assessments than did their peers in the nonadaptive sections. Of the 14 grantees, only one “showed baseline equivalence for every term’s data.”
Suddenly, a study involving more than 19,500 unique students and 280 unique instructors doesn’t look so big. To its credit, the Gates Foundation gave grantees significant latitude to tailor adaptive-learning projects for the specific needs of their respective campuses. But doing so made comparing results across projects harder.
Is there a big takeaway from the SRI study? For adaptive learning specifically, it’s probably “your mileage may vary.” The researchers do a good job of teasing out some of the variables that may make incremental but real differences in educational impact. That by itself makes the study worth reading.
More generally, the study does provide a reality check on the role of the learning sciences in the craft of education. We are learning more all the time about how people learn. But we are still at the basic research level, and we face daunting methodological challenges.
Professors should be encouraged to develop a basic level of educational-research literacy so that they can factor research into their professional judgment of which approaches to use in their teaching. But at the end of the day, that research is unlikely to yield any simple educational-technology prescriptions that will produce drastic across-the-board gains.
Improving education is not like eradicating polio. Educational research is more likely to be useful as a tool for helping faculty members become more-skilled practitioners than it is as a way of finding a miracle cure.
Michael Feldstein is a partner at MindWires Consulting, co-publisher of the e-Literate blog, and co-producer of e-Literate TV.