As states create warehouses of information about students, scholars see opportunities to assess the effectiveness of education
Last month several news organizations reported on the emergence of “fusion centers” — vast data clearinghouses, operated by state law-enforcement agencies, that can instantly call up key personal information on anyone: telephone numbers, insurance records, family ties, and much more.
Architects of the fusion centers say they are a long-overdue tool for combating crime and terrorism. But critics warn that the centers are a menace to privacy and say they have been constructed at the state level to avoid the scrutiny that a single federal data system would attract.
The fusion-center debate has an echo in the world of education research. Now that Congress has rejected the idea of a national “unit-record tracking” system for student data, scholars and policy analysts are tantalized by the possibility that states will beef up their own education-data centers.
The most celebrated example is Florida, which began in 2001 to assemble a “data warehouse” that allows officials to track a person’s progress from kindergarten through graduate school and beyond, including postcollege wages and employment, military service, incarceration, and receipt of public assistance.
Many researchers say that Florida’s system, along with somewhat less ambitious education databases in Texas, Washington, and roughly a dozen other states, is a vital tool for assessing schools and colleges and helping them to improve.
“Being able to follow students longitudinally is the key to any sophisticated understanding of how colleges are doing and what’s happening to students,” says Thomas R. Bailey, director of the Community College Research Center at Columbia University’s Teachers College.
Researchers have used the new databases to study many issues, including which high-school math courses are most important for college success and how exposure to adjunct instructors affects student retention.
But the new education databases create obvious challenges for protecting student privacy — which is one reason most states have been slow to build them. Florida’s education department takes elaborate steps to “de-identify” its information before handing it to outside researchers.
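What "de-identifying" looks like in practice varies by agency, and Florida has not published its exact routine, but the general idea can be sketched in code: direct identifiers are dropped and replaced with a non-reversible research ID so that the same student can still be followed over time. The column names and the salted-hash scheme below are illustrative assumptions, not the department's actual procedure.

```python
# Illustrative de-identification sketch (assumed column names, not Florida's actual process).
import hashlib
import pandas as pd

DIRECT_IDENTIFIERS = ["name", "ssn", "date_of_birth"]   # columns assumed for this example
SECRET_SALT = "replace-with-a-securely-stored-secret"   # kept by the agency, never shared

def deidentify(records: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with direct identifiers removed and a non-reversible research ID added."""
    out = records.copy()
    # Derive a stable pseudonymous ID so the same student can be linked across years and files
    # without exposing the underlying student number.
    out["research_id"] = out["student_id"].astype(str).map(
        lambda sid: hashlib.sha256((SECRET_SALT + sid).encode()).hexdigest()[:16]
    )
    return out.drop(columns=DIRECT_IDENTIFIERS + ["student_id"])

if __name__ == "__main__":
    df = pd.DataFrame(
        {
            "student_id": ["1001", "1002"],
            "name": ["A. Student", "B. Student"],
            "ssn": ["000-00-0001", "000-00-0002"],
            "date_of_birth": ["1990-01-01", "1991-02-02"],
            "grade_level": [12, 11],
            "math_course": ["Algebra II", "Geometry"],
        }
    )
    print(deidentify(df))
```

Because the salt stays with the agency, outside researchers receive stable pseudonyms they can link across files without being able to recover the underlying student numbers.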
Despite those efforts, nervous officials in other states look at a system like Florida’s and worry about potential violations of the Family Educational Rights and Privacy Act, or Ferpa. In March the U.S. Department of Education proposed new Ferpa regulations that might clarify the ground rules for the use of such databases, but it is far from certain that the new rules will make states more comfortable with the projects.
Last month Mr. Bailey joined two dozen other scholars at a conference in Washington convened by the National Academies and the American Educational Research Association. The scholars there — many of whom have used state databases in their own research — want to encourage the development of more state data centers. They did not belittle the privacy question, however. It would only take one serious breach of anonymity, several participants said, to destroy the public trust that sustains the new databases.
Opportunity and Risk
Huge student databases like Florida’s are catnip to researchers.
“We are far from having exhausted the important research questions that can be addressed with these types of data,” Mr. Bailey said during the conference. “Even if we doubled or tripled our capacity at the Community College Research Center, we couldn’t possibly deal with all of the issues that these data could be used for.”
Jane Hannaway, director of the education-policy center at the Urban Institute, pointed to three virtues: Because the data systems include all students, there is often no need for the researcher to worry about constructing a statistically valid sample. Because the data extend across time, scholars can determine how a student’s performance changes over a number of years, which removes some of the usual anxiety about isolating the effects of a single teacher or classroom. And with such a vast number of observations at hand, researchers can do multiple comparisons at once.
In an influential 2005 study, one of Mr. Bailey’s Teachers College colleagues used a large database in Washington State to look at the wages of nontraditional students several years after they entered community college. That study found that for students who begin community college at the age of 25 or older, there is a positive “tipping point”: After a student earns 10 or more college credits, future wages tend to improve, even if the student never earns a degree. Below that threshold, community college does not seem to do these students much economic good. That pattern would have been hard to detect without a large-scale data system like Washington’s.
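The published analysis is considerably more involved, but the basic comparison it rests on, splitting older entrants by whether they crossed a credit threshold and looking at later wages, can be sketched roughly as follows. The data-frame columns and the five-year wage measure are assumptions made for illustration, not the study's actual variables.

```python
# Rough illustration of a threshold ("tipping point") comparison, not the study's actual model.
import pandas as pd

def tipping_point_summary(students: pd.DataFrame, credit_cutoff: int = 10) -> pd.DataFrame:
    """Compare later wages for older entrants above vs. below a credit threshold.

    Assumes columns: age_at_entry, credits_earned, wage_5yr_later (illustrative names).
    """
    older = students[students["age_at_entry"] >= 25].copy()
    older["crossed_threshold"] = older["credits_earned"] >= credit_cutoff
    return older.groupby("crossed_threshold")["wage_5yr_later"].agg(["count", "mean", "median"])

if __name__ == "__main__":
    sample = pd.DataFrame(
        {
            "age_at_entry": [27, 31, 26, 40, 29, 35],
            "credits_earned": [3, 12, 9, 24, 15, 6],
            "wage_5yr_later": [24000, 31000, 25000, 36000, 33000, 23500],
        }
    )
    print(tipping_point_summary(sample))
```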
Despite this potential analytic power, many states have shied away from creating robust data systems. That has partly to do with a lack of resources and expertise, Mr. Bailey says. But it also has to do with nervousness about federal and state privacy laws.
Colleges and researchers might soon be given a new set of ground rules for managing privacy in this arena. The pending revisions in the Ferpa regulations do not involve any drastic changes, but they do include several clarifications about how states, school districts, and colleges should protect student confidentiality when working with databases.
Among other things, the regulations would require educational agencies to sign written agreements when they give data to outside researchers and would require outside researchers to return or destroy the data when they are finished with it. (The deadline for public comments on the draft regulations is May 8.)
The draft regulations also make explicit that when states like Florida release de-identified data to outside researchers, that does not count as a “disclosure” under federal law, so there is no need to worry about Ferpa’s consent rules. That clarification has long been sought by researchers because states have interpreted that point differently.
But the commentary that accompanies the draft regulations warns that even when names, Social Security numbers, and dates of birth have been completely removed from data, privacy concerns can remain. For example, imagine that a researcher publicly reports that 80 percent of Latino students at a particular community college failed a particular course, when only a small number of Latino students (say, five) took the course. Their privacy and reputation might be compromised.
To deal with that danger, the draft regulations instruct each state to identify a number below which data may not be reported for a particular “cell” of students. In Florida’s case, the number is 10. Researchers are not permitted to disaggregate data so finely that they are reporting on groups of nine or fewer students.
“We came up with 10 because it was the largest number used by any of the agencies in our data system,” says Jeff Sellers, an assistant deputy commissioner in the Florida Department of Education. “We decided to err on the side of caution.”
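Enforcing a minimum-cell-size rule like Florida's is straightforward in reporting code: any cell built from fewer than the threshold number of students is blanked out before results leave the agency. The sketch below assumes a simple table of counts and rates by subgroup; it illustrates the rule, not Florida's actual implementation.

```python
# Illustrative small-cell suppression, following a minimum-cell-size rule like Florida's n >= 10.
import pandas as pd

MIN_CELL_SIZE = 10  # Florida's reporting threshold, per the article

def suppress_small_cells(cells: pd.DataFrame, count_col: str = "n_students") -> pd.DataFrame:
    """Blank out numeric values, including the count itself, for cells with fewer than MIN_CELL_SIZE students."""
    out = cells.copy()
    too_small = out[count_col] < MIN_CELL_SIZE
    numeric_cols = [c for c in out.columns if pd.api.types.is_numeric_dtype(out[c])]
    for col in numeric_cols:
        # Keep the value only where the cell is large enough; otherwise report a blank (NaN).
        out[col] = out[col].where(~too_small)
    return out

if __name__ == "__main__":
    report = pd.DataFrame(
        {
            "college": ["A", "A", "B"],
            "subgroup": ["Latino", "White", "Latino"],
            "n_students": [5, 140, 62],
            "course_failure_rate": [0.80, 0.22, 0.18],
        }
    )
    print(suppress_small_cells(report))
```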
Making Connections
Mr. Bailey hopes that the federal government will do more to prod states to take action — and especially to create better links between their school databases and their postsecondary databases. Since 2002 the Education Department has given states grants totaling around $40-million per year to improve their data systems. And since 2005, the Bill & Melinda Gates Foundation has supported a coalition known as the Data Quality Campaign, which encourages states to create unified databases of student achievement.
“Even if Ferpa did not exist, many of these challenges would still be with us,” Mr. Bailey says. “Colleges’ IT systems aren’t set up to analyze this stuff. The data generally aren’t stored in a way that’s ideal for research, because that’s not the purpose for which the system was designed. The resources and the time that it takes the staff of these places to comply with requests from researchers — those things are not necessarily Ferpa-related.”
“We have not had the linkages to K-12 that we hoped we would have,” says William E. Knight, director of institutional research at Bowling Green State University. “The Ohio Department of Education simply hasn’t wanted to do it.”
Mr. Knight says he is generally pleased with the quality and breadth of Ohio’s postsecondary-data system, which was created in 1998. Among other things, Bowling Green is able to give a nearby institution, Owens Community College, detailed reports on the fates of its transfer students. Mr. Knight can also view Ohio employment data that allow him to analyze the wages of Bowling Green graduates from particular majors. “But the K-12 linkage — we haven’t been able to crack that nut,” Mr. Knight says.
Mr. Knight and Mr. Bailey both say they hope that the Ferpa revisions and the apparent success of the Florida system will help overcome some of the resistance in Ohio and other states to creating broader data centers.
A final barrier, cited by Mr. Bailey and several other scholars at the National Academies conference: No matter how strong or weak a state’s data system might be, outside researchers need to gain the trust and respect of local officials before they can tap into the data.
“You have to realize that these are public officials, and it takes a lot of courage for them to make public some of these numbers,” Mr. Bailey says. “We always try to explain that we’re there as partners and we want to help them answer questions that are important to them.” (Mr. Sellers says that he and his colleagues in Florida insist on such an approach; if a researcher comes to them with a project that seems irrelevant to Florida public policy, it will probably be rejected.)
“Our job is to use these numbers to help colleges improve what they’re doing,” Mr. Bailey continues. “Not to judge them or to somehow expose them as incompetent. On the other hand, our job isn’t to explain away bad numbers, either. So it’s a balancing act.”
http://chronicle.com
Section: The Faculty, Volume 54, Issue 35, Page A10