As officials at Ohio State University worked on improving their program offerings, they encountered one need over and over: more people who can manipulate and make sense of data.
They heard it from the Obama administration, and from consultants like McKinsey & Company, which in 2011 projected that the United States could face a shortage of as many as 190,000 people with those skills by 2018. They heard it from business leaders, who described having to retrain new hires to make them versatile data scientists.
But when they looked at Ohio State’s offerings, they found expertise scattered across campus. There was no unified undergraduate pipeline for producing the workers that companies wanted, says Christopher M. Hans, an associate professor in the department of statistics. In response, Hans and a professor of computer science, Srinivasan Parthasarathy, joined with other colleagues to start an interdisciplinary undergraduate major in data analytics. The major, which began in 2014, now enrolls 104 students, with 165 additional “pre-majors” chipping away at the prerequisites they must take before formal admission to the program.
Ohio State is one of numerous universities jostling to plant their flags in the increasingly crowded data-science-education landscape. The growth of new data sources and data-analysis techniques, the abundance of jobs, the “big data” media hype — all propel the trend.
At the graduate level, nearly 200 analytics and data-science programs have sprung up over the past decade, according to figures compiled by Michael Rappa of the Institute for Advanced Analytics at North Carolina State University. It may be “the biggest and fastest-growing new graduate degree in the U.S. in a generation,” he wrote in an email.
Among the latest to jump on the bandwagon is Harvard University, which this fall will welcome students into a new master’s program in data science. More than 1,300 people applied for what will probably be 40 to 45 slots, says Daniel S. Weinstock, who oversees the admissions process. Each will pay about $75,000 in tuition for the three-semester program, which does not offer financial aid.
If the past is a guide, those students might anticipate earning more than $100,000 upon graduation. That’s about the average annual salary for new graduates of a related five-year-old master’s program in computational science and engineering, Weinstock says. The decision to start a new program, he says, was “partially a response to sort of wanting to have something that had ‘data science’ in the name, frankly.”
What does that name mean, exactly?
In the broadest sense, data science is about using quantitative tools — statistical, mathematical, computational — to extract knowledge from data. That involves obtaining the data, managing it, processing it, learning from it, and presenting the results. A data scientist might analyze Twitter to gauge popular sentiment about a politician, or customer demographics to target ads, or financial news to forecast good stock investments.
But the label has grown so bloated that “it’s almost in danger of not meaning nearly as much as people would like it to mean,” says Eric Kolaczyk, director of the statistics program at Boston University and co-chair of a group convened by the National Academies of Sciences, Engineering, and Medicine to study data-science education.
One way to think about the array of academic initiatives setting up shop under the data-science banner, Kolaczyk says, is to picture the Greek symbol “gamma,” which features one long leg joined with a horizontal bar to a much shorter leg.
In a typical data-science program, the long leg represents deep training in statistics and computer-science skills such as probability, statistical inference, and developing algorithms for handling data. The short leg represents coursework in a domain area that supplies data and questions.
That domain might be computational biology or business. A business-analytics program won’t give you the immersion of a master’s degree in business administration. But it will make you business-literate.
Some programs reverse the legs of that gamma, though. Now the long leg stands for deep training in a particular field, and the short leg is a basic introduction to the technical stuff. Colleges might spin out programs in digital humanities, or quantitative sociology, or data storytelling.
Any given university, depending on its faculty and resources, might offer multiple varieties of data-science degree, Kolaczyk says.
The result, for students, can be tricky to navigate. And some experts tell them not to bother. A data-science degree often is not worth the investment of time and money, writes Meta S. Brown, a consultant and commentator on analytics. One reason: You can get into the field without one. Most people who work in data science, she writes in Forbes, earned degrees in areas like math or statistics.
While many of those people are employed in private industry, others are attracted by the chance to help solve public-sector problems. Shruti Pandey spent a dozen years in software development before going back to college to pursue a master’s in data science at Columbia University. Pandey now works for the New York City Fire Department. She uses data — on the characteristics of buildings and the behavior of people living in them — to build computer models that predict which structures are most at risk of fire.
Such data applications can raise ethical and privacy questions — a challenge for scholars developing new curricula.
Data-science programs have typically emphasized technical rather than ethical training. They tend to involve a week of focused ethics education at most, says Boston University’s Kolaczyk, with any further discussion woven through the rest of the classes.
One writer pushing the field to take ethics more seriously is Cathy O’Neil, whose 2016 book, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy (Crown), chronicles many examples of problematic uses of data in areas like awarding loans, evaluating teachers, and policing neighborhoods.
[Chart: Master’s Programs in Data Science Have Multiplied. Degrees in data science in several disciplines.]
Take a company that seeks out particular traits in job applicants based on data about the qualities of its successful employees. If the firm has been hiring white men, their characteristics will be the ones that stand out, says Kathy McKeown, founding director of Columbia’s Data Science Institute and co-chair of the National Academies data-science-education group.
“We would never identify characteristics of women or minorities who would be successful if you hired them,” McKeown says, “because they’re not in the data pool.”
Marc Parry is a senior reporter who writes about ideas, focusing on research in the humanities and social sciences. Email him at marc.parry@chronicle.com, or follow him on Twitter @marcparry.