How to Encourage Data-Driven Discovery

By Ed Lazowska
March 4, 2018

More than a decade ago, the pioneering computer scientist Jim Gray referred to data science as the “fourth paradigm” of discovery, along with experimentation, theory, and numerical simulation.

Today it doesn’t take a visionary to see that rapid advances in our ability to acquire, generate, and analyze data are influencing scholarly work in nearly every academic field. Higher education, though, has been slow to adapt to this trend. Only a few years ago, a provost who wanted to understand what sort of computing would be required by researchers in the future would probably have consulted faculty members who labeled themselves “computational scientists.” Their answer might have been: “Additional subsidized cycles to run my simulations.”

Numerical simulation continues to be of great importance. Today’s data science, however, relies far more on intellectual infrastructure than on physical infrastructure: new methods, tools, partnerships, and types of researchers, plus the institutional change required to create new career paths and reward structures. But most researchers — even the very best — are not well versed in the tools and approaches of what we call data-intensive discovery. Until recently, for example, one could be a world-class oceanographer without possessing knowledge of data science. But no more. Oceanography, like so many other disciplines, is becoming an information field, through rapid advances in chemical, physical, biological, and video sensors that stream data with unprecedented volume, velocity, and variety; remotely operated vehicles; and observatories that extend the internet to the seafloor. The sophisticated analysis of data and innovation in data-analysis methods have become integral to the field.

The tools and scholarly approaches of data science, meanwhile, are still in their infancy and evolving rapidly, but academic-reward systems do not provide adequate incentives to make them more accessible, reliable, and effective. Furthermore, essential partnerships between the fields that specialize in data-science methodology (computer science, statistics, applied mathematics) and those that need to employ such methodology to drive research (life sciences, environmental sciences, physical sciences, and social sciences, among others) are not well developed. Such partnerships are difficult to fund and sustain because they must bridge disparate academic fields and require extended effort — notorious challenges for federal research agencies, peer reviewers, and university appointment-and-promotion committees. There are significant cultural mismatches between the entrenched structures of universities and the needs of data-intensive research.

In 2013 the Gordon and Betty Moore Foundation and the Alfred P. Sloan Foundation selected New York University, the University of California at Berkeley, and my institution, the University of Washington, to join in a collaborative experiment intended to transform the process of data-intensive research and the institutional environments in which it takes place. As part of the Data Science Environments Partnership, we formed working groups across our institutions to identify institutional impediments and take actions to overcome them. Among our recommendations so far:

  • Colleges need new and better ways to support the careers of data scientists. We must create and sustain long-term career trajectories for a new generation of scientists whose research depends crucially on the analysis of complex data, and whose work may require substantial curation or development. We must reward scientists who focus on building next-generation tools that others will use to advance science.
  • Colleges must rapidly and dramatically improve education and training in data science. Training is required at all levels: for undergraduate and graduate students, postdocs, and members of the research staff and faculty. It must be tailored to meet the needs and mathematical or computing backgrounds of various disciplines.
  • Colleges must help develop an ecosystem of new tools and software environments. Today’s tools and software environments distract from the science that should be the focus. The research community itself is best positioned to tackle this challenge. How can colleges recognize and encourage the development, sharing, and integration of software tools and environments that support data-intensive research?
  • Colleges must encourage reproducible and open data science. As data-intensive research grows in importance, we have the opportunity to create software tools and practices that support the sharing and reuse of data, software, and scientific procedures, allowing us to spend more time standing on one another’s shoulders and less time standing on one another’s toes.
  • Colleges must create physical and intellectual spaces to help expand the data-science community. Physical spaces for collaboration are essential for facilitating work that crosses disciplinary boundaries. We must recreate the “water cooler” where researchers from different fields interact and discover common problems and common solutions.

Four years into the Data Science Environments project, we certainly don’t have all the answers, but we have some of them, and our effort can offer guidance to educational leaders seeking to accelerate data-intensive research on their campuses. An extensive discussion of our successes and challenges may be found at msdse.org/creating_institutional_change.html.

The creation of our Data Science Environments — campuswide settings in which data-intensive discovery can flourish — was supported by multiple funding sources, particularly the Moore and Sloan foundations. The continuing cost of sustaining those activities is a few million dollars per year — a tiny amount when compared with the budgets of most major research universities.

Our data environments wield enormous leverage, providing an intellectual infrastructure that can revolutionize the process of discovery in a broad range of fields. We are optimistic that our institutions will soon consider these environments as essential to our missions as computing and libraries.

Ed Lazowska holds the Bill & Melinda Gates Chair in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, and in 2007 became founding director of the University of Washington eScience Institute.

A version of this article appeared in the March 9, 2018, issue.
This article is part of The 2018 Trends Report package.
© 2023 The Chronicle of Higher Education