The National Institutes of Health on Thursday awarded almost $32-million in grants to more than two dozen institutions to devise innovative ways of helping researchers handle huge sets of data seen as increasingly central to future medical discoveries.
The grants are the first outlay in a project, announced last year and known as Big Data to Knowledge, that’s expected to involve more than $600-million in spending by 2020. Its goals include developing and distributing methods, software, and tools for sharing, analyzing, managing, and integrating data into medical research.
Among the medical challenges that the grant recipients hope to help solve are finding disease associations in the three billion base pairs of the human genome and in the estimated 86 billion neurons of the human brain, NIH officials said.
“Data creation has become exponentially more rapid than anything we anticipated even a decade ago,” the NIH’s director, Francis S. Collins, said during a briefing on Thursday, “and the challenge is to try to be sure we’re not exceeding the ability of researchers to capitalize on the data.”
“We see more and more the NIH as a digital enterprise,” said Philip E. Bourne, who this year became the agency’s first permanent associate director for data science.
The NIH divided the awards announced on Thursday into four broad categories: twelve centers that will focus on solving computing challenges, nine that will create indexing systems for large volumes of biomedical data, nine that will tackle training and career-development strategies, and another nine that will develop course materials related to big data, including open online formats.
The grant recipients are a mix of leading public and private research institutions. Those with multiple awards are the University of California at San Diego, with three grants, and Harvard University, the Johns Hopkins University, the Mayo Clinic in Rochester, the Oregon Health and Science University, Stanford University, the University of California at Los Angeles, the University of Pennsylvania, and the University of Southern California, with two apiece.
While it works to expand overall capabilities for handling big data in medicine, the NIH, along with other parts of the federal government, is still addressing related problems, such as refining policies for protecting patient privacy, ensuring that researchers cooperate fully in sharing data, and setting common standards for defining and formatting data.
On the issue of sharing, Dr. Collins said the NIH had found success by increasingly requiring scientists to make a detailed commitment to data-sharing before they receive their grant money. Technological elements of the Big Data to Knowledge project can also help, Mr. Bourne said. One example is incorporating author-citation elements into data-processing systems so that researchers are duly credited for the information they produce.