The practice of biomedical research has changed significantly over the last decade, as advances in technology make it possible for researchers to collect and analyze more data than ever before. Funder and journal policies that require researchers to share their final research data have also helped to create a research ecosystem rich in data to enable discovery. However, many researchers are not prepared to handle the challenges that come with managing, curating, sharing, and reusing research data.
Librarians’ skills and experience with information management, metadata, and information-seeking behavior make them ideal partners for researchers who need help working with data. In fact, librarians may even be considered part of the research team, as collaborators who bring unique expertise and skill sets. Librarians can offer support throughout the research data life cycle in many ways, a few of which are discussed here.
Data Management Planning
In 2013, the United States Office of Science and Technology Policy asked federal agencies to draft policies to facilitate public access to federally funded research results, including both the published articles reporting those results and the final research data. The National Institutes of Health (NIH) has indicated that its policy, which will likely come into effect in 2016, will require grant proposals to include a data management and sharing plan (National Institutes of Health, 2015). Because NIH-funded researchers have not been required to submit such plans before, they will likely need guidance, including suggestions about institutional resources, metadata standards, and data preservation best practices. Librarians are well suited to provide such assistance; those who are not familiar with data management plans (DMPs) may find the DMPTool, an interactive resource for creating DMPs, helpful (The Regents of the University of California, 2016).
Researchers can often benefit from help in organizing their data during the collection and analysis process. Though some teams have sophisticated tools for data management, many researchers still use simple spreadsheet software like Excel or gather their data in paper lab notebooks. They frequently rely on ad hoc organizational strategies that make sense to them but may not conform to best practices for information management. Moreover, they often do not document these strategies, making it difficult for others, or even for themselves when they return months or years later, to understand how the data were gathered and organized. Librarians can provide guidance on how best to organize information, and they also bring a valuable outside perspective to the research team.
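One lightweight way to document how a spreadsheet is organized is a data dictionary: a small table that records what each column means. The following is a minimal sketch in Python, not a prescribed method. The column names, the example study, and the helper functions are all hypothetical and chosen only for illustration.

```python
import csv
import io

# Hypothetical column descriptions for an illustrative spreadsheet;
# none of these names come from a real study.
DATA_DICTIONARY = [
    {"column": "participant_id", "type": "string", "units": "",
     "description": "De-identified study ID"},
    {"column": "visit_date", "type": "date (YYYY-MM-DD)", "units": "",
     "description": "Date of clinic visit"},
    {"column": "systolic_bp", "type": "integer", "units": "mmHg",
     "description": "Seated systolic blood pressure"},
]


def undocumented_columns(header, dictionary):
    """Return the spreadsheet columns that have no data dictionary entry."""
    documented = {entry["column"] for entry in dictionary}
    return [name for name in header if name not in documented]


def dictionary_as_csv(dictionary):
    """Serialize the data dictionary to CSV text to store alongside the data."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["column", "type", "units", "description"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(dictionary)
    return buffer.getvalue()


if __name__ == "__main__":
    # An assumed spreadsheet header; "notes" is deliberately undocumented.
    header = ["participant_id", "visit_date", "systolic_bp", "notes"]
    print("Undocumented columns:", undocumented_columns(header, DATA_DICTIONARY))
    print(dictionary_as_csv(DATA_DICTIONARY))
```

Saving the resulting CSV text next to the data file gives later readers, including the original researchers, a record of what each column was meant to hold, and the check for undocumented columns flags fields that were added without explanation.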
Facilitating Reuse of Shared Data
Research can move more quickly when researchers are able to analyze data that have already been collected and made publicly available. Though an increasingly large body of such shared data is available, many researchers are not able to take advantage of existing data because they do not know how to find them. Just as librarians help users quickly locate literature that they would have had difficulty identifying on their own, librarians can also aid researchers in locating suitable shared datasets. Planning is underway for an NIH-funded “data discovery index” (bioCADDIE, 2016), but there is not yet a single federated search resource that would allow users to easily search across multiple data repositories in the same way that bibliographic databases make it easy to search across multiple journals. As a result, locating shared data requires familiarity with the relevant data repositories and with how to identify them. Resources like the Registry of Research Data Repositories (Registry of Research Data Repositories, 2016) can help librarians and researchers identify potentially useful data repositories.
As the biomedical research landscape continues to shift, librarians have many opportunities to work closely with research teams to enable effective and efficient use of data and improve the research process. In many cases, librarians already have the skills and knowledge they need to be successful in such work. For those who wish to retool and gain new competencies, a wealth of learning resources exists, both online and in person, on topics related to data management and data science. Now is an exciting time for librarians to get involved with research teams and potentially have a significant impact on the biomedical research process.
bioCADDIE (2016). Biomedical and healthCAre Data Discovery Index Ecosystem. http://biocaddie.org
The Regents of the University of California (2016). DMPTool. http://www.dmptool.org
National Institutes of Health (2015). National Institutes of Health Plan for Increasing Access to Scientific Publications and Digital Scientific Data from NIH Funded Scientific Research. http://grants.nih.gov/grants/NIH-Public-Access-Plan.pdf
Registry of Research Data Repositories (2016). re3data.org. http://www.re3data.org/