Provenance-enabled Reproducibility: Developments in DataONE
Speakers
Bryce Mecum
NCEAS
Matthew Jones
NCEAS
Matt directs the Informatics program at NCEAS, which focuses both on supporting efficient synthesis through scientific computing and on building new advanced infrastructure to support data sharing, preservation, analysis, and modeling. Matt is the Director of the DataONE program, a global network of interoperable data repositories, and of the NSF Arctic Data Center. In addition to data infrastructure work at NCEAS, Matt also helps to build the NCEAS Learning Hub through an emphasis on teaching data science and reproducible research.
Matt’s career has focused on improving data science infrastructure to support cross-disciplinary and synthetic science, principally through the development of open source software for data repositories, metadata systems, and reproducible analysis and modeling.
Matt has an M.S. in Zoology from the University of Florida that focused on the ecology of plant-animal interactions, and a B.A. from Dartmouth College.
Chris Jones
NCEAS
Reproducible research is enabled, in part, by provenance metadata that describes the lineage and processing history of data and knowledge artifacts. Provenance plays an important role in many scientific applications and use cases, yet this information is often not tracked as thoroughly and systematically as science metadata. DataONE has been working on tools to display provenance information and to support recording of provenance metadata through programming languages such as R and MATLAB and through an intuitive, user-friendly, web-based UI.
During this webinar we will describe the history of this work to date, showcase the tools developed, and demonstrate the new web-based provenance editor. We will also highlight the collaborative efforts in building a community around provenance, and introduce future integration with WholeTale and other community initiatives.