Reproducible big data science: A case study in continuous FAIRness

Publication date: 2019-05-23

Published in: PLOS ONE

summary: In recent years, researchers have gained access to ever-larger volumes of data. More and more hypotheses are now developed and tested against existing data rather than by generating new data. However, the success of these data-driven discovery methods depends on the ability to easily access and analyse data of considerable size.

In “Reproducible big data science: A case study in continuous FAIRness”, the authors explore the analysis of biomedical data and present an end-to-end transcription factor binding site (TFBS) analysis workflow. The workflow is designed to facilitate the implementation of complex “big data” computations in ways that make the associated data and code findable, accessible, interoperable, and reusable (FAIR). The approach was evaluated via a user study, which showed that 91% of participants were able to replicate a complex analysis involving large data volumes.
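
A core ingredient of making data reusable in this sense is fixity: each data bundle carries checksums so that anyone who retrieves it can verify they hold exactly the published bytes. The sketch below illustrates that verification step in plain Python using only the standard library. It is a minimal illustration, not the authors' actual tooling (the paper builds on richer mechanisms such as BDBags and Minid identifiers); the manifest format and the "data" directory name are hypothetical.

    import hashlib
    import json
    from pathlib import Path

    def sha256sum(path: Path) -> str:
        """Stream a file through SHA-256 so large data files never load fully into memory."""
        digest = hashlib.sha256()
        with path.open("rb") as fh:
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                digest.update(chunk)
        return digest.hexdigest()

    def write_manifest(data_dir: str, manifest_path: str = "manifest.json") -> None:
        """Record one checksum per data file; a reader can re-run this to verify the bundle."""
        entries = {
            str(p.relative_to(data_dir)): sha256sum(p)
            for p in sorted(Path(data_dir).rglob("*"))
            if p.is_file()
        }
        Path(manifest_path).write_text(json.dumps(entries, indent=2))

    if __name__ == "__main__":
        write_manifest("data")  # "data" is a placeholder directory name

Re-running write_manifest on a downloaded copy and comparing the two manifests is enough to detect any divergence from the published data, which is the basic guarantee a reproducible "big data" workflow needs before any analysis is re-run.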

authors: Ravi Madduri, Kyle Chard, Mike D’Arcy, Segun C. Jung, Alexis Rodriguez, Dinanath Sulakhe, Eric Deutsch, Cory Funk, Ben Heavner, Matthew Richards, Paul Shannon, Gustavo Glusman, Nathan Price, Carl Kesselman, Ian Foster

link to paper: https://doi.org/10.1371/journal.pone.0213013

