
Reproducible big data science: A case study in continuous FAIRness
We present the work that went into creating an atlas of putative transcription factor binding sites from ENCODE DNase I hypersensitive sites sequencing data, in compliance with the ten simple rules for reproducible computational research defined by Sandve et al. We describe the approach we took and the tools we built to organize and analyze big biomedical data in compliance with the FAIR principles. The work was conducted by a multidisciplinary team of scientists in systems biology, genomics, and computer science. We believe PLOS ONE is an appropriate journal for this paper because the study is interdisciplinary and relevant to a broad audience with diverse expertise.
Project Assessments (13)
Target | Rubric | Globally unique identifier | Persistent identifier | Machine-readable metadata | Standardized metadata | Resource identifier in metadata | Resource discovery through web search | Open, Free, Standardized Access protocol | Protocol to access restricted content | Persistence of resource and metadata | Resource uses formal language | FAIR vocabulary | Linked | Digital resource license | Metadata license | Provenance scheme | Certificate of compliance to community standard | Date
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
BDBag of DNase-Seq data from the ENCODE project for 27 tissues(D1) | FAIR metrics by fairmetrics.org | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | no (0.00) | Jan 4, 2019
Aligned reads of DNase Sequence data from ENCODE project | FAIR metrics by fairmetrics.org | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | no (0.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | Jan 4, 2019
Non-redundant motifs for TFBS inference | FAIR metrics by fairmetrics.org | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | no (0.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | no (0.00) | Jan 7, 2019
Database of hits | FAIR metrics by fairmetrics.org | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | no (0.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | no (0.00) | Jan 11, 2019
Transcription Factor Binding Sites for DNAse data from 27 tissues in ENCODE | FAIR metrics by fairmetrics.org | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | no (0.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | Jan 11, 2019
BED files with footprints of ENCODE DNase Sequence data | FAIR metrics by fairmetrics.org | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | no (0.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | no (0.00) | Jan 11, 2019
ENCODE2BDBag Service | FAIR metrics by fairmetrics.org | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | no (0.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | no (0.00) | Jan 11, 2019
ENCODE2BdBag Tool | FAIR metrics by fairmetrics.org | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | no (0.00) | Jan 11, 2019
Galaxy workflow for generating Footprints | FAIR metrics by fairmetrics.org | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | no (0.00) | Jan 11, 2019
Docker container description for analysis tools used in creating the atlas of putative transcription factor binding sites from terabytes of ENCODE DNase I hypersensitive sites sequencing data | FAIR metrics by fairmetrics.org | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | no (0.00) | Jan 11, 2019
Docker Container for analysis tools used in creating the atlas of putative transcription factor binding sites from terabytes of ENCODE DNase I hypersensitive sites sequencing data | FAIR metrics by fairmetrics.org | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | no (0.00) | Jan 11, 2019
R Script that is used to generate hits from Motifs database | FAIR metrics by fairmetrics.org | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | no (0.00) | Jan 11, 2019
R Script for generating Transcription Factor Binding Sites | FAIR metrics by fairmetrics.org | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | yes (1.00) | no (0.00) | Jan 11, 2019
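
Each row of the assessment table records 16 metric answers, each with a numeric value in parentheses. The sketch below shows one way those answers could be condensed into a single number per resource. Averaging the values is an assumption made here for illustration and is not necessarily how the assessment tool summarizes a rubric; the names `ROWS`, `metric_values`, and `mean_score` are hypothetical.

```python
import re

# Answers copied from two rows of the assessment table
# (16 metric cells each, without the date column).
ROWS = {
    "BDBag of DNase-Seq data from the ENCODE project for 27 tissues(D1)":
        "yes (1.00) | " * 15 + "no (0.00)",
    "Non-redundant motifs for TFBS inference":
        "yes (1.00) | " * 5 + "no (0.00) | "
        + "yes (1.00) | " * 9 + "no (0.00)",
}


def metric_values(row: str) -> list[float]:
    """Extract the numeric value in parentheses from every cell of a row."""
    return [float(v) for v in re.findall(r"\((\d+\.\d+)\)", row)]


def mean_score(row: str) -> float:
    """Unweighted mean of the metric values (an illustrative assumption)."""
    values = metric_values(row)
    return sum(values) / len(values)


for target, row in ROWS.items():
    print(f"{mean_score(row):.4f}  {target}")
```

Under this assumption, a resource that fails one of the 16 metrics scores 15/16 and a resource that fails two scores 14/16.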