
Reproducible big data science: A case study in continuous FAIRness
project: We present the work that went into creating an atlas of putative transcription factor binding sites from ENCODE DNase I hypersensitive sites sequencing data, in compliance with the ten simple rules for reproducible computational research defined by Sandve et al. We describe the approach we took and the tools we built to organize and analyze big biomedical data in compliance with the FAIR principles. The work was conducted by a multidisciplinary team of scientists in systems biology, genomics, and computer science. We strongly believe that PLOS ONE is an appropriate journal for our paper, as the study is interdisciplinary and of interest to a broad audience with diverse expertise.
Associated Digital Objects (13)
BDBag of DNase-seq data from the ENCODE project for 27 tissues (D1)
data: A BDBag of tissue-specific DNase-seq data from ENCODE, for hundreds of biosample replicates and 27 t...
Aligned reads of DNase Sequence data from ENCODE project
data: Aligned reads of DNase-seq data of 27 tissues from the ENCODE project with two alignment seeds (16 a...
Non-redundant motifs for TFBS inference
data: Database file containing the hits produced
Database of hits
data: Database generated from the hits produced by non-redundant motifs (http://minid.bd2k.org/minid/landi...
Transcription Factor Binding Sites for DNase data from 27 tissues in ENCODE
data: BDBag of 54 BDBags containing candidate TFBSs, one BDBag per {tissue, seed}. Each BDBag contains tw...
BED files with footprints of ENCODE DNase Sequence data
data: BDBag of 54 BDBags containing footprints, computed one per {tissue, seed}. Each BDBag contains two B...
ENCODE2BDBag Service
tool: A service to create a BDBag for a given ENCODE query or metadata file. The resulting BDBag includes ...
ENCODE2BdBag Tool
tool: Utility for converting ENCODE search URLs or metadata files into BDBags
Galaxy workflow for generating Footprints
tool: A Galaxy workflow for generating transcription factor binding sites from DNase data from ENCODE
Docker container description for analysis tools used in creating the atlas of putative transcription factor binding sites from terabytes of ENCODE DNase I hypersensitive sites sequencing data
tool: Dockerfile that enables recreation of the Docker container with the footprinting tools (HINT, Wellington)
Docker Container for analysis tools used in creating the atlas of putative transcription factor binding sites from terabytes of ENCODE DNase I hypersensitive sites sequencing data
tool: Docker container for analysis tools used in creating the atlas of putative transcription factor bind...