[HTML][HTML] An R package that automatically collects and archives details for reproducible computing

Z Liu, S Pounds - BMC bioinformatics, 2014 - Springer
BMC bioinformatics, 2014Springer
Background It is scientifically and ethically imperative that the results of statistical analysis of
biomedical research data be computationally reproducible in the sense that the reported
results can be easily recapitulated from the study data. Some statistical analyses are
computationally a function of many data files, program files, and other details that are
updated or corrected over time. In many applications, it is infeasible to manually maintain an
accurate and complete record of all these details about a particular analysis. Results …
Background
It is scientifically and ethically imperative that the results of statistical analysis of biomedical research data be computationally reproducible in the sense that the reported results can be easily recapitulated from the study data. Some statistical analyses are computationally a function of many data files, program files, and other details that are updated or corrected over time. In many applications, it is infeasible to manually maintain an accurate and complete record of all these details about a particular analysis.
Results
Therefore, we developed the rctrack package that automatically collects and archives read only copies of program files, data files, and other details needed to computationally reproduce an analysis.
Conclusions
The rctrack package uses the trace function to temporarily embed detail collection procedures into functions that read files, write files, or generate random numbers so that no special modifications of the primary R program are necessary. At the conclusion of the analysis, rctrack uses these details to automatically generate a read only archive of data files, program files, result files, and other details needed to recapitulate the analysis results. Information about this archive may be included as an appendix of a report generated by Sweave or knitR. Here, we describe the usage, implementation, and other features of the rctrack package. The rctrack package is freely available from http://www.stjuderesearch.org/site/depts/biostats/rctrack under the GPL license.
Springer