Do yourself a favor: learn Markdown – Episode 4. Reproducible reports

The message

The essential message is this: stop using copy/paste to move content from one program to another (e.g. a chart from Excel to Word). It is a manual process that is not reproducible, and the report can easily get out of sync when new data need to be incorporated and the report rebuilt. The solution is to embed everything needed to create the report in a single document:

  • the narrative: the text describing the story of the data in the report
  • the code used to compute values or summarize data
  • the code used to create charts and figures

Reproducible reports

Reproducible reports are a recent trend within “reproducible research”, which allows others to verify your findings. Reproducibility is a major principle of the scientific method, but many scientific studies are difficult or impossible to replicate or reproduce. One could argue that there is a replication crisis, as Ioannidis suggested as early as 2005 (“Why Most Published Research Findings Are False”).

Reporting on data generation and data analysis can be tedious without the right tools. Markdown can be used to create reports and detailed analyses when combined with the power of the R or Python programming languages. In both cases the principle is similar: Markdown is used to format the narrative (the explanations) while the computations are performed by the embedded code of the chosen language.
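The principle can be illustrated with a minimal Python sketch, using only the standard library. It is not how Jupyter or R Markdown work internally; the function name `build_report` and the sample measurements are hypothetical, chosen for illustration. The point is that the numbers in the narrative are computed from the data instead of being pasted in, so rerunning the script with new data regenerates a consistent report.

```python
from statistics import mean


def build_report(measurements):
    """Return a Markdown report whose numbers are computed from the data."""
    lines = [
        "# Measurement report",
        "",
        # The narrative sentence embeds computed values, never typed-in ones.
        f"We collected {len(measurements)} measurements "
        f"with a mean of {mean(measurements):.2f}.",
        "",
        "| Sample | Value |",
        "|---|---|",
    ]
    # One table row per measurement, numbered from 1.
    for index, value in enumerate(measurements, start=1):
        lines.append(f"| {index} | {value:.1f} |")
    return "\n".join(lines)


# Hypothetical data; replace the list and rerun to rebuild the report.
report = build_report([4.0, 5.5, 6.5])
print(report)
```

Changing the list passed to `build_report` updates the sentence and the table together, which is exactly what copy/paste cannot guarantee.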

The best combination of Python and Markdown has been the Jupyter Notebook, now even better as JupyterLab: “an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.”
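To make the idea of a document that mixes narrative and live code more concrete, here is a hedged sketch of what a notebook file looks like on disk. A notebook is stored as JSON; the sketch below assumes the version 4 notebook format with its usual top-level keys and cell fields. The helper name `make_notebook` and the cell contents are hypothetical, and in practice Jupyter itself creates and edits these files for you.

```python
import json


def make_notebook(narrative, code):
    """Build a minimal notebook dictionary with one Markdown and one code cell."""
    return {
        "nbformat": 4,        # assumed: notebook format version 4
        "nbformat_minor": 4,
        "metadata": {},
        "cells": [
            # The narrative lives in a Markdown cell.
            {"cell_type": "markdown", "metadata": {}, "source": narrative},
            # The computation lives in a code cell; outputs are filled in on run.
            {
                "cell_type": "code",
                "metadata": {},
                "execution_count": None,
                "outputs": [],
                "source": code,
            },
        ],
    }


notebook = make_notebook(
    "## Mean temperature\nThe cell below computes the mean of three readings.",
    "readings = [18.5, 21.0, 19.5]\nprint(sum(readings) / len(readings))",
)

# Serialize to JSON text; saved under a name ending in .ipynb, this is the
# kind of file that Jupyter opens.
serialized = json.dumps(notebook, indent=1)
print(serialized)
```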

The best combination of R and Markdown is R Markdown, used within the (free) R/RStudio platform (which supports many other languages beyond R).

For those more interested in Python, see chapter 5 of this workshop: Jupyter Notebook for Open Science.
For those interested in R, see chapter 7 of this workshop: R for Reproducible Scientific Analysis (RMarkdown / knitr), or watch the video described below.

Easy, reproducible reports with R

A very good demonstration by Garrett Grolemund is available on the O’Reilly website:

Easy, reproducible reports with R
Garrett Grolemund demonstrates how to use R Markdown to combine code and text into a single .Rmd file to generate polished reports automatically in a variety of formats.
Figure: The R Markdown process can export reports in various formats.

The process is to combine everything within an R Markdown file containing the narrative and all the necessary code, in R or one of the many other supported languages. An R Markdown file can be exported to various formats, including slides, from within R/RStudio, as shown in the video.

The combination happens internally: the narrative and the code are knitted together into a final document by the aptly named knitr package, which knits everything within the R environment and makes everything neat!
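The knitting idea can be sketched in a few lines of Python. This is a toy analogue, not how knitr is implemented: the function name `knit`, the `FENCE` constant and the sample document are all hypothetical. The sketch walks through a Markdown text, runs every fenced Python chunk it finds, and weaves whatever the chunk prints back into the document right after the chunk.

```python
import contextlib
import io

# Built with multiplication so this example contains no literal fence line.
FENCE = "`" * 3


def knit(source):
    """Run each fenced python chunk in `source` and weave its printed output in."""
    namespace = {}   # shared by all chunks, so later chunks see earlier variables
    woven = []       # lines of the final document
    chunk = None     # None while reading narrative, a list while inside a chunk
    for line in source.splitlines():
        if chunk is None:
            woven.append(line)
            if line.strip() == FENCE + "python":
                chunk = []                      # a code chunk starts here
        elif line.strip() == FENCE:
            woven.append(line)                  # keep the closing fence
            buffer = io.StringIO()
            with contextlib.redirect_stdout(buffer):
                exec("\n".join(chunk), namespace)   # only for trusted documents
            output = buffer.getvalue().rstrip()
            if output:
                woven += ["", "Output:", "", FENCE, output, FENCE]
            chunk = None                        # back to narrative
        else:
            woven.append(line)                  # keep the code visible
            chunk.append(line)
    return "\n".join(woven)


# Hypothetical source document: narrative plus one embedded code chunk.
document = "\n".join([
    "# Sales summary",
    "",
    "The total is computed from the raw figures:",
    "",
    FENCE + "python",
    "sales = [120, 80, 100]",
    "print('Total sales:', sum(sales))",
    FENCE,
])

knitted = knit(document)
print(knitted)
```

The narrative passes through unchanged while the result of each computation is generated at knitting time, so editing the figures in the chunk and knitting again is enough to bring the whole report up to date.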

This process is described in more detail in this short blog post: Project Reporting with RMarkdown

Resources