Intermediate Reproducible Research in R

An intermediate workshop on modern approaches and workflows to processing data

Author

Luke W. Johnston

Published

May 20, 2025

Doi

10.5281/zenodo.4061900

Welcome!

Reproducibility and open scientific practices are increasingly demanded of, and needed by, scientists and researchers in our modern research environments. As our tools for generating data become more sophisticated and powerful, we also need to use more sophisticated and powerful tools for processing the data. Training on how to use these tools and how to build modern data processing skills is lacking for researchers, even though this work is extremely time-consuming and technical. As a consequence of this lack of awareness for the need of these skills, how exactly data is processed is poorly, if at all, described in scientific studies. This hidden aspect of research could have major impacts on the reproducibility of studies. This workshop is therefore aimed to specifically to start addressing these types of problems.

The workshop is designed as a series of participatory live-coding lessons, where the teacher and learners code together, and is interspersed with hands-on exercises, discussion time, reading tasks, and group project work, all while using a real-world (open) dataset. This website contains all of the material for the workshop, including for both the learner and the teacher. It is structured as a book, with “chapters” as sessions, given in order of appearance. We make heavy use of the website throughout the workshop where code-along sessions follow the material on the website nearly exactly (with slight modifications for time or more detailed explanations).

The workshop material was created using Quarto to write the material and create the book format, GitHub to host the Git repository of the material, and GitHub Actions with Netlify to build and host the website. The original source material for this workshop is found on the r-cubed-intermediate GitHub repository.

Want to contribute to this workshop? Check out the README file as well as the CONTRIBUTING file on the GitHub repository for more details. The main way to contribute is by using GitHub and creating a new issue to make comments and give feedback for the material.

⭐ Do you find this workshop material useful? Then please consider “starring” our GitHub repository to show your support! Starring will save it to a list of repositories you like that you can easily find again, plus it helps give our project more visibility!

Target audiences

This website and its content are targeted to three groups:

For the learners to use during the workshop, both to follow along in case they get lost and also to use as a reference after the workshop ends. See Is this workshop for you? for a more detailed description of who the learner is.
For the teachers to use as a guide for when they do the code-along sessions and presentations.
For those who are interested in teaching, who may not have much experience or may not know where to start, to use this website as a guide to running and teaching their own workshops.

Re-use and licensing

The workshop is licensed under the Creative Commons Attribution 4.0 International License so the material can be used, re-used, and modified, as long as there is attribution to this source.

Acknowledgements

The workshop material draws inspiration from these excellent resources:

R for Data Science
Advanced R
R Packages
UofTCoders Reproducible Quantitative Methods for EEB
Software and Data Carpentry workshop material

The Danish Diabetes and Endocrinology Academy hosted, organized, and sponsored this workshop many times. A huge thanks to them for their involvement, support, and sponsorship! Steno Diabetes Center Aarhus and Aarhus University employs Luke, who is the lead teacher and educational resource developer.

Steno Diabetes Center Aarhus logo