Appendix B – Solutions
This document contains solutions, or in many cases possible solutions, to the exercises throughout the workshop. They are hidden by default so you don't accidentally see the solutions to other exercises.
Pre-process data as you import it
🧑‍💻 Exercise: Import participant 101's sleep data
Bundling code into functions
🧑‍💻 Exercise: Convert the sleep code into a function
#' Import a participant's sleep data from DIME.
#'
#' @param file_path Path to the participant's sleep file.
#'
#' @return Outputs a data frame/tibble.
#'
import_sleep <- function(file_path) {
  sleep <- file_path |>
    read_csv(
      show_col_types = FALSE,
      name_repair = to_snake_case,
      n_max = 100
    )
  return(sleep)
}
Making robust and general-purpose functions
🧑‍💻 Exercise: Finish setting the dependencies
Console
usethis::use_package("readr")
usethis::use_package("snakecase")
usethis::use_package("fs")
docs/learning.qmd
import_cgm <- function(file_path) {
  cgm <- file_path |>
    readr::read_csv(
      show_col_types = FALSE,
      name_repair = snakecase::to_snake_case,
      n_max = 100
    )
  return(cgm)
}
Doing many things at once with functionals
🧑‍💻 Exercise: Convert into a function to use it on the sleep data
#' Import all DIME CSV files in a folder into one data frame.
#'
#' @param folder_path The path to the folder that has the CSV files.
#'
#' @return A single data frame/tibble.
#'
import_csv_files <- function(folder_path) {
  files <- folder_path |>
    fs::dir_ls(glob = "*.csv")
  data <- files |>
    purrr::map(import_dime) |>
    purrr::list_rbind(names_to = "file_path_id")
  return(data)
}
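The list-import-combine pattern used in import_csv_files() can be tried out on throwaway files. The sketch below is only an illustration: import_example() is a hypothetical stand-in for the workshop's import function, and the folder and file names are made up.

```r
library(fs)
library(purrr)
library(readr)

# Hypothetical stand-in for the workshop's import function.
import_example <- function(file_path) {
  read_csv(file_path, show_col_types = FALSE)
}

# Create a temporary folder with two small CSV files to import.
folder <- path(tempdir(), "csv-example")
dir_create(folder)
write_csv(data.frame(value = 1:2), path(folder, "101.csv"))
write_csv(data.frame(value = 3:5), path(folder, "102.csv"))

combined <- folder |>
  # List the CSV files; the result is named by file path.
  dir_ls(glob = "*.csv") |>
  # Import each file, giving a list of data frames named by file path.
  map(import_example) |>
  # Stack the list into one data frame, storing the names in a new column.
  list_rbind(names_to = "file_path_id")

combined
```

The file_path_id column records which file each row came from, which is what later makes it possible to extract the participant ID.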
Cleaning characters and dates
🧑‍💻 Exercise: Using NSE in your function
#' Prepare the date columns in DIME CGM and sleep data for joining.
#'
#' @param data The data that has the datetime column.
#' @param column The datetime column to convert to date and hour.
#'
#' @returns A tibble/data.frame
#'
prepare_dates <- function(data, column) {
  prepared_dates <- data |>
    dplyr::mutate(
      date = lubridate::as_date({{ column }}),
      hour = lubridate::hour({{ column }}),
      .before = {{ column }}
    )
  return(prepared_dates)
}
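As a quick check of how the embraced `{{ column }}` argument behaves, the sketch below calls prepare_dates() with a bare column name. The function is repeated so the example runs on its own, and the example data (column names and values) is made up for illustration.

```r
library(dplyr)
library(lubridate)
library(tibble)

# Same logic as the solution, repeated so the sketch runs by itself.
prepare_dates <- function(data, column) {
  data |>
    dplyr::mutate(
      date = lubridate::as_date({{ column }}),
      hour = lubridate::hour({{ column }}),
      .before = {{ column }}
    )
}

# Hypothetical example data with a datetime column.
example <- tibble(
  device_timestamp = ymd_hms(c("2021-03-18 08:15:00", "2021-03-18 23:45:00")),
  glucose = c(5.8, 6.1)
)

# The column name is passed bare (unquoted), as in other tidyverse functions.
prepared <- example |>
  prepare_dates(device_timestamp)

prepared
```

The new date and hour columns are placed directly before the datetime column because the same embraced argument is also given to `.before`.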
🧑‍💻 Exercise: Convert "get ID" code into a function
#' Get the participant ID from the file path column.
#'
#' @param data Data with `file_path_id` column.
#'
#' @return A data.frame/tibble.
#'
get_participant_id <- function(data) {
  data_with_id <- data |>
    dplyr::mutate(
      id = stringr::str_extract(
        file_path_id,
        "[:digit:]+\\.csv$"
      ) |>
        stringr::str_remove("\\.csv$") |>
        as.integer(),
      .before = file_path_id
    ) |>
    dplyr::select(-file_path_id)
  return(data_with_id)
}
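To see what the regular expression in get_participant_id() picks out, the sketch below applies the same two stringr steps to a few made-up file paths. The paths are hypothetical and only need to end in digits followed by ".csv"; the real DIME folder layout may differ.

```r
library(stringr)

# Hypothetical file paths; only the trailing "<digits>.csv" part matters.
file_paths <- c(
  "data-raw/dime/sleep/101.csv",
  "data-raw/dime/cgm/102.csv"
)

ids <- file_paths |>
  # Keep the digits plus the ".csv" ending, e.g. "101.csv".
  str_extract("[:digit:]+\\.csv$") |>
  # Drop the ".csv" ending, leaving only the digits as text.
  str_remove("\\.csv$") |>
  # Convert the text to an integer.
  as.integer()

ids
```

Anchoring the pattern with `$` means that digits appearing earlier in the path, such as in a folder name, are ignored.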
Using split-apply-combine to help in processing
🧑‍💻 Exercise: Create a clean_sleep() function
#' Clean and prepare the sleep data for joining.
#'
#' @param data The sleep dataset.
#'
#' @returns A cleaner data frame.
#'
clean_sleep <- function(data) {
  cleaned <- data |>
    get_participant_id() |>
    dplyr::rename(datetime = date) |>
    prepare_dates(datetime) |>
    summarise_column(seconds, list(sum = sum))
  return(cleaned)
}
#' Clean and prepare the CGM data for joining.
#'
#' @param data The CGM dataset.
#'
#' @returns A cleaner data frame.
#'
clean_cgm <- function(data) {
  cleaned <- data |>
    get_participant_id() |>
    prepare_dates(device_timestamp) |>
    dplyr::rename(glucose = historic_glucose_mmol_l) |>
    # You can decide what functions to summarise by.
    summarise_column(glucose, list(mean = mean, sd = sd))
  return(cleaned)
}
Pivoting your data from and to long or wide
🧑‍💻 Exercise: Create a new function to pivot sleep to wider
#' Convert the sleep types to wide format.
#'
#' @param data The cleaned DIME sleep data.
#'
#' @returns A data frame.
#'
sleep_types_to_wider <- function(data) {
  wider <- data |>
    tidyr::pivot_wider(
      names_from = sleep_type,
      names_prefix = "seconds_",
      values_from = seconds_sum
    )
  return(wider)
}
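The effect of this pivot can be seen on a tiny example. The sketch below uses made-up data: the sleep type names ("light", "deep") and the values are assumptions for illustration, not taken from the DIME data.

```r
library(tibble)
library(tidyr)

# Hypothetical cleaned sleep data in long format: one row per sleep type.
sleep_long <- tibble(
  id = c(101L, 101L),
  date = as.Date(c("2021-03-18", "2021-03-18")),
  hour = c(1L, 1L),
  sleep_type = c("light", "deep"),
  seconds_sum = c(1200, 600)
)

# Same pivot as in the solution: one column per sleep type.
sleep_wide <- sleep_long |>
  pivot_wider(
    names_from = sleep_type,
    names_prefix = "seconds_",
    values_from = seconds_sum
  )

sleep_wide
```

The two long rows collapse into a single row per id, date, and hour, with one seconds_ column for each sleep type.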