Appendix A — Extra material

A.1 Debugging functions

Debugging is one of activities that seems really scary and difficult, but once you try it and use it, is not nearly as intimidating as it seemed. To debug, which means to find and fix problems in your code, there are several ways, the simplest of which is by inserting the browser() function into the the start of your function, re-running the function by either manually running it or using source with Ctrl-Shift-S or with the Palette (Ctrl-Shift-P, then type “source”), and using it again.

For instance, we have a function like this:

test_debugging <- function(number) {
  new_number <- number + number
  return(new_number)
}
test_debugging(5)

[1] 10

To start the debugger, insert the browser() function into your function:

test_debugging <- function(number) {
  browser()
  new_number <- number + number
  return(new_number)
}
test_debugging(5)

And re-run and use the function again, which will pop up a new debugging panel in RStudio. Sadly, we can’t show this on the website since it only works in RStudio. When you are in RStudio, the debugger will open up and it will show a few things:

A yellow line will highlight the code in the function, along with a green arrow on the left of the line number.
The Console will now start with Browse[1]> and will have text like debug at ....
There will be new buttons on the top of the Console like “Next”, “Continue”, and “Stop”.
The Environment pane will be empty and will say “Traceback”.

In this mode you can really investigate what is happening with your code and how to fix it. The way to figure out what’s wrong is by running the code bit by bit. This debug environment is empty except for the actions that occur within it, so it really can help figure things out.

A.2 Functionals and for loops

The concept of functional programming can be difficult to grasp. As an alternative to the above explanation, one could also explain functional programming in relation to what it enables. For example, imagine that you have a vector and would like to investigate whether any given number inside this vector is above 5. This can be accomplished using a variety of programming styles. First, we could create a function to check whether a given number is above 5, and manually check each element of the vector:

# Create vector of 10 numbers
numbers <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

# Function to check if a number is above 5
over_five <- function(number) {
  if (number > 5) {
    return(TRUE)
  } else {
    return(FALSE)
  }
}

# Check each element
over_five(numbers[1])

[1] FALSE

over_five(numbers[2])

[1] FALSE

over_five(numbers[3])

[1] FALSE

over_five(numbers[4])

[1] FALSE

over_five(numbers[5])

[1] FALSE

over_five(numbers[6])

[1] TRUE

over_five(numbers[7])

[1] TRUE

over_five(numbers[8])

[1] TRUE

over_five(numbers[9])

[1] TRUE

over_five(numbers[10])

[1] TRUE

Such an approach is verbose, not “functional” in style, and very inflexible. Imagine if the vector had contained 100 numbers instead of 10. In such a case, it would have been unreasonable to manually check each element. As an alternative we could accomplish the same by using a for loop:

# Initialize a vector to capture the output in the for loop
output <- vector("logical", length = length(numbers))

# Use seq_along to get for loop to start from 1 and end at the end of numbers
for (item in seq_along(numbers)) {
  output[item] <- over_five(numbers[item])
}
output

 [1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE

This for loop is taking each item from numbers, using over_five() on it, and then saving the output into the object output. So this loop allows us to do the same as we did manually, but more precisely and with less code. If the vector has more items or less, it doesn’t matter to the for loop since it will run no matter the length. You might notice already that there are several technical things going on, like the use of vector(), seq_along(), and output[item]. We won’t explain them here, but this is meant to highlight that loops aren’t easy to actually use properly.

For some situations, for loops are the perfect solution. However, they are not R’s strength, but rather functional programming is. In this case, we replace the for loop with the functional map(), which would make the code shorter and more robust.

library(tidyverse)

── Attaching core tidyverse packages ──────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

# Functional
map(numbers, over_five)

[[1]]
[1] FALSE

[[2]]
[1] FALSE

[[3]]
[1] FALSE

[[4]]
[1] FALSE

[[5]]
[1] FALSE

[[6]]
[1] TRUE

[[7]]
[1] TRUE

[[8]]
[1] TRUE

[[9]]
[1] TRUE

[[10]]
[1] TRUE

This works great as is, but sometimes you might not need to create a new function to use inside of map(). Instead we could use an anonymous function with \() (or either function() or ~).

# Anonymous function
map(numbers, \(x) if (x > 5) TRUE else FALSE)

[[1]]
[1] FALSE

[[2]]
[1] FALSE

[[3]]
[1] FALSE

[[4]]
[1] FALSE

[[5]]
[1] FALSE

[[6]]
[1] TRUE

[[7]]
[1] TRUE

[[8]]
[1] TRUE

[[9]]
[1] TRUE

[[10]]
[1] TRUE