Spring 2018 R Open Lab: Apply Family

Apr 11

This week's topic is the apply family in R. Recall that we covered loops as one of the basic concepts at the very beginning; you can review them in the Starter Kit and the lab featuring More Fundamentals. Although a loop is conceptually simple and intuitive, in R it is often verbose and can be inefficient. The apply family comes in handy in these cases. In this lab, we will cover apply, lapply, sapply, mapply, tapply, and sweep. Here is the code for this lab:

# apply: better than loops!
m <- matrix(1:9, 3, 3, byrow = TRUE)
for (i in 1:3) {
  print(mean(m[i, ]))
}

apply(m, 1, mean)
apply(m, 2, mean)

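sos <- function(x, y) {
  # NOTE: the original function body is missing from this post.
  # Assumed here: sum of squared deviations of x from y.
  sum((x - y)^2)
}
apply(m, 1, sos, y = 3)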

library(ggplot2) # provides the data frame `diamonds`
apply(diamonds[, 2:4], 2, table) # tabulate columns 2 to 4 (cut, color, clarity)

mu <- apply(m, 2, mean) # column means; must be defined before sweep uses it
sweep(m, 2, mu, "*")
sweep(m, 2, mu, FUN = "-")


# lapply and sapply
# a matrix is treated as a vector of its 9 elements, so sos is called once per element
lapply(m, sos, y = 3)
l <- list(c(1, 2, 3), 4, 5, m)
lapply(l, sos, 3)
sapply(l, sos, 3)

lapply(1:10, function(x) x^2)
sapply(1:10, function(x) x^2, simplify = FALSE)
unlist(lapply(1:10, function(x) x^2))
sapply(1:10, function(x) x^2)


# mapply
mapply(rep, 1:4, 4:1)
mapply(rep, 2:9, 4)


# tapply
s <- c(10:19, 2:5, 3:15)
i <- factor(c(rep(1, 10), rep(2, 4), rep(3, 13)))
tapply(s, i, sum)
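
To connect the pieces of the listing, the short sketch below checks, on the same 3 x 3 matrix, that apply reproduces the explicit loop and that sweep with "-" centers each column. The names loop_means, apply_means, and centered are introduced only for this illustration.

```r
m <- matrix(1:9, 3, 3, byrow = TRUE)

# Row means with an explicit loop, storing the results instead of printing them
loop_means <- numeric(nrow(m))
for (i in seq_len(nrow(m))) {
  loop_means[i] <- mean(m[i, ])
}

# The same row means in a single call
apply_means <- apply(m, 1, mean)
all.equal(loop_means, apply_means)

# Subtract each column's mean from that column
mu <- apply(m, 2, mean)
centered <- sweep(m, 2, mu, FUN = "-")
apply(centered, 2, mean) # every column mean is now 0
```

Storing the loop results in a preallocated vector makes the two approaches directly comparable.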

Here are a few practice problems you can try by yourself (all of them require the data frame diamonds, defined in the package ggplot2):

Task 1: Find the color and clarity of the five entries with the largest price, using the apply family.

Task 2: Compute the leave-one-out mean for carat and find which observation has the greatest leave-one-out mean.

Task 3: Compute the mean and standard deviation for each group of cut.
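
The patterns behind Tasks 2 and 3 can be tried on toy data before moving to diamonds. The leave-one-out mean of observation i is the mean of all the other observations, which equals (sum(x) - x[i]) / (n - 1). The sketch below is one possible approach rather than the intended solution; the vectors x, values, and group are made up for illustration.

```r
# Toy data (hypothetical, not taken from diamonds)
x <- c(2, 4, 6, 8)

# Leave-one-out mean: for each index i, average everything except x[i]
loo <- sapply(seq_along(x), function(i) mean(x[-i]))
loo

# Equivalent closed form, which avoids recomputing the mean n times
loo_fast <- (sum(x) - x) / (length(x) - 1)
all.equal(loo, loo_fast)

# Index of the observation with the largest leave-one-out mean
which.max(loo)

# Grouped summaries: mean and standard deviation within each group
values <- c(1, 3, 5, 10, 20)
group <- factor(c("a", "a", "a", "b", "b"))
tapply(values, group, mean)
tapply(values, group, sd)
```

On a data frame as large as diamonds, the closed form is likely the more practical choice, since the sapply version recomputes a full mean for every observation.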

Thank you all for showing up. If you have further questions about the topics covered in this material, please feel free to drop by during next week’s lab, email me, or leave a comment.

See you all next week!
