Exercise 2: Test your knowledge

After working through Exercise 2, you’ll…

have assessed how well you know dplyr
know which dplyr functions and concepts you might want to review
have managed to apply the dplyr concepts to data

Task 1

Below you will see multiple-choice questions. Please try to identify the correct answers. Each question can have one, two, three, or four correct answers.

1. What are the main characteristics of tidy data?

Every cell contains values.
Every cell contains a variable.
Every observation is a column.
Every observation is a row.

2. Which of the following are dplyr functions?

summary()
describe()
mutate()
manage()

3. How can you sort the eye_color of Star Wars characters from Z to A?

starwars_data %>% arrange(desc(eye_color))
starwars_data %>% arrange(eye_color)
starwars_data %>% select(arrange(eye_color))
starwars_data %>% select(eye_color) %>% arrange(desc(eye_color))

4. Imagine you want to recode the height of these characters into three categories: small, medium, and tall. What is a valid approach?

starwars_data %>% mutate(height = case_when(height<=150~"small",height<=190~"medium",height>190~"tall"))
starwars_data %>% mutate(height = case_when(height<=150~small,height<=190~medium,height>190~tall))
starwars_data %>% recode(height = case_when(height<=150~"small",height<=190~"medium",height>190~"tall"))
starwars_data %>% recode(height = case_when(height<=150~small,height<=190~medium,height>190~tall))

5. Imagine you want to provide a systematic overview of all hair colors and of which species frequently have each hair color (not accounting for the skewed sampling of species). What is a valid approach?

starwars_data %>% group_by(hair_color) %>% group_by(species) %>% summarize(count = n()) %>% arrange(hair_color)
starwars_data %>% group_by(hair_color, species) %>% summarize(count = n()) %>% arrange(hair_color)
starwars_data %>% group_by(hair_color & species) %>% summarize(count = n()) %>% arrange(hair_color)
starwars_data %>% group_by(hair_color + species) %>% summarize(count = n()) %>% arrange(hair_color)

Task 2

It’s your turn now. Load the starwars data like this:

library(dplyr)              # load the dplyr package
starwars_data <- starwars   # assign the starwars data set (shipped with dplyr) to an object in our environment

How many humans are contained in the starwars data overall? (Hint: use summarize(count = n()) or count())
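If the two functions named in the hint are unfamiliar, the following minimal sketch shows how they relate. It deliberately uses a small made-up data frame (pets, with invented names and species) rather than the starwars data, so it illustrates the pattern without solving the task.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration (not part of starwars)
pets <- tibble(
  name    = c("Rex", "Tom", "Bello", "Nemo"),
  species = c("dog", "cat", "dog", "fish")
)

# Variant 1: keep the rows of interest, then count them with summarize()
dogs_summarize <- pets %>%
  filter(species == "dog") %>%
  summarize(count = n())

# Variant 2: count() does the counting in one step;
# without arguments it stores the result in a column called n
dogs_count <- pets %>%
  filter(species == "dog") %>%
  count()

print(dogs_summarize)
print(dogs_count)
```

Both variants return a one-row data frame with the same number. Only the column name differs (count versus n).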

Task 3

How many humans are contained in starwars by gender?

Task 4

What is the most common eye_color among Star Wars characters? (Hint: use arrange())

Task 5

What is the average mass of Star Wars characters that are not human and have yellow eyes? (Hint: remove all NAs)
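The hint about NAs matters because a single missing value turns a mean into NA. The sketch below shows two common ways to handle this, again on a made-up data frame (weights, with invented values) instead of the starwars data. Which option you prefer is a matter of taste.

```r
library(dplyr)

# Hypothetical toy data with one missing value (invented for illustration)
weights <- tibble(
  group  = c("a", "a", "b", "b"),
  weight = c(10, NA, 20, 40)
)

# Without any NA handling, the result is NA
mean_with_na <- weights %>%
  summarize(mean_weight = mean(weight))

# Option 1: tell the summary function to skip missing values
mean_narm <- weights %>%
  summarize(mean_weight = mean(weight, na.rm = TRUE))

# Option 2: drop the rows with missing values before summarizing
mean_filtered <- weights %>%
  filter(!is.na(weight)) %>%
  summarize(mean_weight = mean(weight))

print(mean_with_na)
print(mean_narm)
print(mean_filtered)
```

Options 1 and 2 give the same mean here. The na.rm = TRUE argument works the same way for median() and sd().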

Task 6

Compare the mean, median, and standard deviation of mass for all humans and droids. (Hint: remove all NAs)

Task 7

Create a new variable that stores the mass in grams (gr_mass) and add it to the data frame. Test whether your solution works by printing your data to the console, but only show the name, species, mass, and your new variable gr_mass.

When you’re ready to look at the solutions, you can find them here: Solutions for Exercise 2.

Are you ready for some beautiful graphs? Then check out the next Tutorial: Data visualization with ggplot.