Very often you will need to write a process or function that can
flexibly handle incoming data and can manage variables that may or may
not exist. One way would be to write a bunch of
if(exists("variable", dataset)){ } type structures, where the
operations are in the curly brackets. There are better approaches.
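For comparison, the conditional approach described above might look something like the sketch below. It is only an illustration: it checks column names with %in% names() rather than exists(), which is a substitution made here for clarity, and the object name dataset is a placeholder.

```r
library(dplyr)

dataset <- iris

# Rename Species to Type only if the column is present
if ("Species" %in% names(dataset)) {
  dataset <- rename(dataset, Type = Species)
}

# Scent is not a column of iris, so this block is skipped
if ("Scent" %in% names(dataset)) {
  dataset <- rename(dataset, Smell = Scent)
}

names(dataset)
```

Every optional variable needs its own if block, which quickly becomes repetitive as the number of possible variables grows.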
We can use the rename() function in combination with the any_of()
selection helper function. The new-name and old-name pairs are passed
as a named character vector (new name = old name) to the any_of()
selection helper within rename(). An example is shown below. The
variable Species does exist and is renamed to Type. The variable Scent
does not exist, so nothing is renamed to Smell and no error is raised.
library(tidyverse)
new_dataset <- iris %>%
  rename(any_of(c("Type" = "Species", "Smell" = "Scent")))
head(new_dataset)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width   Type
## 1          5.1         3.5          1.4         0.2 setosa
## 2          4.9         3.0          1.4         0.2 setosa
## 3          4.7         3.2          1.3         0.2 setosa
## 4          4.6         3.1          1.5         0.2 setosa
## 5          5.0         3.6          1.4         0.2 setosa
## 6          5.4         3.9          1.7         0.4 setosa
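Since the goal is a function that copes with whatever data arrives, the same idea can be wrapped in a small helper. The sketch below is one possible arrangement, not a definitive implementation: the names lookup and standardise_names are hypothetical and chosen only for illustration. It also contrasts any_of() with the stricter all_of() helper, which raises an error when a name is missing.

```r
library(dplyr)

# Hypothetical lookup of new_name = old_name pairs
lookup <- c(Type = "Species", Smell = "Scent")

# Hypothetical helper: renames whichever old names happen to be present
standardise_names <- function(data, lookup) {
  data %>% rename(any_of(lookup))
}

# iris has Species but not Scent, so only Species is renamed
iris_names <- names(standardise_names(iris, lookup))

# mtcars has neither column, so it comes back unchanged
mtcars_names <- names(standardise_names(mtcars, lookup))

# all_of() is strict: a missing column (Scent) triggers an error,
# which is caught here so the script keeps running
strict_result <- tryCatch(
  rename(iris, all_of(lookup)),
  error = function(e) "all_of() raised an error"
)
```

Keeping the name pairs in a single vector means the mapping can be extended in one place without touching the function body.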
Again we make use of the selection helper function any_of(), this time
inside the function across(), which itself is inside the function
arrange(). Again the variable names (which may or may not exist) are
passed as a character vector. In the example below, the variable cyl
does exist, while the variable emissions does not, so the data are
sorted by cyl alone.
library(tidyverse)
new_dataset <- mpg %>%
  arrange(across(any_of(c("cyl", "emissions"))))
head(new_dataset)
## # A tibble: 6 × 11
##   manufacturer model      displ  year   cyl trans  drv     cty   hwy fl    class
##   <chr>        <chr>      <dbl> <int> <int> <chr>  <chr> <int> <int> <chr> <chr>
## 1 audi         a4           1.8  1999     4 auto(… f        18    29 p     comp…
## 2 audi         a4           1.8  1999     4 manua… f        21    29 p     comp…
## 3 audi         a4           2    2008     4 manua… f        20    31 p     comp…
## 4 audi         a4           2    2008     4 auto(… f        21    30 p     comp…
## 5 audi         a4 quattro   1.8  1999     4 manua… 4        18    26 p     comp…
## 6 audi         a4 quattro   1.8  1999     4 auto(… 4        16    25 p     comp…
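As an aside, dplyr 1.1.0 and later also provide pick() for selecting columns inside verbs such as arrange(). The sketch below assumes such a version is installed and compares it with the across() form used above. The object names sort_cols, sorted_across and sorted_pick are placeholders introduced here.

```r
library(dplyr)
library(ggplot2)  # provides the mpg dataset

# emissions is not a column of mpg, so only cyl is used for sorting
sort_cols <- c("cyl", "emissions")

# The across() form used in the example above
sorted_across <- mpg %>%
  arrange(across(any_of(sort_cols)))

# The pick() form (assumes dplyr >= 1.1.0)
sorted_pick <- mpg %>%
  arrange(pick(any_of(sort_cols)))

# Compare the two results
all.equal(sorted_across, sorted_pick)
```

Storing the candidate column names in a vector such as sort_cols makes it easy to reuse the same sort specification across datasets that do not all share the same columns.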