Difference between if_any(any_of(vars)) and if_any(all_of(vars))

Ethan Jackson

Take the following MWE:

library(dplyr)

df <- data.frame(a=c(TRUE, TRUE, FALSE), b=c(FALSE, TRUE, FALSE))
myvars <- c("a","b")

The aim is to build a column c which is row-wise TRUE if one or both of a and b are TRUE.

The list of variables to use must be held in the character vector myvars.

With

df %>% mutate(c=if_any(myvars))

I get:

! Using an external vector in selections was deprecated in tidyselect 1.1.0.
Please use `all_of()` or `any_of()` instead.
  # Was:
  data %>% select(myvars)

  # Now:
  data %>% select(all_of(myvars))

See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.

Given that, I have to use one of:

df %>% mutate(c=if_any(any_of(myvars)))
df %>% mutate(c=if_any(all_of(myvars)))

But I don't understand the difference between those two.

It seems to me that if_any should already imply any_of.

What is the difference between the two?

Answer

The difference is whether or not those columns are required to be in the data frame. all_of() will throw an error if you include any columns that aren't in the data frame. any_of() will happily work with whatever columns happen to be present in the data frame, ignoring any that aren't.
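To address the idea that if_any should imply any_of: the two "any"s act on different things. if_any() tests the values in each row, while any_of() and all_of() only decide which columns are selected. A minimal sketch using the question's df and myvars (the names any_true and all_true are made up for illustration):

```r
library(dplyr)

df <- data.frame(a = c(TRUE, TRUE, FALSE), b = c(FALSE, TRUE, FALSE))
myvars <- c("a", "b")

# Same column selection (any_of), different row-wise test.
any_true <- df %>% mutate(c = if_any(any_of(myvars)))  # a OR b
all_true <- df %>% mutate(c = if_all(any_of(myvars)))  # a AND b

any_true$c  # TRUE TRUE FALSE
all_true$c  # FALSE TRUE FALSE

# Same row-wise test (if_any), different selection helper.
# Both columns exist here, so all_of() gives the same result as any_of().
strict <- df %>% mutate(c = if_any(all_of(myvars)))
strict$c  # TRUE TRUE FALSE
```

The selection helpers only start to differ once a name in myvars is missing from the data frame.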

For code safety, prefer all_of() whenever you can. any_of() is very useful when, for example, you're writing scripts that run on similar inputs with slightly different columns.
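As a rough sketch of the "similar inputs, slightly different columns" case, the hypothetical helper flag_rows() below applies the same variable list to two made-up data frames, one of which lacks column b. The names df1, df2 and flag_rows are assumptions for illustration only.

```r
library(dplyr)

myvars <- c("a", "b")

# Two illustrative inputs: df2 has no "b" column.
df1 <- data.frame(a = c(TRUE, FALSE), b = c(FALSE, FALSE))
df2 <- data.frame(a = c(FALSE, TRUE), z = c(1, 2))

# Hypothetical helper: flag rows where any of the available columns is TRUE.
flag_rows <- function(data, vars) {
  data %>% mutate(c = if_any(any_of(vars)))
}

res1 <- flag_rows(df1, myvars)
res2 <- flag_rows(df2, myvars)

res1$c  # TRUE FALSE
res2$c  # FALSE TRUE  (only "a" was available)
```

With all_of() in place of any_of(), the call on df2 would stop with an error, which is the safer behavior when a missing column signals a real problem in the input.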

To illustrate, let's add the non-existent "x" column to myvars:

## any_of still works
df %>% mutate(c=if_any(any_of(c(myvars, "x"))))
#       a     b     c
# 1  TRUE FALSE  TRUE
# 2  TRUE  TRUE  TRUE
# 3 FALSE FALSE FALSE

## all_of throws an error: Element `x` doesn't exist.
df %>% mutate(c=if_any(all_of(c(myvars, "x"))))
# Error in `mutate()`:
# ℹ In argument: `c = if_any(all_of(c(myvars, "x")))`.
# Caused by error in `if_any()`:
# ℹ In argument: `all_of(c(myvars, "x"))`.
# Caused by error in `all_of()`:
# ! Can't subset elements that don't exist.
# ✖ Element `x` doesn't exist.

This is explained on the shared help page, accessible via ?any_of or ?all_of:

  • all_of() is for strict selection. If any of the variables in the character vector is missing, an error is thrown.

  • any_of() doesn't check for missing variables. It is especially useful with negative selections, when you would like to make sure a variable is removed.
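The negative-selection case mentioned in the help text can be sketched as follows. The vector drop_vars is a made-up example; "x" is deliberately absent from df.

```r
library(dplyr)

df <- data.frame(a = c(TRUE, TRUE, FALSE), b = c(FALSE, TRUE, FALSE))

# Hypothetical list of columns to remove; "x" does not exist in df.
drop_vars <- c("b", "x")

# any_of() ignores the missing "x" and removes only "b".
kept <- df %>% select(-any_of(drop_vars))
names(kept)  # "a"

# all_of() is strict, so the same negative selection fails on "x".
strict <- tryCatch(
  df %>% select(-all_of(drop_vars)),
  error = function(e) "error"
)
strict  # "error"
```

Here any_of() guarantees that the listed columns are gone afterwards, whether or not they were present to begin with.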
