dplyr: find if a column value is substring of any item in a fixed list and mutate value

I started with a simple example that works, but I don't know how to use it within dplyr's mutate().

This works and returns "bcd-234":

library(tidyverse)
list = c("abc-123", 'bcd-234', 'cde-345', 'bcd-987')
s = 'bcd'
list[str_detect(list, s)][1]

But when I try the same thing inside mutate(), it fails:

df <- tibble(
  name = c('aaa', 'bbb', 'ccc', 'ddd', 'abc', 'bcd', 'cde')
)


df |> mutate(
  new_name = if_else(any(str_detect(list, name)), list[str_detect(list, name)][1], name)
  )

The error is:

! Can't recycle `string` (size 4) to match `pattern` (size 7).

Thanks

Answer

res <- df |>
  rowwise() |>
  mutate(
    new_name = if (any(str_detect(list, name))) list(list[str_detect(list, name)]) else list(name)
  )
> res
#   name  new_name 
#   <chr> <list>   
# 1 aaa   <chr [1]>
# 2 bbb   <chr [1]>
# 3 ccc   <chr [1]>
# 4 ddd   <chr [1]>
# 5 abc   <chr [1]>
# 6 bcd   <chr [2]>
# 7 cde   <chr [1]>

> as.data.frame(res)
  name         new_name
1  aaa              aaa
2  bbb              bbb
3  ccc              ccc
4  ddd              ddd
5  abc          abc-123
6  bcd bcd-234, bcd-987
7  cde          cde-345

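PS. Avoid using list as a variable name: it shadows the name of the base R function list(), which makes expressions such as list(list[...]) hard to read.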
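
The original call fails because str_detect() is vectorised over both of its arguments, so without rowwise() it tries to pair the 4 strings with all 7 values of name at once. If only the first match is wanted as a plain character column (as in the question's `[1]` example) rather than a list column, a minimal sketch along the same lines could look like this. The names `candidates` and `first_match` are made up for illustration, and wrapping `name` in `fixed()` assumes the names should be treated as literal text rather than regular expressions:

```r
library(dplyr)
library(stringr)

# Renamed from `list` to avoid clashing with base::list()
candidates <- c("abc-123", "bcd-234", "cde-345", "bcd-987")

df <- tibble(
  name = c("aaa", "bbb", "ccc", "ddd", "abc", "bcd", "cde")
)

res <- df |>
  rowwise() |>                       # evaluate one `name` at a time
  mutate(
    # First candidate containing `name`; indexing an empty result with [1]
    # yields NA, which coalesce() then replaces with the original name.
    first_match = candidates[str_detect(candidates, fixed(name))][1],
    new_name = coalesce(first_match, name)
  ) |>
  ungroup() |>
  select(-first_match)

res
```

Here new_name stays an ordinary character column, so "bcd" maps to "bcd-234" only and the second match "bcd-987" is dropped. If all matches are needed, the list-column version above is the one to use.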