vector output for if_else or alternative

Time:12-15

I'm struggling to find an answer to the following problem.

I want to search a column in a data.frame by a vector. Upon finding a match I then wish to utilise the element of the 'search vector' to create a new element of a new column. See the reproducible example below please.

library(dplyr)    # mutate(), if_else(), %>%
library(stringr)  # str_detect()

colour <- c('red', 'yellow')

a <- c('violet', 'red', 'taupe', 'blue', 'yellow_a', 'yellow_b', 'blue_a', 'red_c')
b <- c('non', 'prim', 'non', 'prim', 'prim', 'prim', 'prim', 'prim')
c <- c(1, 2, 3, 4, 5, 6, 7, 8)

df <- data.frame(a, b, c)

I've tried the following:

df_clean <- df %>% mutate(d = if_else(str_detect(a, colour), colour, NA_character_))

Looking at the help files, I can't get 'if_else' to return a result of length greater than one. I'm receiving the following error:

Error: Problem with mutate() column d. ℹ d = if_else(rep(str_detect(a, colour), length(colour)), colour, NA_character_). x true must be length 16 (length of condition) or one, not 2.

I'm looking to achieve:

a <- c('violet', 'red', 'taupe', 'blue', 'yellow_a', 'yellow_b', 'blue_a', 'red_c')
b <- c('non', 'prim', 'non', 'prim', 'prim', 'prim', 'prim', 'prim')
c <- c(1, 2, 3, 4, 5, 6, 7, 8)
d <- c(NA_character_, 'red', NA_character_, NA_character_, 'yellow', 'yellow', NA_character_, 'red')

df_clean <- data.frame(a, b, c, d)

If you could help me fix this or find an alternative solution, I would be most grateful. I'm unable to bridge the gap. Am I missing something obvious?

Any help would be greatly appreciated!

Many Thanks

CodePudding user response:

A potential solution using str_extract from the stringr package:

library(dplyr)    # mutate()
library(stringr)  # str_extract()

colour <- c('red', 'yellow')

a <- c('violet', 'red', 'taupe', 'blue', 'yellow_a', 'yellow_b', 'blue_a', 'red_c')
b <- c('non', 'prim', 'non', 'prim', 'prim', 'prim', 'prim', 'prim')
c <- c(1, 2, 3, 4, 5, 6, 7, 8)

df <- data.frame(a, b, c)

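# Join the colours into a single regex alternation: "red|yellow"
colour_str <- paste(colour, collapse='|')

# str_extract() returns the first match in each element, or NA if there is none.
# The native pipe |> needs R 4.1 or later; %>% works the same way here.
df |>
  mutate(d = str_extract(a, colour_str))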

Output:

         a    b c      d
1   violet  non 1   <NA>
2      red prim 2    red
3    taupe  non 3   <NA>
4     blue prim 4   <NA>
5 yellow_a prim 5 yellow
6 yellow_b prim 6 yellow
7   blue_a prim 7   <NA>
8    red_c prim 8    red
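
One caveat with the joined pattern is that it matches the colour anywhere in the string, so a value such as 'bored' would also yield 'red'. A minimal sketch of a stricter variant follows, assuming the colour always sits at the start of the value; the 'bored' entry and the name `anchored` are hypothetical and only there for illustration.

```r
library(stringr)

colour <- c('red', 'yellow')
# 'bored' is a made-up value that contains 'red' without being a red colour
a <- c('violet', 'red', 'bored', 'yellow_a', 'red_c')

# Assumption: the colour is always a prefix, so anchor the alternation with ^
anchored <- paste0('^(', paste(colour, collapse = '|'), ')')

d <- str_extract(a, anchored)
d
# NA "red" NA "yellow" "red"
```

If the colours can appear elsewhere in the string, the unanchored pattern from the answer above is the one to keep.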

CodePudding user response:

You were very close to the solution:

library(stringr)  # str_detect(), str_extract()

a <- c('violet', 'red', 'taupe', 'blue', 'yellow_a', 'yellow_b', 'blue_a', 'red_c')
b <- c('non', 'prim', 'non', 'prim', 'prim', 'prim', 'prim', 'prim')
c <- c(1, 2, 3, 4, 5, 6, 7, 8)
d <- c(NA_character_, 'red', NA_character_, NA_character_, 'yellow', 'yellow', NA_character_, 'red')

my_df <- data.frame(a, b, c, d)
pattern <- "red|yellow"
# str_detect() already returns a logical vector, so `== TRUE` is not needed;
# NA_character_ keeps the type of the new column explicit.
my_df$test <- ifelse(str_detect(my_df$a, pattern),
                     str_extract(my_df$a, pattern),
                     NA_character_)
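
If the search vector should be treated as literal strings rather than joined into a regular expression (for example, if an entry could contain characters such as '.' or '|'), one possible alternative is to test each colour separately and pick the first one that matches. This is only a base R sketch under that assumption; the names `hits` and `first_hit` are introduced here for illustration.

```r
colour <- c('red', 'yellow')
a <- c('violet', 'red', 'taupe', 'blue', 'yellow_a', 'yellow_b', 'blue_a', 'red_c')

# One logical column per colour: TRUE where that colour occurs in `a`.
# fixed = TRUE treats each colour as a literal string, not a regex.
hits <- sapply(colour, function(p) grepl(p, a, fixed = TRUE))

# For each row, the index of the first matching colour (NA if none match)
first_hit <- apply(hits, 1, function(row) which(row)[1])

# Indexing with NA yields NA, so unmatched rows stay missing
d <- colour[first_hit]
d
# NA "red" NA NA "yellow" "yellow" NA "red"
```

When an element matches more than one colour, this sketch keeps whichever comes first in `colour`, whereas str_extract keeps whichever appears first in the string.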