Loops: How can I loop case_when function in R?

Time:04-15

Here is my code. I am trying to create a new variable by detecting known values inside the address string. I use mutate from the dplyr package in combination with case_when. The problem is that I add each value by hand, as you can see. How can I automate this with a loop, so that every value in a vector is checked against the address?

city <- LETTERS                        # 26 cities
district <- letters[1:11]              # 11 districts, "a" to "k", so the "a" and "b" in the addresses can match
streets <- paste0(district, district)
streets <- streets[1:4]                # 4 streets: "aa", "bb", "cc", "dd"

df <- data.frame(x = c(1:5), 
           address = c("A, b, cc,", "B, dd", "a, dd", "C", "D, a, cc"))

library(dplyr)
library(stringi)

df2 <- df %>%
  mutate(districts = case_when(
    stri_detect_fixed(address, "b") ~ "b",   #address[1]
                                             #address[2]
    stri_detect_fixed(address, "a") ~ "a",   #address[3]
                                             #address[4]
    stri_detect_fixed(address, "cc") ~ "cc"  #address[5]
))

The code scans address for values from the district vector, which are currently hard-coded. I would like to do the same for the city and street variables. I tried a modified version of code from another Stack Overflow question, but it produces an error:

for (j in town_village2) {
trn_house3[,93] <- case_when(
      stri_detect_fixed(trn_house3[1:6469, 4], j) ~ j)
}
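Apart from the error itself, a loop of this shape overwrites the whole column on every pass, so at best only the last value of j would survive. A minimal sketch in base R of a loop that fills only the rows that are still unmatched, shown on the example df rather than on trn_house3 (whose structure is not given). Here grepl(fixed = TRUE) stands in for stri_detect_fixed, and detect_first() is a hypothetical helper name, not part of any package:

```r
df <- data.frame(
  x = 1:5,
  address = c("A, b, cc,", "B, dd", "a, dd", "C", "D, a, cc")
)

# Hypothetical helper: for each string, return the first pattern (in the
# order given) that occurs in it as a fixed substring, or NA if none does.
detect_first <- function(strings, patterns) {
  out <- rep(NA_character_, length(strings))
  for (p in patterns) {
    # only fill rows that are still unmatched, so earlier patterns win,
    # which mirrors the top-to-bottom evaluation of case_when
    hit <- is.na(out) & grepl(p, strings, fixed = TRUE)
    out[hit] <- p
  }
  out
}

# Same patterns, in the same order, as the manual case_when above
df$districts <- detect_first(df$address, c("b", "a", "cc"))
df$districts
```

Note that substring detection is order-sensitive and also matches inside longer values (a pattern "c" would be found inside "cc"), so passing a full vector of single letters this way can give false hits.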

I seek to produce this result:

x    address      city     district   street
1    A, b, cc,      A        b          cc  
2    B, dd          B        NA         dd
3    a, dd          NA       a          dd
4    C              C        NA         NA
5    D, a, cc       D        a          cc

CodePudding user response:

This will separate the elements into vectors:

library(tidyverse)

df <- data.frame(
  x = c(1:5),
  address = c("A, b, cc,", "B, dd", "a, dd", "C", "D, a, cc")
)

df3 <-
  df %>%
  separate_rows(address, sep = "[, ]+") %>%
  filter(nchar(address) > 0) %>%
  nest(data = address) %>%
  transmute(x, districts = data %>% map(~ .x[[1]]))
df3
#> # A tibble: 5 × 2
#>       x districts
#>   <int> <list>   
#> 1     1 <chr [3]>
#> 2     2 <chr [2]>
#> 3     3 <chr [2]>
#> 4     4 <chr [1]>
#> 5     5 <chr [3]>
df3$districts[[1]]
#> [1] "A"  "b"  "cc"

Created on 2022-04-14 by the reprex package (v2.0.0)
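To get from per-address token vectors to the three columns the question asks for, one option is to compare each token with the lookup vectors by exact match. A minimal sketch in base R, under the assumption that the lookup vectors actually contain the tokens used in the example addresses (the values below are chosen for that purpose). first_match() is a hypothetical helper name:

```r
# Assumed lookup vectors, chosen so they cover the tokens in the example addresses
city     <- LETTERS
district <- letters[1:11]
streets  <- c("aa", "bb", "cc", "dd")

df <- data.frame(
  x = 1:5,
  address = c("A, b, cc,", "B, dd", "a, dd", "C", "D, a, cc")
)

# Split every address on commas, trim whitespace, drop empty tokens
tokens <- lapply(strsplit(df$address, ",", fixed = TRUE), function(tk) {
  tk <- trimws(tk)
  tk[nzchar(tk)]
})

# Hypothetical helper: first token of each address that is an exact,
# case-sensitive member of `candidates`, or NA if there is none
first_match <- function(tokens, candidates) {
  vapply(tokens, function(tk) {
    hit <- tk[tk %in% candidates]
    if (length(hit) > 0) hit[1] else NA_character_
  }, character(1))
}

df$city     <- first_match(tokens, city)
df$district <- first_match(tokens, district)
df$street   <- first_match(tokens, streets)
df
```

Because whole tokens are compared rather than substrings, "a" is not confused with "A", and "c" would not be found inside "cc".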

CodePudding user response:

A data.table approach:

library(data.table)
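# create a long lookup table with all elements; the vectors differ in length,
# so they are stacked directly instead of melting data.table(city, streets, district)
types <- c("city", "streets", "district")
lookup <- data.table(
  variable = factor(rep(types, lengths(list(city, streets, district))), levels = types),
  value = c(city, streets, district)
)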
# set df to data.table format
setDT(df)
final <- df[, .(address = unlist(tstrsplit(address, ",[ ]*", perl = TRUE))), by = .(x)]
# now add elements
final[lookup, type := i.variable, on = .(address = value)]
# and dcast to wide
dcast(final, x ~ type, value.var = "address")
#    x city streets district
# 1: 1    A      cc        b
# 2: 2    B      dd     <NA>
# 3: 3 <NA>      dd        a
# 4: 4    C    <NA>     <NA>
# 5: 5    D      cc        a