Why is this conditional statement in R dplyr generating an error message when it is generating the c

Time:09-16

Please refer to Mikko Marttila's answer below, which highlights the core issue with a better example. You don't need to waste your time going through all of this OP gibberish.

I am working on a function with a for-loop and have broken it down into steps, as this is my first ever for-loop.

The first section of code below, shown at the very bottom, generates a data frame called nCode and it is fine and produces no errors (leave the for-loop at i in 1:1 !!, just run the code without changes).

But when I run this second bit of code, simulating the beginning of the 2nd loop run on the nCode data frame, it outputs fine but I get the message "Problem with `mutate()` column `concat_2`. ℹ `concat_2 = ifelse(...)`. ℹ NAs introduced by coercion". I can't see what's wrong with the ifelse(); it looks legit to me. Here's that second bit of code (to run after the first section of code is run):

i = 2
reSeq_prior <- str_c("reSeq_",i-1)
concat_col <- str_c("concat_",i)

nCode <- if(i==1){
  nCode %>% mutate(!! concat_col:= as.numeric(paste0(seqBase,".",grpRnk)))} else {
    nCode %>% mutate(!! concat_col:= ifelse(
      !!rlang::sym(reSeq_prior)%%1 > 0, 
      !!rlang::sym(reSeq_prior),
      as.numeric(paste0(!!rlang::sym(reSeq_prior),".",grpRnk))
      )
    )
  }
nCode

Here's the good output I get when running these two sections of code (I am resisting my urge to use suppressWarnings(), I'd rather understand the problem):

> nCode
# A tibble: 15 x 10
   Name  Group nmCnt seqBase grpRnk concat_1 alloc_1 merge_1 reSeq_1 concat_2
   <chr> <dbl> <int>   <int>  <dbl>    <dbl>   <dbl>   <dbl>   <dbl>    <dbl>
 1 R         0     1       1      0      1       1       1       1        1  
 2 R         0     2       2      0      2       2.1     2.1     2.1      2.1
 3 X         0     1       1      0      1       1       1       1        1  
 4 X         1     2       2      1      2.1     2.1     2.1     2.1      2.1
 5 X         1     3       2      2      2.2     2.2     2.2     2.2      2.2
 6 X         0     4       3      0      3      NA       3       3        3  
 7 X         0     5       4      0      4      NA       4       4        4  
 8 X         0     6       5      0      5      NA       5       5        5  
 9 B         0     1       1      0      1       1       1       1        1  
10 R         0     3       3      0      3       2.2     2.2     2.2      2.2
11 R         2     4       4      1      4.1    NA       4       3        3.1
12 R         2     5       4      2      4.2    NA       4       3        3.2
13 X         3     7       6      1      6.1    NA       6       6        6.1
14 X         3     8       6      2      6.2    NA       6       6        6.2
15 X         3     9       6      3      6.3    NA       6       6        6.3

First section of code:

library(dplyr)
library(stringr)
library(tidyr)  # provides fill(), used below

myDF1 <-
  data.frame(
    Name = c("R","R","X","X","X","X","X","X","B","R","R","R","X","X","X"),
    Group = c(0,0,0,1,1,0,0,0,0,0,2,2,3,3,3)
  )

nCode <-  myDF1 %>%
  group_by(Name) %>%
  mutate(nmCnt = row_number()) %>%
  ungroup() %>%
  mutate(seqBase = ifelse(Group == 0 | Group != lag(Group), nmCnt,0)) %>%
  mutate(seqBase = na_if(seqBase, 0)) %>%
  group_by(Name) %>%
  fill(seqBase) %>%
  mutate(seqBase = match(seqBase, unique(seqBase))) %>%
  ungroup %>%
  mutate(grpRnk = ifelse(Group > 0, sapply(1:n(), function(x) sum(Name[1:x]==Name[x] & Group[1:x] == Group[x])),0))
  
loopCntr <- nrow(unique(myDF1[myDF1$Group!=0,]))
  
for(i in 1:1) {
  
  reSeq_prior <- str_c("reSeq_",i-1)
    
  concat_col <- str_c("concat_",i)

  nCode <- if(i==1){
    nCode %>% mutate(!! concat_col:= as.numeric(paste0(seqBase,".",grpRnk)))} else {
      nCode %>% mutate(!! concat_col:= as.numeric(paste0(!!rlang::sym(reSeq_prior),".",grpRnk)))
    }
  
    index <- filter(nCode, Group !=0) %>%
      select(all_of(concat_col)) %>%
      distinct() %>%
      mutate(truncInd = trunc(get(concat_col))) %>%
      group_by(truncInd) %>%
      mutate(cumGrp = cur_group_id()) %>%
      ungroup() %>%
      select(-truncInd,cumGrp,all_of(concat_col))
  
  # below inserts a 1 in 1st row of index if lowest element count group is >= 2  
    index <- if(ifelse(loopCntr > 0, min(index[[concat_col]]), Inf) >= 2){
      tmp <- data.frame(cumGrp = 1, concat = 1)
      names(tmp)[2] <- concat_col
      rbind(tmp,index)
    }else{index}
    
    nCode <- nCode %>%
      mutate(alloc = index[[concat_col]][index$cumGrp==1][nmCnt]) %>%
      mutate(merge = ifelse(is.na(alloc),seqBase,alloc)) %>%
      group_by(Name) %>%
      mutate(reSeq = match(trunc(merge), unique(trunc(merge)))) %>%
      mutate(reSeq = (reSeq + round(merge%%1 * 10,0)/10)) %>%
      ungroup()
    
    nCode <- nCode %>%
      rename_with(~ str_c(.x, "_", i), c("alloc", "merge", "reSeq"))
   
  }

CodePudding user response:

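Both branches are evaluated in ifelse(). It is vectorised: when the test contains both TRUE and FALSE values, the yes and no arguments are each computed in full, for every element, and only then does ifelse() pick from one or the other. The warning comes from NAs produced while computing the no branch, even for elements whose final value is taken from the yes branch.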

Here’s a simplified example:

a <- c(1.1, 2.1, 1)
b <- c(0, 2, 1)

ifelse(a != trunc(a), a, as.numeric(paste0(a, ".", b)))
#> Warning in ifelse(a != trunc(a), a, as.numeric(paste0(a, ".", b))): NAs
#> introduced by coercion
#> [1] 1.1 2.1 1.1

Which is essentially equivalent to:

y <- a
n <- as.numeric(paste0(a, ".", b))
#> Warning: NAs introduced by coercion

ifelse(a != trunc(a), y, n)
#> [1] 1.1 2.1 1.1
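
The split above also suggests another way around it: look at the strings that paste0() builds, then convert only the elements that actually need the no branch. This is a minimal base-R sketch using the same a and b; the names pasted, whole and out are made up for illustration.

```r
a <- c(1.1, 2.1, 1)
b <- c(0, 2, 1)

# The strings as.numeric() is asked to parse. Elements where `a` already
# has a decimal part end up with two dots and cannot be parsed.
pasted <- paste0(a, ".", b)
pasted

# TRUE where `a` is a whole number, i.e. where the "no" branch is wanted.
whole <- a == trunc(a)

# Start from `a` and overwrite only the whole-number elements, so
# as.numeric() never sees the malformed strings.
out <- a
out[whole] <- as.numeric(pasted[whole])
out
```

Because as.numeric() only ever receives "1.1" here, no coercion warning is raised.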

To avoid the warning, write code that won’t generate warnings in either branch:

ifelse(a != trunc(a), a, a + b / 10)
#> [1] 1.1 2.1 1.1
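
Applied to the mutate() call in the question, the same idea might look like the sketch below. It is not the full loop: nCode_demo is a hypothetical three-row stand-in for nCode, with reSeq_1 and grpRnk values copied from rows 1, 2 and 11 of the output in the question, and it uses the .data pronoun in place of !!rlang::sym() purely as a matter of taste.

```r
library(dplyr)

# Hypothetical stand-in for nCode (values taken from the question's output).
nCode_demo <- tibble(reSeq_1 = c(1, 2.1, 3), grpRnk = c(0, 0, 1))

i <- 2
reSeq_prior <- paste0("reSeq_", i - 1)
concat_col  <- paste0("concat_", i)

# Arithmetic in the "no" branch instead of paste0() + as.numeric(),
# so neither branch can produce NAs by coercion.
nCode_demo <- nCode_demo %>%
  mutate(!!concat_col := ifelse(
    .data[[reSeq_prior]] %% 1 > 0,
    .data[[reSeq_prior]],
    .data[[reSeq_prior]] + grpRnk / 10
  ))

nCode_demo
```

This assumes grpRnk stays in the range 0 to 9; with larger ranks, dividing by 10 would spill into the integer part, so the encoding would need rethinking.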