dates: Not yet implemented NAbounds=TRUE for this non-numeric and non-character type

I have this dataframe:

df1 <- structure(list(ID = c(1, 2, 2, 2, 3, 4, 5, 6, 6, 7, 8, 8, 9, 
10), dateA = structure(c(14974, 18628, 18628, 18628, 14882, 16800, 
14882, 17835, 17835, 16832, 16556, 16556, 15949, 16801), class = "Date"), 
dateB = structure(c(14610, 15340, 15706, 17501, 14730, NA, 
14700, 16191, 17106, 16801, 15810, 16436, 14655, 15431), class = "Date"), 
dateC = structure(c(18628, 15705, 17500, 18628, 18628, NA, 
18628, 17105, 18628, 18628, 16435, 16556, 15706, 18628), class = "Date")), row.names = c(NA, 
-14L), class = c("data.table", "data.frame"))

    ID      dateA      dateB      dateC
 1:  1 2010-12-31 2010-01-01 2021-01-01
 2:  2 2021-01-01 2012-01-01 2012-12-31
 3:  2 2021-01-01 2013-01-01 2017-11-30
 4:  2 2021-01-01 2017-12-01 2021-01-01
 5:  3 2010-09-30 2010-05-01 2021-01-01
 6:  4 2015-12-31       <NA>       <NA>
 7:  5 2010-09-30 2010-04-01 2021-01-01
 8:  6 2018-10-31 2014-05-01 2016-10-31
 9:  6 2018-10-31 2016-11-01 2021-01-01
10:  7 2016-02-01 2016-01-01 2021-01-01
11:  8 2015-05-01 2013-04-15 2014-12-31
12:  8 2015-05-01 2015-01-01 2015-05-01
13:  9 2013-09-01 2010-02-15 2013-01-01
14: 10 2016-01-01 2012-04-01 2021-01-01

I would like to check whether dateA falls within the interval from dateB to dateC. My code:

library(dplyr)
df1 %>% 
  mutate(match= ifelse(between(dateA, dateB, dateC), 1, 0))

gives:

Error: Problem with `mutate()` column `match`.
i `match = ifelse(between(dateA, dateB, dateC), 1, 0)`.
x Not yet implemented NAbounds=TRUE for this non-numeric and non-character type

If I remove the row that contains NA the code works:

df1 %>% 
  slice(-6) %>% 
  mutate(match= ifelse(between(dateA, dateB, dateC), 1, 0))

Can I keep the row with NA and still run my code?

Answer:

It is unclear which between the OP was using: the input object is a data.table, but the code is dplyr. Both packages export a between function, so if both are loaded, the one attached last masks the other. dplyr::between is not fully vectorized; ?dplyr::between documents its bounds as scalars:

left, right Boundary values (must be scalars).

So with dplyr::between the check has to be done row by row:

df1 %>%
    rowwise() %>% 
    # as.integer() turns TRUE/FALSE into 1/0
    mutate(match = as.integer(dplyr::between(dateA, dateB, dateC))) %>%
    ungroup()

-output

# A tibble: 14 × 5
      ID dateA      dateB      dateC      match
   <dbl> <date>     <date>     <date>     <int>
 1     1 2010-12-31 2010-01-01 2021-01-01     1
 2     2 2021-01-01 2012-01-01 2012-12-31     0
 3     2 2021-01-01 2013-01-01 2017-11-30     0
 4     2 2021-01-01 2017-12-01 2021-01-01     1
 5     3 2010-09-30 2010-05-01 2021-01-01     1
 6     4 2015-12-31 NA         NA            NA
 7     5 2010-09-30 2010-04-01 2021-01-01     1
 8     6 2018-10-31 2014-05-01 2016-10-31     0
 9     6 2018-10-31 2016-11-01 2021-01-01     1
10     7 2016-02-01 2016-01-01 2021-01-01     1
11     8 2015-05-01 2013-04-15 2014-12-31     0
12     8 2015-05-01 2015-01-01 2015-05-01     1
13     9 2013-09-01 2010-02-15 2013-01-01     0
14    10 2016-01-01 2012-04-01 2021-01-01     1
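
Because both packages export between(), it can help to check which one a bare call resolves to in the current session. A minimal sketch, assuming dplyr and data.table are both installed; active_pkg is just an illustrative variable name:

```r
# Assumption: both packages are installed. The attach order decides which
# between() an unqualified call will find.
suppressPackageStartupMessages({
  library(data.table)
  library(dplyr)  # attached last, so its between() masks data.table's
})

# Name of the namespace that owns the between() found on the search path
active_pkg <- environmentName(environment(between))
print(active_pkg)
```

Qualifying the call as dplyr::between() or data.table::between() sidesteps the attach order entirely.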

data.table::between, in contrast, accepts vector bounds (and, judging by the error shown in the OP's post, the between actually called was the one from data.table). From ?data.table::between:

lower - Lower range bound. Either length 1 or same length as x.

upper - Upper range bound. Either length 1 or same length as x.

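The class of x can still be an issue, even though the documentation lists Date as supported:

x - Any orderable vector, i.e., those with relevant methods for <=, such as numeric, character, Date, etc. in case of between and a numeric vector in case of inrange.

Converting the Date columns to numeric makes the call work. Row 6 then returns 1, because with the default NAbounds = TRUE a missing bound is interpreted as unbounded: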

df1 %>%
   mutate(match = as.integer(data.table::between(as.numeric(dateA), 
       as.numeric(dateB), as.numeric(dateC))))

-output

    ID      dateA      dateB      dateC match
 1:  1 2010-12-31 2010-01-01 2021-01-01     1
 2:  2 2021-01-01 2012-01-01 2012-12-31     0
 3:  2 2021-01-01 2013-01-01 2017-11-30     0
 4:  2 2021-01-01 2017-12-01 2021-01-01     1
 5:  3 2010-09-30 2010-05-01 2021-01-01     1
 6:  4 2015-12-31       <NA>       <NA>     1
 7:  5 2010-09-30 2010-04-01 2021-01-01     1
 8:  6 2018-10-31 2014-05-01 2016-10-31     0
 9:  6 2018-10-31 2016-11-01 2021-01-01     1
10:  7 2016-02-01 2016-01-01 2021-01-01     1
11:  8 2015-05-01 2013-04-15 2014-12-31     0
12:  8 2015-05-01 2015-01-01 2015-05-01     1
13:  9 2013-09-01 2010-02-15 2013-01-01     0
14: 10 2016-01-01 2012-04-01 2021-01-01     1
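
The 1 in row 6 follows from treating a missing bound as "no bound on that side". A rough base R sketch of that rule, assuming inclusive bounds; between_na_open is a hypothetical helper written for illustration and is not data.table's implementation:

```r
# Hypothetical helper (not part of data.table): inclusive interval check
# in which an NA bound is treated as unbounded on that side.
between_na_open <- function(x, lower, upper) {
  ok_lower <- is.na(lower) | x >= lower
  ok_upper <- is.na(upper) | x <= upper
  out <- ok_lower & ok_upper
  out[is.na(x)] <- NA  # a missing x still gives NA
  out
}

# Stand-ins for rows 1, 2 and 6 of df1
dateA <- as.Date(c("2010-12-31", "2021-01-01", "2015-12-31"))
dateB <- as.Date(c("2010-01-01", "2012-01-01", NA))
dateC <- as.Date(c("2021-01-01", "2012-12-31", NA))

res <- as.integer(between_na_open(dateA, dateB, dateC))
print(res)
```

The third element comes out as 1, matching row 6 of the output above.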

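Digging deeper, the root cause is the NAbounds argument, which is TRUE by default. The OP's data has one row (row 6) where dateB and dateC are NA, and calling data.table::between directly on the Date columns reproduces the error: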

df1 %>% 
    mutate(match = data.table::between(dateA, dateB, dateC))

Error: Problem with mutate() column match.
ℹ match = data.table::between(dateA, dateB, dateC).
✖ Not yet implemented NAbounds=TRUE for this non-numeric and non-character type
Run rlang::last_error() to see where the error occurred.

Setting NAbounds = FALSE avoids the error, and the row with missing bounds gets NA:

df1 %>% 
   mutate(match = as.integer(data.table::between(dateA, dateB, dateC, NAbounds = FALSE)))

-output

    ID      dateA      dateB      dateC match
 1:  1 2010-12-31 2010-01-01 2021-01-01     1
 2:  2 2021-01-01 2012-01-01 2012-12-31     0
 3:  2 2021-01-01 2013-01-01 2017-11-30     0
 4:  2 2021-01-01 2017-12-01 2021-01-01     1
 5:  3 2010-09-30 2010-05-01 2021-01-01     1
 6:  4 2015-12-31       <NA>       <NA>    NA
 7:  5 2010-09-30 2010-04-01 2021-01-01     1
 8:  6 2018-10-31 2014-05-01 2016-10-31     0
 9:  6 2018-10-31 2016-11-01 2021-01-01     1
10:  7 2016-02-01 2016-01-01 2021-01-01     1
11:  8 2015-05-01 2013-04-15 2014-12-31     0
12:  8 2015-05-01 2015-01-01 2015-05-01     1
13:  9 2013-09-01 2010-02-15 2013-01-01     0
14: 10 2016-01-01 2012-04-01 2021-01-01     1

Alternatively, pass a Date-class NA as NAbounds, which gives the same result:

df1 %>% 
    mutate(match = as.integer(data.table::between(dateA, dateB, dateC, 
         NAbounds = as.Date(NA))))

-output

    ID      dateA      dateB      dateC match
 1:  1 2010-12-31 2010-01-01 2021-01-01     1
 2:  2 2021-01-01 2012-01-01 2012-12-31     0
 3:  2 2021-01-01 2013-01-01 2017-11-30     0
 4:  2 2021-01-01 2017-12-01 2021-01-01     1
 5:  3 2010-09-30 2010-05-01 2021-01-01     1
 6:  4 2015-12-31       <NA>       <NA>    NA
 7:  5 2010-09-30 2010-04-01 2021-01-01     1
 8:  6 2018-10-31 2014-05-01 2016-10-31     0
 9:  6 2018-10-31 2016-11-01 2021-01-01     1
10:  7 2016-02-01 2016-01-01 2021-01-01     1
11:  8 2015-05-01 2013-04-15 2014-12-31     0
12:  8 2015-05-01 2015-01-01 2015-05-01     1
13:  9 2013-09-01 2010-02-15 2013-01-01     0
14: 10 2016-01-01 2012-04-01 2021-01-01     1
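
For comparison, the same inclusive check can be sketched with base R comparison operators, which are vectorized over Date vectors and propagate NA. This is an alternative to the between() calls above, not something taken from either package; the three-element vectors and the name in_range are illustrative stand-ins for rows 1, 2 and 6 of df1:

```r
# Stand-ins for rows 1, 2 and 6 of df1
dateA <- as.Date(c("2010-12-31", "2021-01-01", "2015-12-31"))
dateB <- as.Date(c("2010-01-01", "2012-01-01", NA))
dateC <- as.Date(c("2021-01-01", "2012-12-31", NA))

# Inclusive interval check; comparisons against NA give NA,
# and NA & NA stays NA, so the row with missing bounds is NA.
in_range <- as.integer(dateA >= dateB & dateA <= dateC)
print(in_range)
```

If only one bound is missing, & can still return FALSE (0 after conversion) when the other bound already rules the value out.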