Fill a column with one of four date columns based on another R

Time:04-30

I have a data frame with six columns like so:

A  B  Date1 Date2 Date3 Date4
1       x     NA    NA    NA
2      NA     y     NA    NA
3      NA    NA     z     NA  
4      NA    NA    NA     f

I want to use the dplyr package and the case_when() function to state something like this:

df <- df %>%
    mutate(B = case_when(
     A == 1 ~ B == Date1,
     A == 2 ~ B == Date2,
     A == 3 ~ B == Date3,
     A == 4 ~ B == Date4))

Essentially, based on the value of A, I would like to fill B with one of the four date columns. A is of class character; B and the Date columns are all of class Date.

The problem is that when I apply this to the data frame it doesn't work: it returns NAs and changes the class of B to logical. I am using R version 4.1.2. Any help is appreciated.

CodePudding user response:

You can use coalesce() to find the first non-missing element in each row.

library(dplyr)

df %>%
  mutate(B = coalesce(!!!df[-1]))

#   A Date1 Date2 Date3 Date4 B
# 1 1     x  <NA>  <NA>  <NA> x
# 2 2  <NA>     y  <NA>  <NA> y
# 3 3  <NA>  <NA>     z  <NA> z
# 4 4  <NA>  <NA>  <NA>     f f

The above code (df[-1] drops the first column, A, and !!! splices the remaining columns in as arguments) is just a shortcut for:

df %>%
  mutate(B = coalesce(Date1, Date2, Date3, Date4))
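For anyone who wants to try this on real Date columns, here is a minimal sketch. The sample dates and the name `result` are made up for illustration; only the column layout comes from the question.

```r
library(dplyr)

# Hypothetical sample data: same layout as the question, invented dates.
df <- tibble(
  A     = c("1", "2", "3", "4"),
  B     = as.Date(NA),
  Date1 = as.Date(c("2022-01-01", NA, NA, NA)),
  Date2 = as.Date(c(NA, "2022-02-01", NA, NA)),
  Date3 = as.Date(c(NA, NA, "2022-03-01", NA)),
  Date4 = as.Date(c(NA, NA, NA, "2022-04-01"))
)

# Take the first non-missing date in each row.
result <- df %>%
  mutate(B = coalesce(Date1, Date2, Date3, Date4))

class(result$B)  # "Date"
```

Note that coalesce() ignores A entirely, so it only matches the intent when each row has exactly one non-missing date.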

If B needs to be filled based on the value of A, then here is an idea with c_across(). It indexes the Date columns by position, so A has to be numeric:

df %>%
  rowwise() %>%
  mutate(B = c_across(starts_with("Date"))[A]) %>%
  ungroup()

# # A tibble: 4 × 6
#       A Date1 Date2 Date3 Date4 B    
#   <int> <chr> <chr> <chr> <chr> <chr>
# 1     1 x     NA    NA    NA    x    
# 2     2 NA    y     NA    NA    y    
# 3     3 NA    NA    z     NA    z    
# 4     4 NA    NA    NA    f     f 
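The output above shows A as an integer, while the question says A is of class character. Indexing an unnamed vector with a character value returns NA, so in that case A likely needs converting first. A sketch under that assumption, with invented sample dates:

```r
library(dplyr)

# Hypothetical sample data: A is character, as described in the question.
df <- tibble(
  A     = c("1", "2", "3", "4"),
  Date1 = as.Date(c("2022-01-01", NA, NA, NA)),
  Date2 = as.Date(c(NA, "2022-02-01", NA, NA)),
  Date3 = as.Date(c(NA, NA, "2022-03-01", NA)),
  Date4 = as.Date(c(NA, NA, NA, "2022-04-01"))
)

result <- df %>%
  rowwise() %>%
  # Convert A to an integer so it selects the A-th Date column by position.
  mutate(B = c_across(starts_with("Date"))[as.integer(A)]) %>%
  ungroup()
```

This relies on the Date columns appearing in the data frame in the order Date1 to Date4.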

CodePudding user response:

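The other answers are superior, but if you must use your current code for the actual application, here is the corrected version. The right-hand side of each formula should be the value to return, not a comparison: B == Date1 evaluates to a logical, which is why B turned into logical NAs.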

df %>%
  mutate(B = case_when(
    A == 1 ~ Date1,
    A == 2 ~ Date2,
    A == 3 ~ Date3,
    A == 4 ~ Date4))

Output:

# A B Date1 Date2 Date3 Date4
# 1 x     x  <NA>  <NA>  <NA>
# 2 y  <NA>     y  <NA>  <NA>
# 3 z  <NA>  <NA>     z  <NA>
# 4 f  <NA>  <NA>  <NA>     f
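To see the difference on actual Date columns, here is a small sketch. The sample dates and the names `broken_rhs` and `result` are assumptions for illustration.

```r
library(dplyr)

# Hypothetical sample data: same layout as the question, invented dates.
df <- tibble(
  A     = c("1", "2", "3", "4"),
  B     = as.Date(NA),
  Date1 = as.Date(c("2022-01-01", NA, NA, NA)),
  Date2 = as.Date(c(NA, "2022-02-01", NA, NA)),
  Date3 = as.Date(c(NA, NA, "2022-03-01", NA)),
  Date4 = as.Date(c(NA, NA, NA, "2022-04-01"))
)

# What the original right-hand side computes: a comparison, not a date.
broken_rhs <- df$B == df$Date1
class(broken_rhs)  # "logical"

# Corrected: each right-hand side is the column whose value should be returned.
result <- df %>%
  mutate(B = case_when(
    A == "1" ~ Date1,
    A == "2" ~ Date2,
    A == "3" ~ Date3,
    A == "4" ~ Date4
  ))

class(result$B)  # "Date"
```

Comparing A against the strings "1" to "4" makes the character type explicit; A == 1 also works because R coerces the number to a string before comparing.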

CodePudding user response:

It seems you want the diagonal values of the Date columns (row 1 from Date1, row 2 from Date2, and so on). If so, you can use diag:

df$B <- diag(as.matrix(df[grepl("Date", colnames(df))]))
#[1] "x" "y" "z" "f"
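diag only works when the value of A matches the row number. If that does not hold, one possible variation (not part of the answer above) is base R matrix indexing with a two-column matrix of row and column positions. A sketch with invented data in which A is deliberately not in row order:

```r
# Hypothetical sample data: A is character and not in row order.
df <- data.frame(
  A     = c("3", "1", "4", "2"),
  Date1 = as.Date(c(NA, "2022-02-01", NA, NA)),
  Date2 = as.Date(c(NA, NA, NA, "2022-04-01")),
  Date3 = as.Date(c("2022-01-01", NA, NA, NA)),
  Date4 = as.Date(c(NA, NA, "2022-03-01", NA)),
  stringsAsFactors = FALSE
)

# as.matrix() turns the Date columns into a character matrix.
m <- as.matrix(df[grepl("^Date", colnames(df))])

# One (row, column) pair per row: row i picks the column given by A[i].
idx <- cbind(seq_len(nrow(df)), as.integer(df$A))

# Convert the picked strings back to Date.
df$B <- as.Date(m[idx])
```

Because the matrix holds strings, the as.Date() call at the end is what restores the Date class.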

Other options (if you want to coalesce, i.e. take the single non-missing value in each row):

  • With max:
df$B <- apply(df[2:5], 1, \(x) max(x, na.rm = TRUE))
  • With c_across:
df %>% 
  rowwise() %>% 
  mutate(B = max(c_across(Date1:Date4), na.rm = TRUE)) %>% 
  ungroup()

Output:

  A Date1 Date2 Date3 Date4 B
1 1     x  <NA>  <NA>  <NA> x
2 2  <NA>     y  <NA>  <NA> y
3 3  <NA>  <NA>     z  <NA> z
4 4  <NA>  <NA>  <NA>     f f