Home > Net >  Conditional statement based on unique IDs and matching dates in R
Conditional statement based on unique IDs and matching dates in R

Time:01-06

I'm hoping someone can help with my problem, I'm new to R and can't figure my problem out.

I have a dataframe with multiple rows per ID, with a lot of mising data. I want to ask R to make a new column applying a calculation, if for each unique ID the dates match.

An example data frame =

    example <- data.frame(id = c("A01","A01","A01", "A02","A02"),
                      al = c(14,NA,56,89,NA),
                      cr = c(NA,100,NA,NA,87),                   
                      date = c("2014-10-29","2014-10-29","2022-01-01", "1993-10-22", "1993-10-22"))
    example$date <- as.Date(example$date)

For each unique ID (A01 and A02), if "cr" and "al" were taken on the same date, create a new column called ACR and apply this: (example$al100)/((example$cr0.0113)*0.01).

I have tried group_by() and mutate(), but I can't figure out how to ask if two dates within the column of the ID match?

example2 <- example %>%
      dplyr::group_by(id) %>%
      dplyr::mutate(ACR = if_else(date==date), (example$al*100)/((example$cr*0.0113)*0.01), 0, NA)


Thank you so much in advance.

CodePudding user response:

#This function returns missing value if there is no value, otherwise it returns the non-missing value.
select_value <- function(x){
  if(all(is.na(x))){
    return(x)
  }else{
    return(max(x,na.rm=T))
  }
}

example%>%
  group_by(id,date)%>%
  mutate(ACR=(select_value(al)*100)/(select_value(cr)*0.01))%>%
  ungroup

  id       al    cr date          ACR
  <fct> <dbl> <dbl> <date>      <dbl>
1 A01      14    NA 2014-10-29  1400 
2 A01      NA   100 2014-10-29  1400 
3 A01      56    NA 2022-01-01    NA 
4 A02      89    NA 1993-10-22 10230.
5 A02      NA    87 1993-10-22 10230.

CodePudding user response:

An approach using fill and distinct, assuming that either one variable (al or cr) is defined if they share a date.

library(dplyr)
library(tidyr)

example %>% 
  group_by(id, date) %>% 
  fill(al:cr, .direction="updown") %>% 
  distinct() %>% 
  mutate(ACR = (al * 100) / ((cr * 0.0113) * 0.01)) %>% 
  ungroup()
# A tibble: 3 × 5
  id       al    cr date           ACR
  <chr> <dbl> <dbl> <date>       <dbl>
1 A01      14   100 2014-10-29 123894.
2 A01      56    NA 2022-01-01     NA 
3 A02      89    87 1993-10-22 905300.
  • Related