I'm hoping someone can help with my problem, I'm new to R and can't figure my problem out.
I have a dataframe with multiple rows per ID, with a lot of mising data. I want to ask R to make a new column applying a calculation, if for each unique ID the dates match.
An example data frame =
example <- data.frame(id = c("A01","A01","A01", "A02","A02"),
al = c(14,NA,56,89,NA),
cr = c(NA,100,NA,NA,87),
date = c("2014-10-29","2014-10-29","2022-01-01", "1993-10-22", "1993-10-22"))
example$date <- as.Date(example$date)
For each unique ID (A01 and A02), if "cr" and "al" were taken on the same date, create a new column called ACR and apply this: (example$al100)/((example$cr0.0113)*0.01).
I have tried group_by() and mutate(), but I can't figure out how to ask if two dates within the column of the ID match?
example2 <- example %>%
dplyr::group_by(id) %>%
dplyr::mutate(ACR = if_else(date==date), (example$al*100)/((example$cr*0.0113)*0.01), 0, NA)
Thank you so much in advance.
CodePudding user response:
#This function returns missing value if there is no value, otherwise it returns the non-missing value.
select_value <- function(x){
if(all(is.na(x))){
return(x)
}else{
return(max(x,na.rm=T))
}
}
example%>%
group_by(id,date)%>%
mutate(ACR=(select_value(al)*100)/(select_value(cr)*0.01))%>%
ungroup
id al cr date ACR
<fct> <dbl> <dbl> <date> <dbl>
1 A01 14 NA 2014-10-29 1400
2 A01 NA 100 2014-10-29 1400
3 A01 56 NA 2022-01-01 NA
4 A02 89 NA 1993-10-22 10230.
5 A02 NA 87 1993-10-22 10230.
CodePudding user response:
An approach using fill
and distinct
, assuming that either one variable (al
or cr
) is defined if they share a date.
library(dplyr)
library(tidyr)
example %>%
group_by(id, date) %>%
fill(al:cr, .direction="updown") %>%
distinct() %>%
mutate(ACR = (al * 100) / ((cr * 0.0113) * 0.01)) %>%
ungroup()
# A tibble: 3 × 5
id al cr date ACR
<chr> <dbl> <dbl> <date> <dbl>
1 A01 14 100 2014-10-29 123894.
2 A01 56 NA 2022-01-01 NA
3 A02 89 87 1993-10-22 905300.