I have the data frame below and want to calculate the number of days since the student's last (most recent) absence from the present date and add that to the original data frame.
Student ID Absent Date Subject
4567 08/30/2018 M
4567 09/22/2019 M
8345 09/01/2019 S
8345 03/30/2019 PE
8345 07/18/2017 M
5601 01/08/2019 SS
This is the desired output:
Student ID Absent Date Subject # of Days Since Last Absence
4567 08/30/2018 M 816
4567 09/22/2019 M 816
8345 09/01/2019 S 837
8345 03/30/2019 PE 837
8345 07/18/2017 M 837
5601 01/08/2019 SS 1073
Thank you for any help.
CodePudding user response:
Package dplyr and a max
on the date will give you the answer. current date in R is Sys.Date()
library(dplyr)
df1 %>%
group_by(Student_ID) %>%
mutate(days_since_last_absence = Sys.Date() - max(as.Date(Absent_Date, format = "%m/%d/%Y")))
# A tibble: 6 x 4
# Groups: Student_ID [3]
Student_ID Absent_Date Subject days_since_last_absence
<int> <chr> <chr> <drtn>
1 4567 08/30/2018 M 816 days
2 4567 09/22/2019 M 816 days
3 8345 09/01/2019 S 837 days
4 8345 03/30/2019 PE 837 days
5 8345 07/18/2017 M 837 days
6 5601 01/08/2019 SS 1073 days
data:
df1 <- structure(list(Student_ID = c(4567L, 4567L, 8345L, 8345L, 8345L,
5601L), Absent_Date = c("08/30/2018", "09/22/2019", "09/01/2019",
"03/30/2019", "07/18/2017", "01/08/2019"), Subject = c("M", "M",
"S", "PE", "M", "SS")), class = "data.frame", row.names = c(NA,
-6L))
CodePudding user response:
Assuming your dataframe is called df
, in base R, try
df$difference<-as.numeric(difftime(Sys.Date(),as.POSIXct(paste(df[,2]),format="%m/%d/%Y"),tz="UTC"))
tz
has to be adjusted for your location.