I am trying to determine how many days each of my columns were recording values for. They all start/stop recording at different times, and it's important that the total days calculated doesn't include times when the column has NA's. Here is an example of my dataframe
df = structure(list(Date_Time_GMT_3 = structure(c(1594233000, 1594533900, 1597235700,
1595234800, 1594336600, 1595237500),
class = c("POSIXct", "POSIXt"), tzone = "EST"),
`20874285_33MR` = c(14.996, 15.091, 15.187, 15.282, 15.378, 15.378),
`20874290_103MR` = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_),
`20874287_102MR` = c(NA_real_, 15.091, 15.187, 15.282, NA_real_, NA_real_),
`20874299_54MR` = c(NA_real_, 15.378, 15.378, NA_real_, NA_real_, NA_real_),
`20874316_AIR_90MR` = c(NA_real_, NA_real_, NA_real_,15.091, 15.187, 15.282)),
row.names = c(NA, 6L), class = "data.frame")
The time doesn't matter. As long as there is a record on that day I can count it as the column having records for 1 Day.
The final result should have a total number of days for each column
CodePudding user response:
Is this what you want to do?
library(dplyr)
df %>%
group_by(date = as.Date(Date_Time_GMT_3)) %>%
summarise(across(everything(), ~any(!is.na(.)))) %>%
summarise(across(-date, sum))
#> # A tibble: 1 x 6
#> Date_Time_GMT_3 `20874285_33MR` `20874290_103MR` `20874287_102MR` `20874299_54MR` `20874316_AIR_90MR`
#> <int> <int> <int> <int> <int> <int>
#> 1 5 5 0 3 2 2