I have a dataframe with a DATE/TIME column and a column with some numeric values. I'd like to change some numeric values to "N/A"
based of a range of DATE/TIME they are recorded at.
This is what my dataframe
looks like
df = structure(list(Date_Time_GMT_3 = structure(c(1592226000, 1592226900,
1592227800, 1592228700, 1592229600, 1592230500), class = c("POSIXct",
"POSIXt"), tzone = "EST"), diff_20676892_AIR_X3lh = c(NA, 0.385999999999999,
0.193, 0.290000000000001, 0.385, 0.576000000000001), diff_20819828_B1LH_DOUBLE_CHECK = c(NA,
0, 0, 0, 0.0949999999999989, 0)), row.names = c(NA, 6L), class = "data.frame")
I want to change all values for diff_20819828_B1LH_DOUBLE_CHECK
to N/A
if they are between 2020-06-15 08:30:00
and 2020-06-15 09:00:00
I tried this code
df[df$Date_Time_GMT_3 > "2020-06-15 08:30:00"| < "2020-06-15 09:00:00"] = "NA"
but to no surprise this doesn't work. How can I fix this?
CodePudding user response:
Your date column is in "EST", so you can do this:
df[df$Date_Time_GMT_3 > as.POSIXct("2020-06-15 08:30:00", tz="EST") &
df$Date_Time_GMT_3 < as.POSIXct("2020-06-15 09:00:00", tz="EST"),3] <- NA
Date_Time_GMT_3 diff_20676892_AIR_X3lh diff_20819828_B1LH_DOUBLE_CHECK
1 2020-06-15 08:00:00 NA NA
2 2020-06-15 08:15:00 0.386 0.000
3 2020-06-15 08:30:00 0.193 0.000
4 2020-06-15 08:45:00 0.290 NA
5 2020-06-15 09:00:00 0.385 0.095
6 2020-06-15 09:15:00 0.576 0.000
Note that there is only one row between those times, row 4, and above changes the value(s) in the 3rd column for such row(s) to NA
CodePudding user response:
Your base R code isn't working because
- You didn't specify which column's values should be changed
- You're using an
|
instead of an&
- After a logical operator you need to repeat which vector to assess
- You're not telling R that those strings are date-times.
Langtang's solution is very neat. Another option using dplyr
and lubridate
is:
library(dplyr)
library(lubridate)
df %>% mutate(diff_20819828_B1LH_DOUBLE_CHECK = na_if(
diff_20819828_B1LH_DOUBLE_CHECK,
Date_Time_GMT_3 %within% interval("2020-06-15 08:30:00", "2020-06-15 09:00:00")
))