I have a question about the use of a logical expression in combination with a variable.
Imagine that I have a data frame with multiple rows that each contain a date saved as 2021-09-25T06:04:35:689Z.
I also have a variable that contains yesterday's date (for example '2021-09-24'), created with yesterday <- Sys.Date()-1.
How do I filter the rows in my data frame based on the date of yesterday which is stored in the variable 'yesterday'?
To solve my problem, I have looked at multiple posts, for example:
I am well aware that this question might be a duplicate. However, the existing questions do not give me the help that I need. I hope that one of you can help me.
CodePudding user response:
As an initial matter, it looks like you have a vector instead of a data frame (only one column). If you really do have a data frame and only ran str() on one column, the very similar technique at the end will work for you.
The first thing to know is that your dates are stored as character strings, while your yesterday object is in the Date format. Comparing objects of different types relies on implicit coercion, which can give misleading results, so it is safer to convert at least one of the two objects explicitly.
I suggest converting both to the POSIXct format so that you do not lose any information in your dates column but can still compare it to yesterday. Make sure to set the timezone to the same as your system time (mine is "America/New_York").
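Dates <- c("2021-09-09T06:04:35.689Z", "2021-09-09T06:04:35.690Z", "2021-09-09T06:04:35.260Z", "2021-09-24T06:04:35.260Z")
# Replace the "T" separator with a space and drop the trailing "Z" so the strings match the format below
Dates <- gsub("T", " ", Dates, fixed = TRUE)
Dates <- gsub("Z", "", Dates, fixed = TRUE)
# Name the format argument explicitly; %OS also reads the fractional seconds
Dates <- as.POSIXct(Dates, format = "%Y-%m-%d %H:%M:%OS", tz = "America/New_York")
yesterday <- Sys.time() - 86400 # 86400 is the number of seconds in one day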
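As an aside on why the time zone matters here: the trailing "Z" in the original strings marks them as UTC, and the same instant can fall on different calendar dates depending on the zone it is read in. The following is only a small sketch with a made-up instant (not one of the dates from the question) to illustrate the effect.

```r
# A made-up instant for illustration: 02:00 UTC on 25 September 2021
instant <- as.POSIXct("2021-09-25 02:00:00", tz = "UTC")

# The calendar date depends on the time zone used to read the instant
date_utc <- as.Date(instant, tz = "UTC")               # still 25 September in UTC
date_ny  <- as.Date(instant, tz = "America/New_York")  # 22:00 on 24 September in New York

print(date_utc)
print(date_ny)
```

If the "yesterday" you care about is defined in UTC rather than in your local time, parsing with tz = "UTC" would be the more faithful choice.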
Now you can tell R to ignore the time and only compare the dates.
trunc(Dates, units = "days") == trunc(yesterday, units = "days")
The other part of your question was about filtering. The easiest way to filter is subsetting. You first ask R for the indices of the matching values in your vector (or column) by wrapping your comparison in the which() function.
Indices <- which(trunc(Dates, units = "days") == trunc(yesterday, units = "days"))
None of the dates in your str() results match yesterday, so I added one at the end that matches. Calling which() returns a 4 to tell you that the fourth item in your vector matches yesterday's date. If more dates matched, it would have more values. I saved the results in "Indices".
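One reason to go through which() rather than subsetting with the logical comparison directly is how missing values behave: a date that failed to parse becomes NA, and an NA in a logical index produces an NA element in the result, whereas which() simply skips it. A minimal sketch with toy vectors (the names values and matches are made up for illustration):

```r
# Toy data: the second comparison result is NA, as it would be for an unparseable date
values  <- c("a", "b", "c")
matches <- c(FALSE, NA, TRUE)

with_logical <- values[matches]         # keeps an NA placeholder: NA "c"
with_which   <- values[which(matches)]  # drops the NA: "c"

print(with_logical)
print(with_which)
```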
We can then use the Indices from which() to subset your vector or data frame.
Filtered_Dates <- Dates[Indices]
Filtered_Dataframe <- df[Indices,] #note the comma, which indicates that we are filtering rows instead of columns.
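For the data frame case from the question, the same idea can be sketched end to end. This is only a sketch under assumptions: the data frame df, its column names id and timestamp, and the fixed reference date are hypothetical stand-ins (in practice yesterday would be Sys.Date() - 1). Unlike the code above, it parses the strings as UTC because of the trailing "Z" and compares calendar dates with as.Date() instead of trunc().

```r
# Hypothetical data frame; the column names are assumptions for illustration
df <- data.frame(
  id = 1:4,
  timestamp = c("2021-09-09T06:04:35.689Z", "2021-09-09T06:04:35.690Z",
                "2021-09-09T06:04:35.260Z", "2021-09-24T06:04:35.260Z"),
  stringsAsFactors = FALSE
)

# Fixed date so the example is reproducible; normally: yesterday <- Sys.Date() - 1
yesterday <- as.Date("2021-09-24")

# "T" and "Z" are matched literally by the format; %OS reads fractional seconds
parsed <- as.POSIXct(df$timestamp, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")

# Compare calendar dates (in UTC) and keep the matching rows
Indices <- which(as.Date(parsed, tz = "UTC") == yesterday)
df_yesterday <- df[Indices, ]
print(df_yesterday)
```

With this toy data only the fourth row falls on the reference date, so df_yesterday contains that single row.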