I cannot share the dataset but I will give an example, I have a dataset with 50 columns 48 of them are dates.
Here is an example of the data:
Date1 | Date2 | Date3 | Date4 |
---|---|---|---|
2014-02-12 01:12:21 | 2014-03-12 01:12:21 | 2014-04-12 01:12:21 | 2014-05-12 01:12:21 |
2014-05-12 01:12:21 | 2014-04-12 01:12:21 | 2014-01-12 01:12:21 | 2014-04-11 01:12:21 |
Here is what I want:
Date1 | Date2 | Date3 | Date4 |
---|---|---|---|
2014-02-12 01:12:21 | 2014-03-12 01:12:21 | NA | NA |
NA | 2014-04-12 01:12:21 | 2014-01-12 01:12:21 | 2014-04-11 01:12:21 |
basically Date2
column will be the determining column. If the other columns in that row are later than Date2
column then, time of those columns will be NA
.
Here is what I tried on my original dataset and what error I got(in this one I tried to replace dates with 1 and 0 based on the condition I specified):
ex <- Paidneed[ifelse(Paidneed[1:nrow(Paidneed),1] < Paidneed[1:nrow(Paidneed),3:50], 1,0),]
Error in ifelse(Paidneed[1:nrow(Paidneed), 5] < Paidneed[1:nrow(Paidneed), :
'list' object cannot be coerced to type 'double'
In addition: Warning messages:
1: In ifelse(Paidneed[1:nrow(Paidneed), 5] < Paidneed[1:nrow(Paidneed), :
Incompatible methods ("<.POSIXt", "Ops.data.frame") for "<"
2: In Paidneed[1:nrow(Paidneed), 5] < Paidneed[1:nrow(Paidneed), 3:50] :
longer object length is not a multiple of shorter object length
CodePudding user response:
A base
solution:
col <- grepl('Date', names(df))
df[col] <- lapply(df[col], \(x) replace(x, x > df$Date2, NA))
df
With dplyr
:
library(dplyr)
df %>%
mutate(across(contains('Date'), ~ replace(.x, .x > Date2, NA)))
Output
# Date1 Date2 Date3 Date4
# 1 2014-02-12 01:12:21 2014-03-12 01:12:21 <NA> <NA>
# 2 <NA> 2014-04-12 01:12:21 2014-01-12 01:12:21 2014-04-11 01:12:21
Data
df <- read.csv(text = "Date1,Date2,Date3,Date4
2014-02-12 01:12:21,2014-03-12 01:12:21,2014-04-12 01:12:21,2014-05-12 01:12:21
2014-05-12 01:12:21,2014-04-12 01:12:21,2014-01-12 01:12:21,2014-04-11 01:12:21")
df[] <- lapply(df, as.POSIXct)