Home > Back-end >  Comparing multiple columns in one row and change their value
Comparing multiple columns in one row and change their value

Time:08-15

I cannot share the dataset but I will give an example, I have a dataset with 50 columns 48 of them are dates.

Here is an example of the data:

Date1 Date2 Date3 Date4
2014-02-12 01:12:21 2014-03-12 01:12:21 2014-04-12 01:12:21 2014-05-12 01:12:21
2014-05-12 01:12:21 2014-04-12 01:12:21 2014-01-12 01:12:21 2014-04-11 01:12:21

Here is what I want:

Date1 Date2 Date3 Date4
2014-02-12 01:12:21 2014-03-12 01:12:21 NA NA
NA 2014-04-12 01:12:21 2014-01-12 01:12:21 2014-04-11 01:12:21

basically Date2 column will be the determining column. If the other columns in that row are later than Date2 column then, time of those columns will be NA.

Here is what I tried on my original dataset and what error I got(in this one I tried to replace dates with 1 and 0 based on the condition I specified):

ex <- Paidneed[ifelse(Paidneed[1:nrow(Paidneed),1] < Paidneed[1:nrow(Paidneed),3:50], 1,0),]
Error in ifelse(Paidneed[1:nrow(Paidneed), 5] < Paidneed[1:nrow(Paidneed),  : 
  'list' object cannot be coerced to type 'double'
In addition: Warning messages:
1: In ifelse(Paidneed[1:nrow(Paidneed), 5] < Paidneed[1:nrow(Paidneed),  :
  Incompatible methods ("<.POSIXt", "Ops.data.frame") for "<"
2: In Paidneed[1:nrow(Paidneed), 5] < Paidneed[1:nrow(Paidneed), 3:50] :
  longer object length is not a multiple of shorter object length

CodePudding user response:

A base solution:

col <- grepl('Date', names(df))
df[col] <- lapply(df[col], \(x) replace(x, x > df$Date2, NA))
df

With dplyr:

library(dplyr)

df %>%
  mutate(across(contains('Date'), ~ replace(.x, .x > Date2, NA)))
Output
#                 Date1               Date2               Date3               Date4
# 1 2014-02-12 01:12:21 2014-03-12 01:12:21                <NA>                <NA>
# 2                <NA> 2014-04-12 01:12:21 2014-01-12 01:12:21 2014-04-11 01:12:21

Data
df <- read.csv(text = "Date1,Date2,Date3,Date4
2014-02-12 01:12:21,2014-03-12 01:12:21,2014-04-12 01:12:21,2014-05-12 01:12:21
2014-05-12 01:12:21,2014-04-12 01:12:21,2014-01-12 01:12:21,2014-04-11 01:12:21")
df[] <- lapply(df, as.POSIXct)
  • Related