Create new column based on cut off date (date from another column)

Time:10-22

I have a sample df below (with the date column converted using as.Date):

| date       |
--------------
| 2020-03-03 |
| 2020-06-30 |
| 2020-01-23 |
| 2020-02-10 |
| 2020-11-29 |

I am trying to add a column based on a cut-off date of 2020-05-01 and expect to get this table:


| date       | cutoff |
-----------------------
| 2020-03-03 | prior  |
| 2020-06-30 | later  |
| 2020-01-23 | prior  |
| 2020-02-10 | prior  |
| 2020-11-29 | later  |

I used dplyr and called mutate to create the column, initially with case_when:

df %>%
  mutate(cutoff = case_when(
    date < 2020-05-01 ~ "prior",
    "later"
  ))

The code above created a cutoff column containing only "later" values.

I also tried the ifelse:

df <- with(df, ifelse(date < 2020-05-01, "prior", "later"))

The code above replaced the values in the date column with NA values.

I gave a different code a try:

df %>%
  mutate(cutoff = case_when(date < 2020-05-01 ~ "prior", 
                          TRUE ~ "later"))

but the result was the same as the first code I tried.

I thought of converting the date into POSIXct format, but each attempt above still produced the same output.

CodePudding user response:

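The unquoted 2020-05-01 is evaluated as arithmetic (2020 - 5 - 1 = 2014), so date is compared with a number instead of a date. Convert the column to the Date class with ymd, quote the cut-off and parse it the same way, then use ifelse: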

library(lubridate)
library(dplyr)
df %>%
  mutate(date = ymd(date),
         cutoff = ifelse(date < ymd("2020-05-01"), "prior", "later"))

output:

        date cutoff
1 2020-03-03  prior
2 2020-06-30  later
3 2020-01-23  prior
4 2020-02-10  prior
5 2020-11-29  later
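
To see why the attempts in the question returned only "later", it may help to evaluate the unquoted cut-off on its own. The sketch below uses base R only; the names cutoff_num, cutoff_date and dates are illustrative and do not come from the question.

```r
# Unquoted, 2020-05-01 is ordinary subtraction: 2020 - 5 - 1.
cutoff_num <- 2020-05-01
print(cutoff_num)

# The sample dates from the question, stored as Date values.
dates <- as.Date(c("2020-03-03", "2020-06-30", "2020-01-23",
                   "2020-02-10", "2020-11-29"))

# A Date is stored as the number of days since 1970-01-01, so every
# 2020 date is far larger than cutoff_num and the test is never TRUE.
print(dates < cutoff_num)

# Quoting the cut-off and parsing it gives a real date comparison.
cutoff_date <- as.Date("2020-05-01")
cutoff <- ifelse(dates < cutoff_date, "prior", "later")
print(cutoff)
```

Storing the cut-off in a variable such as cutoff_date also keeps the comparison readable if the cut-off changes later.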

data:

df <- structure(list(date = c("2020-03-03", "2020-06-30", "2020-01-23", 
"2020-02-10", "2020-11-29")), class = "data.frame", row.names = c(NA, 
-5L))
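
The case_when attempt from the question should also work once the cut-off is a real date. This is a minimal sketch that assumes dplyr is installed and swaps ymd for base as.Date; cutoff_date and result are illustrative names.

```r
library(dplyr)

# Same sample data as above, with date stored as character.
df <- data.frame(date = c("2020-03-03", "2020-06-30", "2020-01-23",
                          "2020-02-10", "2020-11-29"),
                 stringsAsFactors = FALSE)

# Parse the cut-off once so the comparison is Date against Date.
cutoff_date <- as.Date("2020-05-01")

result <- df %>%
  mutate(date = as.Date(date),                          # character -> Date
         cutoff = case_when(date < cutoff_date ~ "prior",
                            TRUE ~ "later"))            # fallback for all other rows

print(result)
```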