Home > Blockchain >  How to remove rows based on condition
How to remove rows based on condition

Time:02-23

In a sample dataset I have companies with their respective exit_date in that I want to show that company name if one company left in 2018 until the fiscal year ends i need to show the company name after the fiscal year the row should be removed

Company <- c("Nokia","Samsung","Moto","Blue Berry","Vivo","Redmi")
Exit_date <- c(NA, "2019-08-25", "2019-12-31", "2018-12-31","2020-09-15",NA)
data <- data.frame(Company,Exit_date)   
data$Exit_date <-as.Date(data$Exit_date,format = "%Y-%m-%d")

Fiscal years:

a = as.Date("2018-04-01")
b = as.Date("2019-04-01")
c = as.Date("2020-04-01")
d = as.Date("2021-04-01")
e = as.Date("2022-04-01")

I tried using if but didn't get any proper output:

if(data$exit_date > a & data$exit_date < b){
  data = data[!data$Exit_date,]
}else if (data$exit_date > b & data$exit_date < c){
  data = data[!data$Exit_date,]
}else if (data$exit_date > c & data$exit_date < d){
  data = data[!data$Exit_date,]
}else if (data$exit_date > d & data$exit_date < e){
  data = data[!data$Exit_date,]
}else if (data$Exit_date==NA){
}

Any idea would be appreciated.

CodePudding user response:

library(tidyverse)

Company <- c("Nokia", "Samsung", "Moto", "Blue Berry", "Vivo", "Redmi")
Exit_date <- c(NA, "2019-08-25", "2019-12-31", "2018-12-31", "2020-09-15", NA)
data <- data.frame(Company, Exit_date)
data$Exit_date <- as.Date(data$Exit_date, format = "%Y-%m-%d")

# must be in ascending order
a <- as.Date("2018-04-01")
b <- as.Date("2019-04-01")
c <- as.Date("2020-04-01")
d <- as.Date("2021-04-01")
e <- as.Date("2022-04-01")

data <-
  data %>%
  as_tibble() %>%
  mutate(
    left_fiscal = case_when(
      Exit_date < a ~ a,
      Exit_date < b ~ b,
      Exit_date < c ~ c,
      Exit_date < d ~ d,
      Exit_date < e ~ e
    )
  )
data
#> # A tibble: 6 x 3
#>   Company    Exit_date  left_fiscal
#>   <chr>      <date>     <date>     
#> 1 Nokia      NA         NA         
#> 2 Samsung    2019-08-25 2020-04-01 
#> 3 Moto       2019-12-31 2020-04-01 
#> 4 Blue Berry 2018-12-31 2019-04-01 
#> 5 Vivo       2020-09-15 2021-04-01 
#> 6 Redmi      NA         NA

# remove companies left in fiscal year 2018
data %>%
  filter(! left_fiscal == b)
#> # A tibble: 3 x 3
#>   Company Exit_date  left_fiscal
#>   <chr>   <date>     <date>     
#> 1 Samsung 2019-08-25 2020-04-01 
#> 2 Moto    2019-12-31 2020-04-01 
#> 3 Vivo    2020-09-15 2021-04-01

Created on 2022-02-22 by the reprex package (v2.0.0)

CodePudding user response:

The problem with your if statement is that data$Exit_date isn't a logical vector so you can't use data[!data$Exit_date,] to remove the rows.

The other thing is that adding up your if statements : it seems you want only exit date matching a, b, c, d or e (since you're removing everyting in between). But it doesn't seem to be the desired output?

To get the year they left, it's easier if your dates are in a vector : you can add up the conditions (>= to x date) to get the index of the year in your vector like so :

Fiscal_years <- c(
  as.Date("2018-04-01"),
  as.Date("2019-04-01"),
  as.Date("2020-04-01"),
  as.Date("2021-04-01"),
  as.Date("2022-04-01")
)
data <- within(data,{
  Exit_fiscal <- Fiscal_years[rowSums(sapply(Fiscal_years, function(x) Exit_date >= x)) 1]
})

You could also use the library lubridate to match years :

Fiscal_years <- c(
  "2018" = as.Date("2018-04-01"),
  "2019" = as.Date("2019-04-01"),
  "2020" = as.Date("2020-04-01"),
  "2021" = as.Date("2021-04-01"),
  "2022" = as.Date("2022-04-01")
)
library(lubridate)
data <- within(data,{
  Exit_fiscal <- Fiscal_years[as.character(year(Exit_date)   (month(Exit_date) >= 4))]
})

Then to filter :

data <- data[!data$Exit_fiscal %in% Fiscal_years["2019"],]
#>   Company  Exit_date Exit_fiscal
#> 1   Nokia       <NA>        <NA>
#> 2 Samsung 2019-08-25  2020-04-01
#> 3    Moto 2019-12-31  2020-04-01
#> 5    Vivo 2020-09-15  2021-04-01
#> 6   Redmi       <NA>        <NA>

Or if you don't want to keep NA values :

data <- data[which(data$Exit_fiscal != Fiscal_years["2019"]),]
#>   Company  Exit_date Exit_fiscal
#> 2 Samsung 2019-08-25  2020-04-01
#> 3    Moto 2019-12-31  2020-04-01
#> 5    Vivo 2020-09-15  2021-04-01
  • Related