Replace NA by 0 for selected columns based on a years range in R

Time:08-08

Following my question here: *

I would like to replace the NA values with 0 when Year >= 1997 & Year <= 2010.

Here is a subset of the dataset:

    structure(list(gid = c(79600, 79600, 79600, 79600, 79600, 79600, 
79600, 79600, 79600, 79600, 79600, 79600, 79600, 79600, 79600, 
79600, 79600, 79600, 79600, 79600, 79600, 79600, 79600, 79600, 
79600, 79600, 79600, 79600, 79600, 79600, 79601, 79601, 79601, 
79601, 79601, 79601, 79601, 79601, 79601, 79601, 79601, 79601, 
79601, 79601, 79601, 79601, 79601, 79601, 79601, 79601, 79601, 
79601, 79601, 79601, 79601, 79601, 79601, 79601, 79601, 79601
), Year = c(1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 
1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 
2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 1981, 
1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 
1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 
2004, 2005, 2006, 2007, 2008, 2009, 2010), TotalConflicts = c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 2, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), IncidenceConflicts = c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), row.names = c(NA, 
-60L), class = c("tbl_df", "tbl", "data.frame"))

Please note that the column IncidenceConflicts is binary: its values are 1, 0, or NA.

I tried this, but it does not work:

test <- test %>%
  mutate(across(3:4, ~ replace(.x, .x == "NA" & Year >= 1997, 0)))

CodePudding user response:

Updated code after clarification:

ifelse will do that:

library(dplyr)

# Set NA to 0 only in rows where Year falls within 1997-2010
df %>%
  mutate(across(3:4, ~ ifelse((Year >= 1997 & Year <= 2010)
                              & is.na(.), 0, .)))
     gid  Year TotalConflicts IncidenceConflicts
   <dbl> <dbl>          <dbl>              <dbl>
 1 79600  1981             NA                 NA
 2 79600  1982             NA                 NA
 3 79600  1983             NA                 NA
 4 79600  1984             NA                 NA
 5 79600  1985             NA                 NA
 6 79600  1986             NA                 NA
 7 79600  1987             NA                 NA
 8 79600  1988             NA                 NA
 9 79600  1989             NA                 NA
10 79600  1990             NA                 NA
11 79600  1991             NA                 NA
12 79600  1992             NA                 NA
13 79600  1993             NA                 NA
14 79600  1994             NA                 NA
15 79600  1995             NA                 NA
16 79600  1996             NA                 NA
17 79600  1997              0                  0
18 79600  1998              0                  0
19 79600  1999              0                  0
20 79600  2000              0                  0
21 79600  2001              0                  0
22 79600  2002              0                  0
23 79600  2003              0                  0
24 79600  2004              0                  0
25 79600  2005              0                  0
26 79600  2006              0                  0
27 79600  2007              0                  0
28 79600  2008              0                  0
29 79600  2009              0                  0
30 79600  2010              0                  0
31 79601  1981             NA                 NA
32 79601  1982             NA                 NA
33 79601  1983             NA                 NA
34 79601  1984             NA                 NA
35 79601  1985             NA                 NA
36 79601  1986             NA                 NA
37 79601  1987             NA                 NA
38 79601  1988             NA                 NA
39 79601  1989             NA                 NA
40 79601  1990             NA                 NA
41 79601  1991             NA                 NA
42 79601  1992             NA                 NA
43 79601  1993             NA                 NA
44 79601  1994             NA                 NA
45 79601  1995             NA                 NA
46 79601  1996             NA                 NA
47 79601  1997              0                  0
48 79601  1998              0                  0
49 79601  1999              2                  1
50 79601  2000              0                  0
51 79601  2001              0                  0
52 79601  2002              0                  0
53 79601  2003              0                  0
54 79601  2004              0                  0
55 79601  2005              0                  0
56 79601  2006              0                  0
57 79601  2007              0                  0
58 79601  2008              0                  0
59 79601  2009              0                  0
60 79601  2010              0                  0
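
As for why the original attempt changes nothing: NA is a missing-value marker, not the string "NA", so .x == "NA" evaluates to NA (never TRUE) for the missing entries and replace() has nothing to act on. Here is a minimal base R sketch of the difference. The vectors x and year are made up for illustration and are not the real data:

```r
# Hypothetical toy vectors standing in for one column and the Year column
x    <- c(NA, 2, NA)
year <- c(1996, 1999, 2005)

in_range <- year >= 1997 & year <= 2010

# Comparing with the string "NA" yields NA for missing entries, never TRUE
string_test <- x == "NA"
# is.na() is the test that actually detects missing entries
na_test <- is.na(x)

# With the string comparison, no element is selected, so x comes back unchanged
unchanged <- replace(x, string_test & in_range, 0)

# With is.na(), only the missing value inside the year range is filled
fixed <- replace(x, na_test & in_range, 0)

print(unchanged)
print(fixed)
```

Here the first NA stays NA because 1996 is outside the range, the 2 is left alone because it is not missing, and only the last NA becomes 0.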

CodePudding user response:

df %>%
  mutate(across(3:4, ~ case_when(
    between(Year, 1997, 2010) & is.na(.x) ~ 0,
    TRUE ~ .x
  )))

# A tibble: 19 × 4
     gid  Year TotalConflicts IncidenceConflicts
   <dbl> <dbl>          <dbl>              <dbl>
 1 79600  1992             NA                 NA
 2 79600  1993             NA                 NA
 3 79600  1994             NA                 NA
 4 79600  1995             NA                 NA
 5 79600  1996             NA                 NA
 6 79600  1997              0                  0
 7 79600  1998              0                  0
 8 79600  1999              0                  0
 9 79600  2000              0                  0
10 79600  2001              0                  0
11 79600  2002              0                  0
12 79600  2003              0                  0
13 79600  2004              0                  0
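
The same condition also works with the replace() call from the question, as long as the missing-value test is is.na(). The sketch below assumes dplyr is installed and uses a hypothetical three-row tibble in place of the real data. It also selects the columns by name rather than by position (3:4), which may be safer if the column order ever changes:

```r
library(dplyr)

# Hypothetical three-row stand-in for the question's data
test <- tibble(
  gid                = c(79600, 79600, 79601),
  Year               = c(1996, 1997, 1999),
  TotalConflicts     = c(NA, NA, 2),
  IncidenceConflicts = c(NA, NA, 1)
)

# Fill NA with 0 only where Year lies in 1997-2010 (both ends inclusive)
result <- test %>%
  mutate(across(
    c(TotalConflicts, IncidenceConflicts),
    ~ replace(.x, is.na(.x) & between(Year, 1997, 2010), 0)
  ))

print(result)
```

The 1996 row keeps its NA values, the 1997 row is filled with 0, and the existing values 2 and 1 in the 1999 row are left untouched.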

CodePudding user response:

library(data.table)

COLS <- c("TotalConflicts", "IncidenceConflicts")
setDT(dt)[between(Year, 1997, 2010),
      (COLS) := lapply(.SD, function(x) fifelse(is.na(x), 0, x)), .SDcols = COLS][]
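
If adding a package is not an option, the same idea (restrict to the year range, then fill the NA values in the chosen columns) can be sketched in base R. The data frame below is a hypothetical miniature of the question's data, not the real dataset:

```r
# Hypothetical miniature of the question's data
df <- data.frame(
  gid                = c(79600, 79600, 79601, 79601),
  Year               = c(1996, 1997, 1999, 2010),
  TotalConflicts     = c(NA, NA, 2, NA),
  IncidenceConflicts = c(NA, NA, 1, NA)
)

cols <- c("TotalConflicts", "IncidenceConflicts")

# Rows whose Year falls within 1997-2010, both ends inclusive
in_range <- df$Year >= 1997 & df$Year <= 2010

# For each selected column, set in-range NA values to 0
df[cols] <- lapply(df[cols], function(x) replace(x, in_range & is.na(x), 0))

print(df)
```

This runs on an ordinary data.frame with no packages loaded. Note that the data.table version above updates the table by reference, whereas this one reassigns the columns of df.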