R - change the first two times of a variable in each row into NAs

Time:03-16

I want the percentage of ones for each year, i.e. the percentage for each column. My problem is that I have to exclude the first two ones in each row, because at that point the individuals are too young to be included in my analysis. I tried to change the first two ones into NAs, so that I still know there was a one but it is not included in my calculations. The first six rows of my data set (df) look like this:

      2007 2008 2009 2010 2011 2012 2013 2014
    1    1    1    1    1    1    1    1    1
    2    0    1    1    1    0    0    0    0
    3    1    1    1    1    1    1    1    1
    4    1    1    1    0    0    0    0    0
    5    0    1    1    1    0    0    0    0
    6    1    1    1    1    1    1    1    1

The data set should look like this (expected output):

      2007 2008 2009 2010 2011 2012 2013 2014
    1   NA   NA    1    1    1    1    1    1
    2    0   NA   NA    1    0    0    0    0
    3   NA   NA    1    1    1    1    1    1
    4   NA   NA    1    0    0    0    0    0
    5    0   NA   NA    1    0    0    0    0
    6   NA   NA    1    1    1    1    1    1

I tried different formulas. Most of them did not work at all. The following code at least ran, but it did not change my data set. Any help would be much appreciated.

 df2 <- df %>% 
  transmute(across(.cols = everything(), .fns = NULL, 
                   (length(x<-which(myRow == 1)) == length(x 1)), NA))

I also tried the following but there I got an error:

 df3 <- transmute_if (df,(length(x<-which(myRow == 1)) == length(x 1)), return(NA))

Error: .predicate must have length 1, not 14.

CodePudding user response:

Here is a base R way.

df1 <- read.table(text = "
2007 2008 2009 2010 2011 2012 2013 2014
   1    1    1    1    1   1     1    1    1
   2    0    1    1    1   0     0    0    0
   3    1    1    1    1   1     1    1    1
   4    1    1    1    0   0     0    0    0
   5    0    1    1    1   0     0    0    0
   6    1    1    1    1   1     1    1    1
", header = TRUE, check.names = FALSE)

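# Set the first two 1s of a row to NA (or the only 1, if there is just one).
f <- function(x){
  i <- which(x == 1)
  if(length(i) == 1L) {
    is.na(x) <- i
  } else if (length(i) >= 2L) {
    is.na(x) <- i[1:2]
  }
  x
}
# apply() over rows returns each result as a column, so transpose back.
t(apply(df1, 1, f))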
#>   2007 2008 2009 2010 2011 2012 2013 2014
#> 1   NA   NA    1    1    1    1    1    1
#> 2    0   NA   NA    1    0    0    0    0
#> 3   NA   NA    1    1    1    1    1    1
#> 4   NA   NA    1    0    0    0    0    0
#> 5    0   NA   NA    1    0    0    0    0
#> 6   NA   NA    1    1    1    1    1    1

Created on 2022-03-15 by the reprex package (v2.0.1)
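
The question's actual goal is the percentage of ones per year, which the answer stops short of. One possible follow-up is `colMeans()` with `na.rm = TRUE`, so that masked cells drop out of both the numerator and the denominator. The following is a minimal sketch on the six example rows; `mask_first_ones`, `masked` and `pct` are illustrative names that do not appear in the original answer, and the helper is a compact variant of `f` that relies on `head()` returning at most `n` positions.

```r
# Example data copied from the question (the first column holds the row names).
df1 <- read.table(text = "
2007 2008 2009 2010 2011 2012 2013 2014
1    1    1    1    1   1     1    1    1
2    0    1    1    1   0     0    0    0
3    1    1    1    1   1     1    1    1
4    1    1    1    0   0     0    0    0
5    0    1    1    1   0     0    0    0
6    1    1    1    1   1     1    1    1
", header = TRUE, check.names = FALSE)

# Hypothetical helper: set the first `n` ones of a vector to NA.
mask_first_ones <- function(x, n = 2L) {
  i <- head(which(x == 1), n)  # positions of at most n leading ones
  x[i] <- NA                   # no-op when the row contains no ones
  x
}

# Mask row by row; apply() returns results as columns, so transpose back.
masked <- t(apply(df1, 1, mask_first_ones))

# Percentage of ones per year, ignoring the masked (NA) cells.
pct <- 100 * colMeans(masked, na.rm = TRUE)
```

Note that a year in which every cell is masked has no observations left, so `colMeans()` returns `NaN` for it. In this small example that happens for 2008; with the full data set it may not occur.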
