I have a data frame that has not been converted to z-scores. I want to replace with NA only those values that are greater than or equal to 4, without dropping any rows or columns. I would appreciate an answer.
Best
CodePudding user response:
You can use the following code:
df <- data.frame(v1 = c(1, 3, 6, 7, 3),
                 v2 = c(2, 1, 4, 6, 7),
                 v3 = c(1, 2, 3, 4, 5))
df
#>   v1 v2 v3
#> 1  1  2  1
#> 2  3  1  2
#> 3  6  4  3
#> 4  7  6  4
#> 5  3  7  5
is.na(df) <- df >= 4
df
#>   v1 v2 v3
#> 1  1  2  1
#> 2  3  1  2
#> 3 NA NA  3
#> 4 NA NA NA
#> 5  3 NA NA
Created on 2022-07-10 by the reprex package (v2.0.1)
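If the data frame also holds non-numeric columns, comparing every column against 4 may not be what you want. The following is a minimal sketch, assuming a made-up data frame with a character column `id` (not part of the question), that limits the replacement to the numeric columns:

```r
# Hypothetical data frame with one non-numeric column (not from the question)
df <- data.frame(id = c("a", "b", "c"),
                 v1 = c(1, 5, 3),
                 v2 = c(4, 2, 7))

# Flag the numeric columns
num_cols <- vapply(df, is.numeric, logical(1))

# In the numeric columns only, replace values >= 4 with NA
df[num_cols] <- lapply(df[num_cols], function(x) replace(x, x >= 4, NA))
df
```

The `id` column is left as it is, and no rows or columns are dropped.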
CodePudding user response:
You can simply use df[df >= 4] <- NA to achieve what you want.
> df <- data.frame(replicate(10, sample(0:10, 10, replace = TRUE)))
> df
   X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1   2  3  4  5  6  4  3  1 10   6
2   5  7  0  4  3 10 10  3  6  10
3   5  5  0  3  1  3  5  7  2   7
4   7  0  4  1 10  0  5  2  5   0
5   8  8  7  8  4  6  6 10 10   0
6   1  4  1  3  3  8  8  0  4   8
7   6  3  3  6  7  4 10  9  7   2
8   2  1  4  0  7  8 10  1  6   3
9   0  9  6  2  9  6  2  9  0   3
10  8  2  1  0  1  4  0  6  2   8
> df[df >= 4] <- NA
> df
   X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1   2  3 NA NA NA NA  3  1 NA  NA
2  NA NA  0 NA  3 NA NA  3 NA  NA
3  NA NA  0  3  1  3 NA NA  2  NA
4  NA  0 NA  1 NA  0 NA  2 NA   0
5  NA NA NA NA NA NA NA NA NA   0
6   1 NA  1  3  3 NA NA  0 NA  NA
7  NA  3  3 NA NA NA NA NA NA   2
8   2  1 NA  0 NA NA NA  1 NA   3
9   0 NA NA  2 NA NA  2 NA  0   3
10 NA  2  1  0  1 NA  0 NA  2  NA
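It may help to see why this works: df >= 4 returns a logical matrix with the same dimensions as the data frame, and that matrix is used as the index. Here is a small sketch, reusing the v1/v2/v3 example from the first answer, that builds the mask once (the names `mask`, `a` and `b` are just for illustration) and applies it both ways:

```r
df <- data.frame(v1 = c(1, 3, 6, 7, 3),
                 v2 = c(2, 1, 4, 6, 7),
                 v3 = c(1, 2, 3, 4, 5))

# Logical matrix: TRUE wherever the value is >= 4
mask <- df >= 4

# Approach 1: index with the mask and assign NA
a <- df
a[mask] <- NA

# Approach 2: mark the same cells as missing with is.na<-
b <- df
is.na(b) <- mask

# Both approaches should give the same data frame
identical(a, b)
```

Either form keeps every row and column and only blanks out the matching cells.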
CodePudding user response:
Although the solution by @Quinten is very concise, here is an alternative approach using the tidyverse:
library(dplyr)
set.seed(123)
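df <- data.frame(
  x = sample(1:10, 7),
  y = sample(1:10, 7)
)
df %>%
  mutate(
    # everything() selects all columns explicitly; omitting .cols is deprecated in dplyr >= 1.1.0
    across(everything(), ~ if_else(.x >= 4, NA_integer_, .x))
  )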
#>    x  y
#> 1  3 NA
#> 2 NA NA
#> 3  2  1
#> 4 NA  2
#> 5 NA  3
#> 6 NA NA
#> 7  1 NA
Created on 2022-07-10 by the reprex package (v2.0.1)
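Note that NA_integer_ ties the snippet above to integer columns. As a rough sketch, assuming a made-up data frame that mixes double, integer and character columns (none of this is from the question), base replace() inside across() avoids picking a typed NA, and where(is.numeric) skips the non-numeric column:

```r
library(dplyr)

# Hypothetical mixed-type data frame (not from the question)
df <- data.frame(x = c(1.5, 4, 6.2),    # double column
                 y = c(2L, 9L, 3L),     # integer column
                 g = c("a", "b", "c"))  # character column, left untouched

out <- df %>%
  mutate(across(where(is.numeric), ~ replace(.x, .x >= 4, NA)))
out
```

Each numeric column keeps its original type, since replace() only overwrites the selected elements.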