How to transform values into NA from a data.frame, based on an external list, using R?

While transforming a data frame in R (RStudio), I wanted to set the values in one column to NA whenever they appear in a list of numbers. This list (I believe it is a list) comes from boxplot.stats(x)$out.

So this is what I did to get a variable with a list of the numbers from the boxplot:

age_outofrange <- boxplot.stats(census$age)$out

And this is what I coded. I used unique(x) because some ages were repeated:

census["age"][census["age"] == unique(age_outofrange), ] <- NA

census -> Dataframe

age -> The target column

This is an example of my current dataframe:

index|age    
1|34
2|79
3|80
4|23
5|650
6|44
7|560
8|12
9|65
10|79

This is what I am expecting (I write a new csv and nothing happens):

index|age    
1|34
2|NA
3|NA
4|23
5|NA
6|44
7|NA
8|12
9|65
10|NA

So the values I want replaced are 79, 80, 650, and 560, which are the values in age_outofrange. I also tried something like the following code, but nothing happened (at least according to the csv). A few values were changed, but the vast majority were not:

df <- df$column[-listvalue, ]

Do you know how to code it right? Thank you for your answers!

Answer:

We need [[ to extract the column as a vector. In addition, == should be replaced with %in% when 'age_outofrange' holds more than one unique value: == compares element by element and recycles the shorter vector, whereas %in% checks each age against the whole set.

census[["age"]][census[["age"]] %in% unique(age_outofrange)] <- NA

Output:

> census
   index age
1      1  34
2      2  NA
3      3  NA
4      4  23
5      5  NA
6      6  44
7      7  NA
8      8  12
9      9  65
10    10  NA
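
To see why == misses the matches, here is a minimal sketch on plain vectors. The names ages, outliers, eq_hits and in_hits are made up for this illustration and do not appear in the original post. It compares the two operators on the same ten ages:

```r
# Illustrative vectors (hypothetical names), same values as the sample data
ages <- c(34L, 79L, 80L, 23L, 650L, 44L, 560L, 12L, 65L, 79L)
outliers <- c(79, 80, 650, 560)

# `==` recycles the shorter vector, so ages[i] is compared with exactly one
# element of `outliers` (79, 80, 650, 560, 79, 80, ...), not with all of them.
# R also warns here because 10 is not a multiple of 4.
eq_hits <- suppressWarnings(ages == outliers)

# `%in%` tests every element of `ages` against the whole set.
in_hits <- ages %in% outliers

which(eq_hits)  # no positions match
which(in_hits)  # positions 2, 3, 5, 7, 10
```

With == a value is only caught if it happens to line up with the same value in the recycled vector, which would explain why only a few rows change on a larger data set.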

Data:

census <- structure(list(index = 1:10, age = c(34L, 79L, 80L, 23L, 650L, 
44L, 560L, 12L, 65L, 79L)), class = "data.frame", row.names = c(NA, 
-10L))
age_outofrange <- c(79, 80, 650, 560)
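
For completeness, a hedged end-to-end sketch that computes the outliers with boxplot.stats() instead of hard-coding them. The helper variable flagged is an assumption introduced for this example. Note that on this ten-row sample the flagged set need not match the four values quoted in the question, which presumably come from the full data.

```r
# Rebuild the sample data frame
census <- data.frame(
  index = 1:10,
  age = c(34L, 79L, 80L, 23L, 650L, 44L, 560L, 12L, 65L, 79L)
)

# boxplot.stats()$out is a plain atomic vector, not a list
age_outofrange <- grDevices::boxplot.stats(census$age)$out
is.list(age_outofrange)

# Build a logical mask of rows whose age is among the outliers,
# then overwrite those entries with NA. unique() is not needed with %in%.
flagged <- census$age %in% age_outofrange
census$age[flagged] <- NA

# An equivalent form using replace() would be:
# census$age <- replace(census$age, census$age %in% age_outofrange, NA)

census
```

Keeping the mask in its own variable makes it easy to check how many rows will change (sum(flagged)) before overwriting anything.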