I was having an oddly difficult time finding the answer to this probably much-answered question, so here is my query. Suppose I have a data frame like the one below:
df <- data.frame(age = c(30,40,-999,40,20),
money.usd = c(-999,55,55,54,30),
cars = c(1,1,2,0,-999))
Filtering and mutating values is straightforward for single variables. For example, with an ifelse statement, I can turn the -999 in age to NA in the following way:
library(dplyr)
df %>%
  # use a real NA, not the string "NA", so age stays numeric
  mutate(age = ifelse(age == -999, NA, age))
However, since all of these variables contain this value and have different names, I was curious how I can apply this sort of mutation across several variables. Additionally, when there are many similarly named variables alongside many dissimilar ones, I imagine the case is more complicated, but there are certainly ways to make it easier. For example, suppose I have the following data with three variables for "car":
df.2 <- data.frame(age = c(30,40,-999,40,20),
money.usd = c(-999,55,55,54,30),
cars.1 = c(1,1,2,0,-999),
cars.2 = c(0,1,-999,0,0),
cars.3 = c(-999,5,4,5,4))
How would one mutate the values for both age and money.usd while also selecting several of the car variables in order to replace the -999 value? To summarize, my main objective is to switch every -999 across the data frame to NA.
CodePudding user response:
You can use dplyr::na_if to replace a value with NA, and across to apply it to multiple columns.
library(dplyr)
df.2 %>%
mutate(across(everything(), ~ na_if(.x, -999)))
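The question also asks about targeting age, money.usd, and several of the cars columns rather than every column. A minimal sketch of that, assuming dplyr 1.0 or later (for across) and assuming the car columns all share the "cars" prefix as in df.2; the names cleaned and cleaned_numeric are just illustrative:

```r
library(dplyr)

df.2 <- data.frame(age = c(30, 40, -999, 40, 20),
                   money.usd = c(-999, 55, 55, 54, 30),
                   cars.1 = c(1, 1, 2, 0, -999),
                   cars.2 = c(0, 1, -999, 0, 0),
                   cars.3 = c(-999, 5, 4, 5, 4))

# Name individual columns and pick up every cars.* column with a tidyselect helper
cleaned <- df.2 %>%
  mutate(across(c(age, money.usd, starts_with("cars")), ~ na_if(.x, -999)))

# Alternative: touch only numeric columns, leaving any other column types alone
cleaned_numeric <- df.2 %>%
  mutate(across(where(is.numeric), ~ na_if(.x, -999)))
```

Other tidyselect helpers such as matches() or a range like cars.1:cars.3 should work in the same position if the column names follow a different pattern.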
If you want a replacement value other than NA, use replace (the example below still uses NA, so the output is the same):
df.2 %>%
mutate(across(everything(), ~ replace(.x, .x == -999, NA)))
  age money.usd cars.1 cars.2 cars.3
1  30        NA      1      0     NA
2  40        55      1      1      5
3  NA        55      2     NA      4
4  40        54      0      0      5
5  20        30     NA      0      4
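To illustrate the "other than NA" case, here is a small sketch that swaps -999 for 0 in the cars columns only. The choice of 0 and the name zeroed are purely hypothetical; substitute whatever value makes sense for your data:

```r
library(dplyr)

df.2 <- data.frame(age = c(30, 40, -999, 40, 20),
                   money.usd = c(-999, 55, 55, 54, 30),
                   cars.1 = c(1, 1, 2, 0, -999),
                   cars.2 = c(0, 1, -999, 0, 0),
                   cars.3 = c(-999, 5, 4, 5, 4))

# replace() takes the vector, a logical index, and the new value;
# 0 is an arbitrary stand-in for "some value other than NA"
zeroed <- df.2 %>%
  mutate(across(starts_with("cars"), ~ replace(.x, .x == -999, 0)))
```

Columns outside the selection (age and money.usd here) keep their -999 values.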
Or in base R:
df.2[df.2 == -999] <- NA
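The base R approach can also be limited to selected columns. A rough sketch, assuming the car columns can be matched by the "^cars" pattern; cols is just an illustrative helper variable:

```r
df.2 <- data.frame(age = c(30, 40, -999, 40, 20),
                   money.usd = c(-999, 55, 55, 54, 30),
                   cars.1 = c(1, 1, 2, 0, -999),
                   cars.2 = c(0, 1, -999, 0, 0),
                   cars.3 = c(-999, 5, 4, 5, 4))

# Collect the target column names: two explicit ones plus everything starting with "cars"
cols <- c("age", "money.usd", grep("^cars", names(df.2), value = TRUE))

# Build a logical mask over just those columns and assign NA where it is TRUE
df.2[cols][df.2[cols] == -999] <- NA
```

This needs no packages, which can be handy in scripts where loading dplyr is not an option.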