I would like to replace values with NA in the Value column of a data frame using na_if, conditional on the Category column. But instead of the condition used below, I would like to replace them where Category is not equal to "cat_1".
data_B <- data_A %>%
  mutate(Value = na_if(Category, "cat_1"))
Can it be modified? Equality operators do not seem to work.
Note: the na_if function keeps the original values in a column whilst replacing part of them with NAs (it does not substitute Category values in the Value column in this example).
CodePudding user response:
I don't think it is directly possible with na_if, but you can use replace with != instead, or case_when with ==:
library(dplyr)
data.frame(Category = paste0("cat_", 1:4)) %>%
  mutate(Value = replace(Category, Category != "cat_1", NA),
         Value2 = case_when(Category == "cat_1" ~ Category))
Output:
  Category Value Value2
1    cat_1 cat_1  cat_1
2    cat_2  <NA>   <NA>
3    cat_3  <NA>   <NA>
4    cat_4  <NA>   <NA>
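The question describes a Value column that holds its own data rather than a copy of Category. A minimal sketch of the same replace approach for that case follows; the numbers in Value are hypothetical, chosen only for illustration, and data_A / data_B are the names used in the question.

```r
library(dplyr)

# Hypothetical data: Value holds its own numbers, independent of Category
data_A <- data.frame(Category = paste0("cat_", 1:4),
                     Value = c(10, 20, 30, 40))

# Set Value to NA wherever Category is not "cat_1"; other values are kept
data_B <- data_A %>%
  mutate(Value = replace(Value, Category != "cat_1", NA))

data_B
#>   Category Value
#> 1    cat_1    10
#> 2    cat_2    NA
#> 3    cat_3    NA
#> 4    cat_4    NA
```

The condition is evaluated on Category while the replacement happens in Value, so the two columns do not need to share a type.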
CodePudding user response:
If your variable is a factor, or you're willing to convert it:
library(dplyr)
df <- df |>
  mutate(
    Value = factor(Category, levels = "cat_1"),  # values outside `levels` become NA
    Value2 = as.character(Value)                 # convert the factor back to character
  )
# 'data.frame': 4 obs. of 3 variables:
# $ Category: Factor w/ 4 levels "cat_1","cat_2",..: 1 2 3 4
# $ Value : Factor w/ 1 level "cat_1": 1 NA NA NA
# $ Value2 : chr "cat_1" NA NA NA
#   Category Value Value2
# 1    cat_1 cat_1  cat_1
# 2    cat_2  <NA>   <NA>
# 3    cat_3  <NA>   <NA>
# 4    cat_4  <NA>   <NA>
Data:
df = data.frame(Category = factor(paste0("cat_", 1:4)))
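The factor approach works because factor() turns any value not listed in levels into NA. As a small sketch of how this could extend to keeping more than one category (the choice of "cat_3" and the name kept are hypothetical, for illustration only):

```r
# Same data as in this answer
df <- data.frame(Category = factor(paste0("cat_", 1:4)))

# Hypothetical variation: keep two categories, everything else becomes NA
kept <- factor(df$Category, levels = c("cat_1", "cat_3"))

# Convert back to character; the second and fourth elements are NA
as.character(kept)
```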
CodePudding user response:
In my opinion Maël's answer is the easiest solution, but another potential option is to create your own function. Looking at the source code for na_if(), you could Negate() the vec_equal() call to create your own na_if_not() function and still retain the utility and behaviour of na_if().
Simple example:
library(tidyverse)
library(vctrs)
na_if_not <- function(x, y) {
  y <- vec_cast(x = y, to = x, x_arg = "y", to_arg = "x")
  y <- vec_recycle(y, size = vec_size(x), x_arg = "y")
  na <- vec_init(x)
  vec_not_equal <- Negate(vec_equal)
  where <- vec_not_equal(x, y, na_equal = TRUE)
  x <- vec_assign(x, where, na)
  x
}
df <- data.frame(Category = paste0("cat_", 1:4),
                 Value = paste0("cat_", 1:4),
                 Value2 = paste0("cat_", 1:4))
df %>%
  mutate(Value = na_if_not(Value, "cat_1"),
         Value2 = na_if_not(Category, "cat_1"))
#>   Category Value Value2
#> 1    cat_1 cat_1  cat_1
#> 2    cat_2  <NA>   <NA>
#> 3    cat_3  <NA>   <NA>
#> 4    cat_4  <NA>   <NA>
Created on 2022-09-30 by the reprex package (v2.0.1)
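If depending on vctrs is not desirable, a rough base R approximation is possible. This is only a sketch: the name na_if_not_base is hypothetical, it assumes y is a single non-missing value, and it skips the type casting and size checks that the vctrs-based version performs.

```r
# Hypothetical helper: a base R approximation of na_if_not()
na_if_not_base <- function(x, y) {
  # Mark every element that is missing or differs from y
  where <- is.na(x) | x != y
  x[where] <- NA
  x
}

# Elements equal to "cat_1" are kept; all others become NA
res <- na_if_not_base(c("cat_1", "cat_2", NA, "cat_1"), "cat_1")
res
```

It can be used inside mutate() in the same way as na_if_not().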
Replacing "setosa's" (na_if()) and "everything-but-setosa's" (na_if_not()) in place:
library(tidyverse)
library(vctrs)
na_if_not <- function(x, y) {
  y <- vec_cast(x = y, to = x, x_arg = "y", to_arg = "x")
  y <- vec_recycle(y, size = vec_size(x), x_arg = "y")
  na <- vec_init(x)
  vec_not_equal <- Negate(vec_equal)
  where <- vec_not_equal(x, y, na_equal = TRUE)
  x <- vec_assign(x, where, na)
  x
}
# na_if() example
iris %>%
  head() %>%
  mutate(Species = c("Setosa", "virginica", "versicolor",
                     "Setosa", "virginica", "versicolor")) %>%
  mutate(Species = na_if(Species, "Setosa"))
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
#> 1          5.1         3.5          1.4         0.2       <NA>
#> 2          4.9         3.0          1.4         0.2  virginica
#> 3          4.7         3.2          1.3         0.2 versicolor
#> 4          4.6         3.1          1.5         0.2       <NA>
#> 5          5.0         3.6          1.4         0.2  virginica
#> 6          5.4         3.9          1.7         0.4 versicolor
# na_if_not() example
iris %>%
  head() %>%
  mutate(Species = c("Setosa", "virginica", "versicolor",
                     "Setosa", "virginica", "versicolor")) %>%
  mutate(Species = na_if_not(Species, "Setosa"))
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2  Setosa
#> 2          4.9         3.0          1.4         0.2    <NA>
#> 3          4.7         3.2          1.3         0.2    <NA>
#> 4          4.6         3.1          1.5         0.2  Setosa
#> 5          5.0         3.6          1.4         0.2    <NA>
#> 6          5.4         3.9          1.7         0.4    <NA>