Use na_if function in R with negative condition


I would like to replace values with NAs in column Value of a data frame using na_if, conditional on column Category. But instead of the condition used below, I would like to replace them when Category is not equal to "cat_1".

data_B <- data_A %>% 
  mutate(Value = na_if(Category, "cat_1"))

Can it be modified? Equality operators do not seem to work.

Note: na_if function keeps original values in a column whilst replacing part of them with NAs (it does not substitute Category values in the Value column in this example)

CodePudding user response:

I don't think it is directly possible with na_if, but you can use replace with != instead, or case_when with ==:

data.frame(Category = paste0("cat_", 1:4)) %>% 
  mutate(Value = replace(Category, Category != "cat_1", NA),
         Value2 = case_when(Category == "cat_1" ~ Category))


  Category Value Value2
1    cat_1 cat_1  cat_1
2    cat_2  <NA>   <NA>
3    cat_3  <NA>   <NA>
4    cat_4  <NA>   <NA>
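If the goal is to keep the entries already in Value, rather than copy Category into it, the same replace() idea can be pointed at the Value column instead. A minimal sketch, assuming a made-up data_A with a numeric Value column (the question does not show its data, so these values are hypothetical):

```r
library(dplyr)

# Hypothetical example data; the question does not show data_A
data_A <- data.frame(Category = paste0("cat_", 1:4),
                     Value = c(10, 20, 30, 40))

data_B <- data_A %>%
  # Keep Value where Category is "cat_1", set it to NA everywhere else
  mutate(Value = replace(Value, Category != "cat_1", NA))

data_B
```

Only the first row keeps its Value here, because it is the only row where Category is "cat_1".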

CodePudding user response:

If your variable is a factor or you're willing to convert:

library(dplyr)

df <- df |>
  mutate(Value = factor(Category, levels = "cat_1"), # values outside `levels` become NA
         Value2 = as.character(Value))               # Converting factor to character

# 'data.frame': 4 obs. of  3 variables:
#  $ Category: Factor w/ 4 levels "cat_1","cat_2",..: 1 2 3 4
#  $ Value   : Factor w/ 1 level "cat_1": 1 NA NA NA
#  $ Value2  : chr  "cat_1" NA NA NA

#   Category Value Value2
# 1    cat_1 cat_1  cat_1
# 2    cat_2  <NA>   <NA>
# 3    cat_3  <NA>   <NA>
# 4    cat_4  <NA>   <NA>


Data used above:

df <- data.frame(Category = factor(paste0("cat_", 1:4)))
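The factor approach works because factor() turns every value that is not listed in levels into NA. A small base R sketch of just that step (the vector x is made up for illustration):

```r
# Hypothetical input vector
x <- c("cat_1", "cat_2", "cat_3", "cat_1")

# Values missing from `levels` are converted to NA
f <- factor(x, levels = "cat_1")

# Back to character: "cat_1" is kept, everything else is NA
as.character(f)
```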

CodePudding user response:

In my opinion Maël's answer is the easiest solution. Another potential option is to write your own function: looking at the source code for na_if(), you could Negate() the vec_equal() call to create an na_if_not() function that retains the utility and behaviour of na_if().

Simple example:


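library(dplyr)
library(vctrs)

na_if_not <- function(x, y) {
  # Cast and recycle y so it is comparable with x, as na_if() does
  y <- vec_cast(x = y, to = x, x_arg = "y", to_arg = "x")
  y <- vec_recycle(y, size = vec_size(x), x_arg = "y")
  na <- vec_init(x)
  # Negated equality: TRUE wherever x differs from y
  vec_not_equal <- Negate(vec_equal)
  where <- vec_not_equal(x, y, na_equal = TRUE)
  x <- vec_assign(x, where, na)
  x
}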

df <- data.frame(Category = paste0("cat_", 1:4),
                 Value = paste0("cat_", 1:4),
                 Value2 = paste0("cat_", 1:4))
df %>% 
  mutate(Value = na_if_not(Value, "cat_1"),
         Value2 = na_if_not(Category, "cat_1"))
#>   Category Value Value2
#> 1    cat_1 cat_1  cat_1
#> 2    cat_2  <NA>   <NA>
#> 3    cat_3  <NA>   <NA>
#> 4    cat_4  <NA>   <NA>

Created on 2022-09-30 by the reprex package (v2.0.1)
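For plain atomic vectors, roughly the same result can be had without vctrs. This is a simplified sketch, not the dplyr implementation: na_if_not_base is a hypothetical name, it assumes y is a single value, and it skips the type casting and recycling checks that the version above performs.

```r
# Hypothetical base R helper: keep values equal to y, set the rest to NA
na_if_not_base <- function(x, y) {
  # %in% returns FALSE (never NA) for missing values, so existing NAs stay NA
  x[!(x %in% y)] <- NA
  x
}

na_if_not_base(paste0("cat_", 1:4), "cat_1")
na_if_not_base(c(1, 2, NA, 1), 1)
```

It can be dropped into mutate() the same way, e.g. mutate(Value = na_if_not_base(Value, "cat_1")).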

Replacing "Setosa" (na_if()) and everything but "Setosa" (na_if_not()) in place:


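library(dplyr)
library(vctrs)

na_if_not <- function(x, y) {
  # Cast and recycle y so it is comparable with x, as na_if() does
  y <- vec_cast(x = y, to = x, x_arg = "y", to_arg = "x")
  y <- vec_recycle(y, size = vec_size(x), x_arg = "y")
  na <- vec_init(x)
  # Negated equality: TRUE wherever x differs from y
  vec_not_equal <- Negate(vec_equal)
  where <- vec_not_equal(x, y, na_equal = TRUE)
  x <- vec_assign(x, where, na)
  x
}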

# na_if() example
iris %>%
  head() %>%
  mutate(Species = c("Setosa", "virginica", "versicolor",
                     "Setosa", "virginica", "versicolor")) %>% 
  mutate(Species = na_if(Species, "Setosa"))
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
#> 1          5.1         3.5          1.4         0.2       <NA>
#> 2          4.9         3.0          1.4         0.2  virginica
#> 3          4.7         3.2          1.3         0.2 versicolor
#> 4          4.6         3.1          1.5         0.2       <NA>
#> 5          5.0         3.6          1.4         0.2  virginica
#> 6          5.4         3.9          1.7         0.4 versicolor

# na_if_not() example
iris %>%
  head() %>%
  mutate(Species = c("Setosa", "virginica", "versicolor",
                     "Setosa", "virginica", "versicolor")) %>% 
  mutate(Species = na_if_not(Species, "Setosa"))
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2  Setosa
#> 2          4.9         3.0          1.4         0.2    <NA>
#> 3          4.7         3.2          1.3         0.2    <NA>
#> 4          4.6         3.1          1.5         0.2  Setosa
#> 5          5.0         3.6          1.4         0.2    <NA>
#> 6          5.4         3.9          1.7         0.4    <NA>

Created on 2022-09-30 by the reprex package (v2.0.1)
