Home > front end >  Get the rowwise minimum of certain columns excluding 0 and NA
Get the rowwise minimum of certain columns excluding 0 and NA

Time:11-09

I have made a very complex solution to something I feel should have a much simpler solution. In short what I want:

  • I want to compute a new column containing the minimum value across 3 columns
  • I want to ignore zeros and NAs
  • If I only have zeros and NAs I want a zero
  • If I have only NAs I want a NA

Here is my solution, it works, but it is very complex and produces a warning.

> library(dplyr)

> df <- data.frame(
    id = c(1, 2, 3, 4),
    test1 = c( NA, NA, 2 , 3), 
    test2 = c( NA, 0, 1 , 1), 
    test3 = c(NA, NA, 0 , 2) 
  )

> df2 <- df %>% 
    mutate(nieuw = apply(across(test1:test3), 1, function(x) min(x[x>0]))) %>% 
    rowwise() %>% 
    mutate(nieuw = if_else(is.na(nieuw), max(across(test1:test3), na.rm = TRUE), nieuw)) %>% 
    mutate(nieuw = ifelse(is.infinite(nieuw), NA, nieuw))

> df
  id test1 test2 test3
1  1    NA    NA    NA
2  2    NA     0    NA
3  3     2     1     0
4  4     3     1     2

> df2
# A tibble: 4 x 5
# Rowwise: 
     id test1 test2 test3 nieuw
  <dbl> <dbl> <dbl> <dbl> <dbl>
1     1    NA    NA    NA    NA
2     2    NA     0    NA     0
3     3     2     1     0     1
4     4     3     1     2     1
Warning message:
Problem while computing `nieuw = if_else(...)`.
i no non-missing arguments to max; returning -Inf
i The warning occurred in row 1. 

CodePudding user response:

You can create a helper function and then apply it rowwise:

library(dplyr)
safe <- function(x, f, ...) ifelse(all(is.na(x)), NA, 
                                   ifelse(all(is.na(x) | x == 0), 
                                          0, f(x[x > 0], na.rm = TRUE, ...)))
                                      
df %>% 
  rowwise() %>% 
  mutate(a = safe(c_across(test1:test3), min))

# A tibble: 4 × 5
# Rowwise: 
     id test1 test2 test3     a
  <dbl> <dbl> <dbl> <dbl> <dbl>
1     1    NA    NA    NA    NA
2     2    NA     0    NA     0
3     3     2     1     0     1
4     4     3     1     2     1

CodePudding user response:

Here is another option. It leverages making zeros and NA's very large and then recodes them at the end:

library(tidyverse)

get_min <- function(data, cols){
  data[is.na(data)] <- 1e6
  data[data == 0] <- 1e5
  nums <- do.call(pmin, select(data, all_of(cols)))
  recode(nums, `1e 06` = NA_real_, `1e 05` = 0.)
}

df %>%
  mutate(nieuw = get_min(., c("test1", "test2", "test3")))
#>   id test1 test2 test3 nieuw
#> 1  1    NA    NA    NA    NA
#> 2  2    NA     0    NA     0
#> 3  3     2     1     0     1
#> 4  4     3     1     2     1
  • Related