Home > Back-end >  Mutate if variable is numeric AND the sum is bigger than 1
Mutate if variable is numeric AND the sum is bigger than 1

Time:10-11

I'm trying to format at big table and I want to:

round with two decimals all "small" variable (which are ratios) at two decimals

BUT I also want to format all "big" variables with big mark notation and no decimals

(and there is also categorical variables that I don't have to include)

so far I have :

mutate_if(is.numeric, formattable::comma, big.mark = "'", digits=2) 

This works for one or the other but not both

I would want something with two conditions, like is.numeric AND max() < 100

CodePudding user response:

Here is a way to format columns on multiple conditions:

df <- tibble(i = 1:10, j = i^4/10, k = i/20, l = i^2) 


# Selection based on whether greater than 100
df %>%
  mutate(across(where(~is.numeric(.) & max(.) >= 100), formattable::comma, big.mark = ",", digits = 2),
         across(where(~is.numeric(.) & max(.) < 100), formattable::percent, big.mark = ",", digits = 0))
 
# Selection based on whether all are less than or greater than 1
df %>%
  mutate(across(where(~is.numeric(.) & all(. >= 1)), formattable::comma, big.mark = ",", digits = 2),
         across(where(~is.numeric(.) & all(. < 1)), formattable::percent, big.mark = ",", digits = 0))

 > df
# A tibble: 10 × 4
       i      j     k     l
   <int>  <dbl> <dbl> <dbl>
 1     1    0.1  0.05     1
 2     2    1.6  0.1      4
 3     3    8.1  0.15     9
 4     4   25.6  0.2     16
 5     5   62.5  0.25    25
 6     6  130.   0.3     36
 7     7  240.   0.35    49
 8     8  410.   0.4     64
 9     9  656.   0.45    81
10    10 1000    0.5    100
 > df %>%
...   mutate(across(where(~is.numeric(.) & all(. >= 1)), formattable::comma, big.mark = ",", digits = 2),
...          across(where(~is.numeric(.) & all(. < 1)), formattable::percent, big.mark = ",", digits = 0))
# A tibble: 10 × 4
   i               j k          l         
   <formttbl>  <dbl> <formttbl> <formttbl>
 1 1.00          0.1 5%         1.00      
 2 2.00          1.6 10%        4.00      
 3 3.00          8.1 15%        9.00      
 4 4.00         25.6 20%        16.00     
 5 5.00         62.5 25%        25.00     
 6 6.00        130.  30%        36.00     
 7 7.00        240.  35%        49.00     
 8 8.00        410.  40%        64.00     
 9 9.00        656.  45%        81.00     
10 10.00      1000   50%        100.00   

CodePudding user response:

You can create your own little function that returns a TRUE or FALSE depending on whether a column meets a particular set of conditions. You can either do this beforehand or directly as an argument to mutate_if

Suppose we have the following data frame:

df <- data.frame(a = c('A', 'B', 'C'), b = c(999, 123, 84), c = c(32, 21, 80))
#>   a   b  c
#> 1 A 999 32
#> 2 B 123 21
#> 3 C  84 80

Then we can write a lambda function as the first argument to mutate_if:

df %>%
  mutate_if(~ if(is.numeric(.x)) max(.x, na.rm = TRUE) < 100 else FALSE, 
            formattable::comma, big.mark = "'", digits = 2) 
#> a   b     c
#> 1 A 999 32.00
#> 2 B 123 21.00
#> 3 C  84 80.00

Alternatively, we can write a function beforehand, depending on whether we are likely to re-use it a lot:

needs_mutated <- function(x) {
  if(!is.numeric(x)) return(FALSE)
  max(x, na.rm = TRUE) < 100
}

Allowing

df %>%
  mutate_if(needs_mutated, formattable::comma, big.mark = "'", digits = 2) 
#> a   b     c
#> 1 A 999 32.00
#> 2 B 123 21.00
#> 3 C  84 80.00
  • Related