count nonzero values in each column tidyverse

Time:04-12

I have a df with a bunch of sites and a bunch of variables. I need to summarize it to count how many non-zero values each site has. I feel like I should be able to do this with summarize() and count() or tally(), but I can't quite figure it out.

reprex:


library(tidyverse)  # provides tribble(), %>% and summarise()

df <- 
  tribble(
    ~variable,   ~site1,   ~site2,  ~site3,
    "var1",        0 ,       1,        0,
    "var2",        .5,       0,        0,
    "var3",        .1,       2,        0,
    "var4",        0,        .8,       1
  )


# does not work:
df %>%
  summarise(across(where(is.numeric), ~ count(.x>0)))

desired output:

# A tibble: 1 × 3
  site1 site2 site3
  <dbl> <dbl> <dbl>
1   2     3     1

CodePudding user response:

In base R, you can use colSums:

colSums(df[-1] > 0)

#> site1 site2 site3 
#>     2     3     1 
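Note that `> 0` and `!= 0` only agree when no value is negative, and both return NA for a column that contains missing values. A minimal sketch of the difference, using a made-up data frame `m` (not the asker's data) with one negative value and one NA:

```r
# Hypothetical data with a negative value and a missing value
m <- data.frame(
  site1 = c(0, -0.5, 0.1, NA),
  site2 = c(1, 0, 2, 0.8)
)

# Comparing a data frame to a number gives a logical matrix;
# colSums() then counts the TRUE entries per column.
positive <- colSums(m > 0, na.rm = TRUE)   # strictly positive values only
nonzero  <- colSums(m != 0, na.rm = TRUE)  # any non-zero value, NAs skipped

print(positive)
print(nonzero)
```

If the data can never be negative or missing, the one-liner above is enough as it stands.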

CodePudding user response:

A possible solution:

library(dplyr)

df %>% 
  summarise(across(starts_with("site"), ~ sum(.x != 0)))

#> # A tibble: 1 × 3
#>   site1 site2 site3
#>   <int> <int> <int>
#> 1     2     3     1
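As for why the attempt in the question fails: dplyr's count() expects a data frame as its first argument, not a logical vector, so it cannot be used inside across() this way. Summing the logical vector does the counting instead. A sketch that keeps the asker's where(is.numeric) selector, assuming every numeric column is a site column:

```r
library(dplyr)
library(tibble)

# Same data as in the question
df <- tribble(
  ~variable, ~site1, ~site2, ~site3,
  "var1",    0,      1,      0,
  "var2",    0.5,    0,      0,
  "var3",    0.1,    2,      0,
  "var4",    0,      0.8,    1
)

# TRUE counts as 1 and FALSE as 0, so sum() of a comparison
# gives the number of non-zero values in each column.
result <- df %>%
  summarise(across(where(is.numeric), ~ sum(.x != 0)))

print(result)
```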

Another possible solution, in base R (no packages needed; the \(x) shorthand requires R 4.1 or later):

apply(df[-1], 2, \(x) sum(x != 0))

#> site1 site2 site3 
#>     2     3     1
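If you specifically want to use count(), as the question suggests, one option is to reshape to long format first. This is only a sketch, and the column names `site` and `value` are my own choice. It returns one row per site rather than one column per site, and a site whose values are all zero would be dropped by the filter:

```r
library(dplyr)
library(tidyr)
library(tibble)

# Same data as in the question
df <- tribble(
  ~variable, ~site1, ~site2, ~site3,
  "var1",    0,      1,      0,
  "var2",    0.5,    0,      0,
  "var3",    0.1,    2,      0,
  "var4",    0,      0.8,    1
)

long_counts <- df %>%
  # one row per (variable, site) pair
  pivot_longer(starts_with("site"), names_to = "site", values_to = "value") %>%
  # keep only the non-zero entries
  filter(value != 0) %>%
  # number of remaining rows per site, stored in column n
  count(site)

print(long_counts)
```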