R: Get the number of rows that share the same value across several columns

Time:12-10

I have a dataframe that contains several columns.

I'm trying to get the number of rows where a value is unique to one column, shared by at least two columns, or shared by all columns.

test=data.frame(
  A=c("inactive","inactive","active","active"),
  B=c("active","active","inactive","active"),
  C=c("active","inactive","inactive","active")
  )

I want to know the number of rows that have at least one 'active', at least two 'active', and the number of rows where all values are 'active'.

So I tried this:

library(dplyr)  # filter() comes from dplyr

all <- filter(test, A == "active" & B == "active" & C == "active")

Then I get the number of rows of the resulting dataframe. I can do the same for the other conditions (shared between A and B, B and C, A and C), but I wonder if there is a better way to compute this.

Thanks

CodePudding user response:

A possible solution:

library(tidyverse)

test=data.frame(
  A=c("inactive","inactive","active","active"),
  B=c("active","active","inactive","active"),
  C=c("active","inactive","inactive","active")
)

test %>% 
  mutate(nActives = rowSums(. == "active")) %>% 
  group_by(nActives) %>% 
  summarise(nRows = n()) %>% 
  ungroup

#> # A tibble: 3 × 2
#>   nActives nRows
#>      <dbl> <int>
#> 1        1     2
#> 2        2     1
#> 3        3     1
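The table above counts rows with exactly 1, 2 or 3 'active' values, while the question asks for "at least" counts. A minimal sketch of how those could be derived from the same row sums, in base R; the names `n_active`, `thresholds` and `at_least` are only illustrative:

```r
# Same example data as in the question
test <- data.frame(
  A = c("inactive", "inactive", "active", "active"),
  B = c("active", "active", "inactive", "active"),
  C = c("active", "inactive", "inactive", "active")
)

# Number of 'active' values in each row
n_active <- rowSums(test == "active")

# For each threshold k (1 up to the number of columns),
# count the rows that have at least k 'active' values
thresholds <- seq_len(ncol(test))
at_least <- sapply(thresholds, function(k) sum(n_active >= k))
names(at_least) <- paste0("at_least_", thresholds)

at_least
```

With this data, all 4 rows have at least one 'active', 2 rows have at least two, and 1 row has all three.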

CodePudding user response:

We may use a base R solution:

table(rowSums(test == 'active'))

1 2 3 
2 1 1 

To keep only the rows with at least two 'active' values:

> subset(test, rowSums(test == 'active') >= 2)
         A      B      C
1 inactive active active
4   active active active
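The question also mentions the pairwise cases (shared between A and B, B and C, A and C). One possible way to get them without writing each filter by hand is sketched below. It assumes "shared" means both columns are 'active' in the same row, which is an interpretation of the question; `is_active`, `pairs` and `shared` are illustrative names:

```r
# Same example data as in the question
test <- data.frame(
  A = c("inactive", "inactive", "active", "active"),
  B = c("active", "active", "inactive", "active"),
  C = c("active", "inactive", "inactive", "active")
)

# Logical matrix: TRUE where the cell is 'active'
is_active <- test == "active"

# Every pair of column names, one pair per matrix column
pairs <- combn(colnames(is_active), 2)

# For each pair, count the rows where both columns are 'active'
# (assumption: this is what "shared" means in the question)
shared <- apply(pairs, 2, function(p) {
  sum(is_active[, p[1]] & is_active[, p[2]])
})
names(shared) <- apply(pairs, 2, paste, collapse = "&")

shared
```

For the example data this gives 1 row for A&B, 1 for A&C and 2 for B&C. Because `combn()` generates the pairs, the same code should carry over to dataframes with more columns.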