Home > Mobile >  Method for grouping in R for unique values in a list?
Method for grouping in R for unique values in a list?

Time:10-22

I have a dataframe of patients who underwent one or more surgical procedures and am interested in grouping them by procedure type for analysis of outcomes. The procedures are represented by numbers (1-5). To avoid having to create a new column in the dataframe for each procedure type to identify whether the patient had that unique procedure performed, I'm basically looking for a way to do aggregate grouping and summarizing for each unique value in a list.

A representative df would look like this...

id <- c(1,2,3,4,5,6,7,8,9,10)
procedures <- list(2, 3, c(1,5), 1, c(3,4), c(1,3), 5, 2, c(1,2,5), 4)
df <- as.data.frame(cbind(id, procedures))

Say I wanted to count the number of patients who had each type of procedure. The following would obviously count each unique list as a separate object.

df %>%
  group_by(procedures) %>%
  summarise(n = n())

What I'm trying to accomplish would be a count of times each unique procedure appears in the list of lists. The below is oversimplified but an example of this.

df %>%
  group_by(unique(procedures)) %>%
  summarise(n = n())

CodePudding user response:

We may unnest the list column and use that in group_by

library(dplyr)
library(tidyr)
df %>%
    unnest(everything()) %>%
    group_by(procedures) %>% 
    summarise(n = n())

CodePudding user response:

We could use separate_rows with count:

library(dplyr)
library(tidyr)
df %>% 
  separate_rows("procedures", sep = " ,") %>% 
  count(procedures)
  procedures     n
       <dbl> <int>
1          1     4
2          2     3
3          3     3
4          4     2
5          5     3
  • Related