Counting values based on certain conditions in R?

Time:02-10

I have a dataset and I'm trying to count the number of codes each patient has, as well as the number of codes of interest that each patient has.

Let's say that I have this table below and my code of interest is 26.

patient code
1       25   
1       26  
1       39
1       26
1       86
2       26 
2       24 
2       89
3       56 
3       45 
3       26
3       89 
4       56
4       25 
4       66
4       56

The result I want is:

Patient 1: 5 total codes, 2 codes of interest

Patient 2: 3 total codes, 1 code of interest

Patient 3: 4 total codes, 1 code of interest

Patient 4: 4 total codes, 0 codes of interest

How can I do this in R? Thank you!

CodePudding user response:

Here's a tidyverse approach.

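First, group_by(patient) groups the rows by patient, so that the calculations that follow are done separately for each patient. Then summarise() (summarize() is the same function) computes, for each patient, the number of codes with n() and the number of occurrences of 26 with sum(code == 26). The second count works because code == 26 yields TRUE/FALSE values, and sum() counts each TRUE as 1.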

library(tidyverse)

df %>% group_by(patient) %>% 
  summarize(Total_codes = n(), 
            Codes_of_interest = sum(code == 26))

# A tibble: 4 x 3
  patient Total_codes Codes_of_interest
    <int>       <int>             <int>
1       1           5                 2
2       2           3                 1
3       3           4                 1
4       4           4                 0
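
The snippet above assumes a data frame named df already exists. As a minimal self-contained sketch, assuming df is built from the table in the question and using a hypothetical variable code_of_interest (not in the original answer) to hold the code being counted, the same steps could be run like this:

```r
library(dplyr)

# Data from the question's table; the name `df` matches the snippet above.
df <- data.frame(
  patient = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L),
  code    = c(25L, 26L, 39L, 26L, 86L, 26L, 24L, 89L, 56L, 45L, 26L, 89L,
              56L, 25L, 66L, 56L)
)

# Hypothetical name, introduced here so the code of interest is easy to change.
code_of_interest <- 26L

result <- df %>%
  group_by(patient) %>%                                  # one group per patient
  summarise(
    Total_codes       = n(),                             # rows in the group
    Codes_of_interest = sum(code == code_of_interest)    # TRUE counts as 1
  )

result
```

Changing code_of_interest to another value should count that code instead, without touching the rest of the pipeline.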

CodePudding user response:

Here's a data.table approach:

library(data.table)

setDT(dt)[ , list(cases = .N, interest = sum(code == 26)), by=patient]

Output

   patient cases interest
1:       1     5        2
2:       2     3        1
3:       3     4        1
4:       4     4        0

Data

dt <- structure(list(patient = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 
3L, 3L, 3L, 4L, 4L, 4L, 4L), code = c(25L, 26L, 39L, 26L, 86L, 
26L, 24L, 89L, 56L, 45L, 26L, 89L, 56L, 25L, 66L, 56L)), class = "data.frame", row.names = c(NA, 
-16L))

CodePudding user response:

Assuming you have a data.table called dt with the variables patient and code:

library(data.table)
dt <- data.table(
  patient = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L),
  code = c(25L, 26L, 39L, 26L, 86L, 26L, 24L, 89L, 56L, 45L, 26L, 89L, 56L, 25L, 66L, 56L)
)

with(dt[code == 26], ftable(patient))

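Here is the output. Note that it gives only the number of codes of interest, not the total number of codes, and that patient 4 does not appear because filtering on code == 26 leaves no rows for that patient.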

patient 1 2 3
             
        2 1 1
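
If patients with zero matches should also appear, one possible variation (a sketch, not part of the original answer) is to cross-tabulate patient against the logical condition over all rows instead of filtering first. It uses base R only and a plain data frame; the names counts, total_codes, codes_of_interest and of_interest are hypothetical, and the same with() call should also work on a data.table.

```r
# Same data as in the question, as a plain data frame.
dt <- data.frame(
  patient = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L),
  code    = c(25L, 26L, 39L, 26L, 86L, 26L, 24L, 89L, 56L, 45L, 26L, 89L,
              56L, 25L, 66L, 56L)
)

# Cross-tabulate patient against the condition, so every patient keeps a row
# even when none of their codes match.
counts <- with(dt, table(patient, of_interest = code == 26))
counts

# Row sums give the total codes per patient; the "TRUE" column gives the
# number of codes of interest (this column exists only if at least one row matches).
total_codes       <- rowSums(counts)
codes_of_interest <- counts[, "TRUE"]
```

Here patient 4 is kept with a count of 0 in the TRUE column, and the totals come from the same table.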