I need to count how many patients (my groups) fulfill a condition.
I have a large dataset in which the last column always states Yes or No. Every patient has several rows, but only Yes or only No across all of them. I need to know how many patients are in the Yes condition and how many are in the No condition.
I can only find answers that count the conditions within a group, not the groups by condition.
The data looks like this:
structure(list(PATIENT.ID = c(210625L, 210625L, 210625L, 210625L,
210625L, 210625L, 210625L, 210625L, 210625L, 210625L, 210625L,
210625L, 210625L, 210625L, 210625L, 210625L, 210625L, 220909L,
220909L, 220909L, 220909L, 220909L, 220909L, 220909L, 220909L,
220909L, 220909L, 221179L, 221179L, 221179L, 221179L, 221179L,
221179L, 221179L, 221179L, 221179L, 221179L, 221179L, 221179L,
221179L, 221179L, 301705L, 301705L, 301705L, 301705L, 301705L,
301705L, 301705L, 301705L, 301705L, 301705L, 301705L, 301705L,
301705L, 301705L, 301705L, 303926L, 303926L, 303926L, 303926L
), Anycaffeina = c("Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "No", "No", "No", "No"
)), row.names = c(NA, -60L), class = c("tbl_df", "tbl", "data.frame"
))
And I want something like this: NO = N (here 1) and YES = N (here 4)
I have now found a solution that works with my dataset (much longer than the above and with 81 columns, so maybe that's why @Yuriy Saraykin's answer did not work with my original data?)
# keep the first row of every patient, then count the Yes/No values
z <- Anycaffeine[!duplicated(Anycaffeine$PATIENT.ID), ]
dplyr::count(z, Anycaffeina)
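The guess about the extra columns can be checked on toy data. This is a minimal sketch in base R; `toy` and its `visit` column are made up for illustration and only stand in for the additional columns of the real dataset.

```r
# Hypothetical toy data: two patients plus one column that differs on every row
toy <- data.frame(
  PATIENT.ID  = c(1L, 1L, 1L, 2L, 2L),
  Anycaffeina = c("Yes", "Yes", "Yes", "No", "No"),
  visit       = 1:5  # made-up extra column
)

# Deduplicating on all columns removes nothing, because `visit` makes every row unique
n_all_columns <- nrow(unique(toy))

# Deduplicating on the ID alone leaves one row per patient
one_per_patient <- toy[!duplicated(toy$PATIENT.ID), ]
counts <- table(one_per_patient$Anycaffeina)

n_all_columns
counts
```

If `n_all_columns` equals the original number of rows, a whole-row `distinct()` or `unique()` has nothing to collapse, so the later count is a count of rows rather than of patients.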
CodePudding user response:
tidyverse
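library(tidyverse)
df %>%
  # deduplicate on the ID and the condition only, so extra columns do not matter
  distinct(PATIENT.ID, Anycaffeina) %>%
  count(Anycaffeina)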
# A tibble: 2 x 2
Anycaffeina n
<chr> <int>
1 No 1
2 Yes 4
base
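# restrict to the two relevant columns before removing duplicates
aggregate(PATIENT.ID ~ Anycaffeina, data = unique(df[c("PATIENT.ID", "Anycaffeina")]), FUN = length)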
Anycaffeina PATIENT.ID
1 No 1
2 Yes 4
data.table
library(data.table)
library(magrittr)
setDT(df) %>%
  # one row per patient, regardless of how many other columns df has
  unique(by = "PATIENT.ID") %>%
  .[, .N, by = Anycaffeina] %>%
  .[]
Anycaffeina N
1: Yes 4
2: No 1
CodePudding user response:
# keep the first row of every patient, then count the Yes/No values
z <- Anycaffeine[!duplicated(Anycaffeine$PATIENT.ID), ]
dplyr::count(z, Anycaffeina)
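Keeping the first row per patient is only safe if every patient really has a single answer, as the question states. A possible sanity check, sketched in base R on made-up data (`toy` is hypothetical, and patient 3 is deliberately inconsistent):

```r
# Hypothetical toy data; patient 3 has both "Yes" and "No"
toy <- data.frame(
  PATIENT.ID  = c(1L, 1L, 2L, 3L, 3L),
  Anycaffeina = c("Yes", "Yes", "No", "Yes", "No")
)

# Number of distinct answers recorded for each patient
n_answers <- tapply(toy$Anycaffeina, toy$PATIENT.ID, function(x) length(unique(x)))

# IDs (as character) of patients with more than one distinct answer
mixed_ids <- names(n_answers)[n_answers > 1]
mixed_ids
```

On the real data, an empty `mixed_ids` would mean the first-row approach and the `distinct()` approach give the same counts.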