I have two lists of patient IDs. One is short, the other is very long.
I'd like to know how many times patients from the short list appear in the long list, preferably using dplyr.
library(tidyverse)  # attaches dplyr, tidyr and tibble (for tribble)
shortlist<-tribble(
~PatientID,
#--
10,
11,
12,
13,
14,
15
)
longlist<-tribble(
~PatientID,
#--
10,
10,
11,
12,
12,
12,
13,
14,
15,
15,
15,
16,
17
)
Desired output would be something like:
10 2
11 1
12 3
13 1
and so on
CodePudding user response:
Filter 'longlist' to the 'PatientID' values that are present in 'shortlist', then count the rows for each ID:
library(dplyr)
longlist %>%
filter(PatientID %in% shortlist$PatientID) %>%
count(PatientID)
Output:
# A tibble: 6 × 2
PatientID n
<dbl> <int>
1 10 2
2 11 1
3 12 3
4 13 1
5 14 1
6 15 3
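One thing to keep in mind: filter() followed by count() only returns IDs that occur at least once in 'longlist'. If some patients on the short list might be missing from the long list and should show up with a count of 0, a left join is one option. The following is a minimal sketch, not part of the original answer; the ID 99 and the name 'counts' are made up for illustration.

```r
library(dplyr)
library(tibble)

# Small stand-in data; 99 is a hypothetical ID that is absent from longlist
shortlist <- tibble(PatientID = c(10, 11, 99))
longlist  <- tibble(PatientID = c(10, 10, 11, 16))

counts <- shortlist %>%
  # count() gives one row per ID found in longlist, with the count in column n
  left_join(count(longlist, PatientID), by = "PatientID") %>%
  # IDs with no match get NA from the join; turn those into 0
  mutate(n = coalesce(n, 0L))

counts
```

Here the row for 99 is kept with n equal to 0 instead of being dropped.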
CodePudding user response:
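# For each PatientID in shortlist, count the matching rows in longlist.
# Within a group, PatientID holds only that group's ID; the result stays grouped.
shortlist %>%
group_by(PatientID) %>%
mutate(count = sum(longlist$PatientID == PatientID))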
# A tibble: 6 × 2
# Groups: PatientID [6]
PatientID count
<dbl> <int>
1 10 2
2 11 1
3 12 3
4 13 1
5 14 1
6 15 3
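If a grouped tibble is not wanted, the same per-ID comparison can probably be written without group_by() by looping over the IDs with vapply(). This is only a sketch of that variation, using small made-up data and the hypothetical name 'result'; it is not what the answer above does.

```r
library(dplyr)
library(tibble)

# Small stand-in data; shortlist deliberately repeats 12
shortlist <- tibble(PatientID = c(10, 12, 12))
longlist  <- tibble(PatientID = c(10, 10, 12, 12, 12, 13))

result <- shortlist %>%
  mutate(
    # For every ID in shortlist, count how many longlist rows carry that ID
    count = vapply(
      PatientID,
      function(id) sum(longlist$PatientID == id),
      integer(1)
    )
  )

result
```

The result is an ordinary (ungrouped) tibble with one row per row of 'shortlist', so repeated IDs in the short list each get their own count.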
CodePudding user response:
Another possible solution:
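library(tidyverse)
# Tabulate longlist and convert the table to a data frame (values = counts, ind = IDs as a factor)
table(longlist) %>%
stack %>%
# Rebuild a numeric PatientID column from the factor levels and drop ind
mutate(PatientID = levels(ind) %>% as.numeric, ind = NULL) %>%
# Keep only the IDs that are in shortlist
left_join(shortlist, ., by = "PatientID")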
#> # A tibble: 6 × 2
#> PatientID values
#> <dbl> <int>
#> 1 10 2
#> 2 11 1
#> 3 12 3
#> 4 13 1
#> 5 14 1
#> 6 15 3