I have two tibbles that I would like to combine. They are grouped by the same variable, but I would like to see them in the same table together. The first is:
df %>%
filter(Cancelled == FALSE) %>%
count(School)
Which gives me the count of "School":
School | count |
---|---|
Comm | 42 |
IR | 52 |
Business | 34 |
Nursing | 23 |
The next is:
df%>%
filter(Cancelled == FALSE) %>%
group_by(School) %>%
summarise(n_distinct(ID))
Which gives me the count of unique "ID" values in each "School".:
School | unique |
---|---|
Comm | 17 |
IR | 18 |
Business | 14 |
Nursing | 12 |
Basically, what I would like is for the count to be one row and the unique value count to be a second row:
School | count | unique |
---|---|---|
Comm | 17 | 42 |
IR | 18 | 52 |
Business | 14 | 34 |
Nursing | 12 | 23 |
Thank you in advance!
*Edit: better description of original data
dput(data)
structure(list(ID = c(1986, 3707, 2467, 3087, 2155, 3133, 2531,
3112, 2042, 2912, 1305, 1519, 2411, 3630, 2015, 2943, 2873, 1591,
3127, 3733, 3492, 3156, 3907, 3877, 2050, 2956, 1280, 3544, 1465,
1410, 3946, 2868, 2288, 3722, 1611, 3188, 3609, 2847, 1803, 2580,
1928, 1775, 2774, 1259, 3851, 2135, 3046, 1480, 2480, 2240, 3279,
3983, 2042, 3754, 1851, 3528, 3161, 2547, 3068, 2739, 3936, 3290,
2465, 2839, 2139, 2635, 1655, 3903, 2333, 1787, 2913, 2764, 2791,
1501, 2101, 3312, 3428, 3502, 1826, 3823, 3064, 2705, 1917, 1427,
1627, 1519, 3811, 3661, 3034, 1977, 2502, 3240, 1575, 2882, 3651,
2065, 2366, 2016, 2991, 1996), School = c("Nursing", "Business",
"Comm", "Nursing", "Business", "Nursing", "Nursing", "Nursing",
"Nursing", "Nursing", "IR", "Comm", "Nursing", "IR", "Nursing",
"Comm", "Business", "Business", "Business", "Nursing", "Nursing",
"Nursing", "Comm", "Nursing", "Business", "Nursing", "Comm",
"Business", "IR", "IR", "Nursing", "Business", "Business", "IR",
"Business", "Business", "Business", "Comm", "Nursing", "Comm",
"IR", "Nursing", "Nursing", "Nursing", "Nursing", "Comm", "Nursing",
"Business", "IR", "Comm", "Comm", "Business", "IR", "Nursing",
"Nursing", "IR", "Comm", "Business", "IR", "IR", "Nursing", "IR",
"Nursing", "Nursing", "Nursing", "Business", "Comm", "Nursing",
"IR", "IR", "Business", "Comm", "IR", "Nursing", "Nursing", "Business",
"Nursing", "Comm", "Business", "Business", "Nursing", "Nursing",
"Nursing", "Nursing", "Nursing", "Nursing", "Comm", "Nursing",
"IR", "Business", "Nursing", "Comm", "Nursing", "Comm", "Nursing",
"Nursing", "IR", "Business", "Nursing", "Comm")), row.names = c(NA,
-100L), class = c("tbl_df", "tbl", "data.frame"))
CodePudding user response:
You can do everything in one pipe, but it doesn‘t necessarily look cleaner:
library(tidyverse)
data %>%
count(School, name = 'count') %>%
left_join(., data %>%
group_by(School) %>%
summarize(unique = n_distinct(ID)),
by = 'School')
which gives with your example data:
# A tibble: 4 x 3
School count unique
<chr> <int> <int>
1 Business 22 22
2 Comm 18 18
3 IR 17 17
4 Nursing 43 43
I guess/assume it‘s just coincidence of your example data that there are no duplicate IDs per school and hence the count and unique value are the same.
CodePudding user response:
We could use left_join
:
library(dplyr)
left_join(df, df1, by="School")
School count unique
1 Comm 42 17
2 IR 52 18
3 Business 34 14
4 Nursing 23 12