My current table looks like this:
Region | Diabetes | percentage | lower limit | upper limit | N |
---|---|---|---|---|---|
1 | 0 | 85 | 80 | 90 | 100 |
1 | 1 | 15 | 10 | 16 | 500 |
2 | 0 | 90 | 80 | 97 | 198 |
2 | 1 | 10 | 7 | 20 | 134 |
3 | 0 | 97 | 90 | 99 | 434 |
3 | 1 | 3 | 0 | 10 | 283 |
This is the code I used to create that table.
CIregion_prop <- dta %>%
  filter(!is.na(Diabetes)) %>%
  filter(!is.na(region)) %>%
  group_by(region) %>%
  count(Diabetes) %>%
  mutate(perc = prop.table(n) * 100,
         # lower first holds the prop.test results, then is overwritten
         lower = lapply(n, prop.test, n = sum(n)),
         upper = sapply(lower, function(x) x$conf.int[2]) * 100,
         lower = sapply(lower, function(x) x$conf.int[1]) * 100)
I want to transform the table into the one below, which drops the region and is summarized by how many people are positive and negative across all regions:
Diabetes | percentage | lower limit | upper limit | N |
---|---|---|---|---|
0 | 85 | 80 | 90 | 732 |
1 | 15 | 10 | 16 | 917 |
How can I change my code above to produce this?
CodePudding user response:
Try using dplyr::select() to remove the region data and omit the group_by() step:
library(dplyr)

# Simulated example data: 1649 people across 3 regions
region <- sample(c(1, 2, 3), 1649, replace = TRUE)
Diabetes <- sample(c(0, 1), 1649, replace = TRUE)
df <- data.frame(region, Diabetes)

CI.no.region_prop <- df %>%
  filter(!is.na(Diabetes)) %>%
  filter(!is.na(region)) %>%
  dplyr::select(Diabetes) %>%   # drop the region column
  # group_by(region) %>%        # omitted so counts are pooled over regions
  count(Diabetes) %>%
  mutate(perc = prop.table(n) * 100,
         # lower first holds the prop.test results, then is overwritten
         lower = lapply(n, prop.test, n = sum(n)),
         upper = sapply(lower, function(x) x$conf.int[2]) * 100,
         lower = sapply(lower, function(x) x$conf.int[1]) * 100)
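
To see what the pipeline above computes for each row, here is a minimal base R sketch for a single row. It assumes the pooled counts 732 and 917 from the desired table in the question, which are illustrative only, so the resulting percentage will not match the mock percentages shown there. The variable names (n_neg, n_pos, test_pos and so on) are made up for this example.

```r
# Illustrative pooled counts, borrowed from the desired table in the question
n_neg <- 732
n_pos <- 917
total <- n_neg + n_pos

# prop.test(x, n) returns an "htest" object; its conf.int element holds the
# confidence interval for the proportion (95% level by default)
test_pos <- prop.test(n_pos, total)

perc_pos  <- n_pos / total * 100
lower_pos <- test_pos$conf.int[1] * 100
upper_pos <- test_pos$conf.int[2] * 100

print(c(perc = perc_pos, lower = lower_pos, upper = upper_pos))
```

In the dplyr version, lapply(n, prop.test, n = sum(n)) runs this same test once per row of the counted table, and the two sapply() calls pull out the interval bounds.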