The following for-loop iterates over columns 6:22 of poll_22 to calculate the lower and upper bounds of a 95% confidence interval for each political party. The bounds are then saved in two new columns. However, the code I have written overwrites the calculated intervals on every iteration, so poll_22$lowerinterval and poll_22$upperinterval end up containing only the intervals for the last political party, in column 22.
Is it possible to add the name of the column, i.e. the letter of the political party, when appending the confidence intervals in the last two lines? The intervals for column 6, named A, would then be poll_22$A_upperinterval and poll_22$A_lowerinterval; those for column 7, named B, would be poll_22$B_upperinterval and poll_22$B_lowerinterval, etc.
for (i in poll_22[, 6:22]) {
  n <- poll_22$n  # sample size
  p <- i / 100    # party share as a proportion
  # calculate margin of error for a 95% confidence interval
  margin <- qnorm(0.975) * sqrt(p * (1 - p) / n)
  # calculate upper and lower intervals
  lowerinterval <- (p - margin) * 100
  upperinterval <- (p + margin) * 100
  # append intervals (these two columns are overwritten on every iteration)
  poll_22$lowerinterval <- lowerinterval
  poll_22$upperinterval <- upperinterval
}
Head of poll_22
poll_22 <- structure(list(id = c(1555, 1556, 1557, 1558, 1559, 1560), pollingfirm = c("VOXMETER",
"VOXMETER", "MEGAFON", "VOXMETER", "VOXMETER", "VOXMETER"), year = c(2022,
2022, 2022, 2022, 2022, 2022), month = c(1, 1, 1, 1, 1, 2), day = c(8,
16, 20, 23, 30, 6), A = c(25.1, 25.8, 23.9, 26.8, 24.9, 24),
B = c(7.2, 7.5, 6.9, 6.9, 7.7, 7.2), C = c(15, 14.6, 18.8,
15.5, 16.4, 16.1), D = c(6, 5.9, 7.9, 6.8, 6, 5.4), E = c(NA,
NA, NA, NA, NA, NA), F = c(8.6, 8.5, 8.9, 8, 8.6, 9.2), G = c(NA,
NA, NA, NA, NA, NA), I = c(2.5, 3.3, 2.7, 2.9, 2.1, 2.8),
K = c(1.8, 1.4, 1.6, 1.2, 1.7, 1.8), M = c(NA, NA, 2.2, NA,
NA, NA), O = c(6.2, 5.3, 4.5, 5.5, 7.1, 6.2), P = c(NA, NA,
NA, NA, NA, NA), Q = c(0.1, 0.3, 0.2, 0.1, 0, 0.2), V = c(16.5,
15.1, 11.6, 14.4, 14.2, 14.8), Ø = c(8.9, 9.3, 9.4, 9.4,
8.3, 9.1), Å = c(1.2, 1.1, 0.7, 0.8, 0.9, 0.9), Æ = c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), noparty = c(24.9,
23.5, NA, NA, 25, 25.8), n = c(1002, 1003, 2015, 1008, 1015,
1024)), row.names = c(NA, -6L), class = c("tbl_df", "tbl",
"data.frame"))
Answer:
If I understood what you are looking for correctly, then here is a solution using dplyr:
library(dplyr)
poll_22 %>%
  mutate_at(vars(6:22), ~ . / 100) %>%  # performs p <- i/100 for columns 6:22
  # a named list of functions yields columns such as A_lowerinterval, A_upperinterval
  mutate_at(vars(6:22), list(
    lowerinterval = ~ 100 * (. - qnorm(0.975) * sqrt(. * (1 - .) / n)),
    upperinterval = ~ 100 * (. + qnorm(0.975) * sqrt(. * (1 - .) / n)))) %>%
  mutate_at(vars(6:22), ~ 100 * .)  # reverts values in columns 6:22 to percentages
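The same naming scheme can also be sketched in base R by looping over the column names rather than the column values, so that the party letter is available to build the new column names with paste0(). This is only a minimal sketch on a toy data frame: polls and party_cols are illustrative names that do not appear in the question, and for the real data party_cols would presumably be names(poll_22)[6:22].

```r
# Toy data frame standing in for poll_22 (hypothetical; values copied from
# the first two rows of columns A, B and n in the question's data)
polls <- data.frame(A = c(25.1, 25.8), B = c(7.2, 7.5), n = c(1002, 1003))

# Names of the party columns; assumed to be names(poll_22)[6:22] for the real data
party_cols <- c("A", "B")

z <- qnorm(0.975)  # critical value for a 95% confidence interval

for (party in party_cols) {
  p <- polls[[party]] / 100                  # party share as a proportion
  margin <- z * sqrt(p * (1 - p) / polls$n)  # margin of error
  # Build the new column names from the party name, e.g. "A_lowerinterval"
  polls[[paste0(party, "_lowerinterval")]] <- (p - margin) * 100
  polls[[paste0(party, "_upperinterval")]] <- (p + margin) * 100
}

names(polls)
```

Because each iteration writes to a column whose name includes the party letter, nothing is overwritten, and the original percentage columns are left untouched.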