Loop over several columns while appending results to new columns using the name of the existing column

Time:10-07

The following for-loop iterates over columns 6:22 of poll_22 to calculate the lower and upper bounds of a confidence interval for each political party. The bounds are then saved in two new columns. However, each iteration overwrites the result of the previous one, so poll_22$lowerinterval and poll_22$upperinterval end up containing only the intervals for the last political party, in column 22.

Is it possible to add the name of the column, i.e. the letter of the political party, when appending the confidence intervals in the last two lines? The intervals for column 6, named A, would then be poll_22$A_upperinterval and poll_22$A_lowerinterval; those for column 7, named B, would be poll_22$B_upperinterval and poll_22$B_lowerinterval, etc.

for(i in poll_22[, 6:22]) {
  n <- poll_22$n # sample size
  p <- i/100 # party
  
  # calculate confidence interval
  margin <- qnorm(0.975)*sqrt(p*(1-p)/n)
  
  # calculate upper and lower intervals
  lowerinterval <- (p - margin)*100
  upperinterval <- (p + margin)*100
  
  # append intervals
  poll_22$lowerinterval <- lowerinterval
  poll_22$upperinterval <- upperinterval
}

Head of poll_22

poll_22 <- structure(list(id = c(1555, 1556, 1557, 1558, 1559, 1560), pollingfirm = c("VOXMETER", 
"VOXMETER", "MEGAFON", "VOXMETER", "VOXMETER", "VOXMETER"), year = c(2022, 
2022, 2022, 2022, 2022, 2022), month = c(1, 1, 1, 1, 1, 2), day = c(8, 
16, 20, 23, 30, 6), A = c(25.1, 25.8, 23.9, 26.8, 24.9, 24), 
    B = c(7.2, 7.5, 6.9, 6.9, 7.7, 7.2), C = c(15, 14.6, 18.8, 
    15.5, 16.4, 16.1), D = c(6, 5.9, 7.9, 6.8, 6, 5.4), E = c(NA, 
    NA, NA, NA, NA, NA), F = c(8.6, 8.5, 8.9, 8, 8.6, 9.2), G = c(NA, 
    NA, NA, NA, NA, NA), I = c(2.5, 3.3, 2.7, 2.9, 2.1, 2.8), 
    K = c(1.8, 1.4, 1.6, 1.2, 1.7, 1.8), M = c(NA, NA, 2.2, NA, 
    NA, NA), O = c(6.2, 5.3, 4.5, 5.5, 7.1, 6.2), P = c(NA, NA, 
    NA, NA, NA, NA), Q = c(0.1, 0.3, 0.2, 0.1, 0, 0.2), V = c(16.5, 
    15.1, 11.6, 14.4, 14.2, 14.8), Ø = c(8.9, 9.3, 9.4, 9.4, 
    8.3, 9.1), Å = c(1.2, 1.1, 0.7, 0.8, 0.9, 0.9), Æ = c(NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), noparty = c(24.9, 
    23.5, NA, NA, 25, 25.8), n = c(1002, 1003, 2015, 1008, 1015, 
    1024)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", 
"data.frame"))

Answer:

If I understood correctly what you are looking for, here is a solution using dplyr:

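library(dplyr)
poll_22 %>% 
  mutate_at(vars(6:22), ~./100) %>% # performs p <- i/100 for columns 6:22
  # a named list of functions creates new columns called <column>_<name>, e.g. A_lowerinterval
  # the whole (p -/+ margin) term is scaled by 100, as in the original loop
  mutate_at(vars(6:22), list(lowerinterval = ~100 * (. - qnorm(0.975)*sqrt(.*(1-.)/n)),
                             upperinterval = ~100 * (. + qnorm(0.975)*sqrt(.*(1-.)/n)))) %>% 
  mutate_at(vars(6:22), ~100*.) # reverts values in columns 6:22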
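
If you would rather keep a for-loop, one possible approach is to loop over the column names instead of the column values, and build the new column names with paste0(). The following is only a minimal sketch in base R: `polls` and `party_cols` are hypothetical stand-ins for poll_22 and names(poll_22)[6:22], reduced to two parties so the example runs on its own.

```r
# Hypothetical stand-in for poll_22: two party columns (in percent) and the sample size n
polls <- data.frame(
  A = c(25.1, 25.8, 23.9),
  B = c(7.2, 7.5, 6.9),
  n = c(1002, 1003, 2015)
)

party_cols <- c("A", "B")  # assumption: for the real data this would be names(poll_22)[6:22]
z <- qnorm(0.975)          # critical value for a 95% interval

for (party in party_cols) {
  p <- polls[[party]] / 100                  # party share as a proportion
  margin <- z * sqrt(p * (1 - p) / polls$n)  # normal-approximation margin of error

  # Build each new column name from the party name, e.g. "A_lowerinterval"
  polls[[paste0(party, "_lowerinterval")]] <- (p - margin) * 100
  polls[[paste0(party, "_upperinterval")]] <- (p + margin) * 100
}

names(polls)
```

Because the loop variable is the column name rather than the column itself, each iteration writes to its own pair of columns and nothing is overwritten. Party columns that contain only NA simply produce NA intervals.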