Rowwise proportion test and add p value as new column

Time:09-07

My data:

c5 =structure(list(comorbid = c("heart", "ihd", "cabg", "angio", 
"cerebrovasc", "diabetes", "pvd", "amputation", "liver", "malig", 
"smoke", "ulcers"), AVF_Y = c(626L, 355L, 266L, 92L, 320L, 1175L, 
199L, 89L, 75L, 450L, 901L, 114L), AVG_Y = c(54L, 14L, 18L, 5L, 
21L, 37L, 5L, 7L, 5L, 29L, 33L, 3L), AVF_tot = c(2755L, 1768L, 
2770L, 2831L, 2844L, 2877L, 1745L, 2823L, 2831L, 2823L, 2798L, 
2829L), AVG_tot = c(161L, 61L, 161L, 165L, 166L, 167L, 61L, 165L, 
165L, 165L, 159L, 164L)), row.names = c(NA, -12L), class = "data.frame")

I want to perform a prop.test for each row (a two-proportions z-test) and add the p-value as a new column.

I've tried the following code, but it gives me 24 one-sample proportion test results instead of 12 two-sample tests for equality of proportions.

Map(prop.test, x = c(c5$AVF_Y, c5$AVG_Y), n = c(c5$AVF_tot, c5$AVG_tot))

Answer:

Use an anonymous function and extract the p-value. Concatenating whole columns, as in c(c5$AVF_Y, c5$AVG_Y), returns a single vector twice as long as the number of rows of the data, so prop.test is called once per element. Instead, concatenate inside the function so that, for each row, x and n are vectors of length 2 built from the corresponding '_Y' and '_tot' columns.

mapply(function(avf, avg, avf_n, avg_n) prop.test(c(avf, avg), c(avf_n, avg_n))$p.value, c5$AVF_Y, c5$AVG_Y, c5$AVF_tot, c5$AVG_tot)

-output

 [1] 2.218376e-03 6.985883e-01 6.026012e-01 1.000000e+00 6.695440e-01 2.425781e-06 5.672322e-01 5.861097e-01 9.627050e-01 6.546286e-01 3.360300e-03 2.276857e-01
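To see why concatenating the columns produces one-sample tests, here is a minimal sketch using only the first two rows of the question's data (heart and ihd). The standalone vectors avf_y, avg_y, avf_tot and avg_tot are illustrative names, not objects from the original post.

```r
# First two rows of c5 (heart, ihd), copied from the question's data
avf_y   <- c(626L, 355L)
avg_y   <- c(54L, 14L)
avf_tot <- c(2755L, 1768L)
avg_tot <- c(161L, 61L)

# c() stacks the columns end to end: 2 rows become 4 elements, so Map()
# calls prop.test() once per element, each time as a 1-sample test
one_sample <- Map(prop.test, x = c(avf_y, avg_y), n = c(avf_tot, avg_tot))
length(one_sample)
one_sample[[1]]$method

# Pairing the values inside the function hands prop.test() length-2
# vectors, so it runs one 2-sample test per row
two_sample <- Map(function(a, g, a_n, g_n) prop.test(c(a, g), c(a_n, g_n)),
                  avf_y, avg_y, avf_tot, avg_tot)
length(two_sample)
two_sample[[1]]$method
```

With the full data the same thing happens at scale: 12 rows become 24 stacked elements, hence the 24 one-sample results in the question.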

Or use do.call with Map or mapply

do.call(mapply, c(FUN = function(x, y, n1, n2) 
    prop.test(c(x, y), c(n1, n2))$p.value, unname(c5[-1])))
 [1] 2.218376e-03 6.985883e-01 6.026012e-01 1.000000e+00 6.695440e-01 2.425781e-06 5.672322e-01 5.861097e-01 9.627050e-01 6.546286e-01 3.360300e-03 2.276857e-01

Or with apply

apply(c5[-1], 1, function(x) prop.test(x[1:2], x[3:4])$p.value)
 [1] 2.218376e-03 6.985883e-01 6.026012e-01 1.000000e+00 6.695440e-01 2.425781e-06 5.672322e-01 5.861097e-01 9.627050e-01 6.546286e-01 3.360300e-03 2.276857e-01
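The base R calls above return a plain numeric vector. To get the new column the question asks for, that vector can be assigned back to the data frame. A minimal sketch, assuming a three-row subset of the question's data stored under the illustrative name c5_small:

```r
# First three rows of c5, copied from the question's data
c5_small <- data.frame(
  comorbid = c("heart", "ihd", "cabg"),
  AVF_Y    = c(626L, 355L, 266L),
  AVG_Y    = c(54L, 14L, 18L),
  AVF_tot  = c(2755L, 1768L, 2770L),
  AVG_tot  = c(161L, 61L, 161L)
)

# One 2-sample prop.test per row; mapply() simplifies the p-values to a
# numeric vector, which is stored as the new column pval
c5_small$pval <- mapply(
  function(avf, avg, avf_n, avg_n) {
    prop.test(c(avf, avg), c(avf_n, avg_n))$p.value
  },
  c5_small$AVF_Y, c5_small$AVG_Y, c5_small$AVF_tot, c5_small$AVG_tot
)

c5_small
```

The same assignment should work on the full c5 by replacing c5_small with c5.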

Or use rowwise

library(dplyr)
c5 %>%
  rowwise() %>%
  mutate(pval = prop.test(c(AVF_Y, AVG_Y),
                          n = c(AVF_tot, AVG_tot))$p.value) %>%
  ungroup()

-output

# A tibble: 12 × 6
   comorbid    AVF_Y AVG_Y AVF_tot AVG_tot       pval
   <chr>       <int> <int>   <int>   <int>      <dbl>
 1 heart         626    54    2755     161 0.00222   
 2 ihd           355    14    1768      61 0.699     
 3 cabg          266    18    2770     161 0.603     
 4 angio          92     5    2831     165 1.00      
 5 cerebrovasc   320    21    2844     166 0.670     
 6 diabetes     1175    37    2877     167 0.00000243
 7 pvd           199     5    1745      61 0.567     
 8 amputation     89     7    2823     165 0.586     
 9 liver          75     5    2831     165 0.963     
10 malig         450    29    2823     165 0.655     
11 smoke         901    33    2798     159 0.00336   
12 ulcers        114     3    2829     164 0.228     