I am roughly reproducing this code:
library(haven)
library(survey)
library(dplyr)
nhanesDemo <- read_xpt(url("https://wwwn.cdc.gov/Nchs/Nhanes/2015-2016/DEMO_I.XPT"))
# Rename variables into something more readable
nhanesDemo$fpl <- nhanesDemo$INDFMPIR
nhanesDemo$age <- nhanesDemo$RIDAGEYR
nhanesDemo$gender <- nhanesDemo$RIAGENDR
nhanesDemo$persWeight <- nhanesDemo$WTINT2YR
nhanesDemo$psu <- nhanesDemo$SDMVPSU
nhanesDemo$strata <- nhanesDemo$SDMVSTRA
# Select the necessary columns
nhanesAnalysis <- nhanesDemo %>%
select(fpl, age, gender, persWeight, psu, strata)
# Set up the design
nhanesDesign <- svydesign(id = ~psu,
strata = ~strata,
weights = ~persWeight,
nest = TRUE,
data = nhanesAnalysis)
# Select those between the agest of 18 and 79
ageDesign <- subset(nhanesDesign, age > 17 & age < 80)
They calculate a simple arithmetic mean as follows:
# Arithmetic mean
svymean(~age, ageDesign, na.rm = TRUE)
I would like to calculate (1) the geometric mean using svymean
or some related method (2) with a 95% confidence interval of the geometric mean without having to manually use the SE, if possible. I tried
svymean(~log(age), log(ageDesign), na.rm = TRUE)
but that throws an error. How do I do this?
CodePudding user response:
You want to take the logarithm of the variable, not of the whole survey design (which doesn't make any sense)
Here's how to compute the mean of log(age), get a confidence interval, and then exponentiate it to the geometric mean scale
> svymean(~log(age), ageDesign, na.rm = TRUE)
mean SE
log(age) 3.7489 0.0115
> meanlog<-svymean(~log(age), ageDesign, na.rm = TRUE)
> confint(meanlog)
2.5 % 97.5 %
log(age) 3.726351 3.771372
> exp(confint(meanlog))
2.5 % 97.5 %
log(age) 41.52729 43.43963
Or, you can exponentiate first and then construct the confidence interval:
> (geomean<-svycontrast(meanlog, quote(exp(`log(age)`))))
nlcon SE
contrast 42.473 0.4878
> confint(geomean)
2.5 % 97.5 %
contrast 41.51661 43.42879
I would expect the first approach to give better confidence intervals in general, but it makes almost no difference here.