I am new to R and have run into a problem. I am analysing a data frame, and the question I am working on had three possible answers. I now want to obtain the share of each answer in my data frame. This is my code so far:
BES %>%
  group_by(y08) %>%
  summarise(count = n())
and it yields
  y08                          count
  <dbl+lbl>                    <int>
1 1 [Yes: trade union]           261
2 2 [Yes: staff association]      25
3 3 [No]                        1908
How can I obtain the absolute number of observations (the sum of my counts) and, based on that, the share of each option? I'd like to create a stratified sample based on this.
(for the srs sample:
str_samp <-
  BES %>%
  mutate(strata = sample_size * share) %>%
  group_by(y08) %>%
  sample_n(strata) %>%
  ungroup()
This is my code at the moment. The sample size is defined, but I am struggling to define the share variable.)
Thank you for your help!
CodePudding user response:
To get the shares (proportions), add a mutate() step after the summary:
BES %>%
  group_by(y08) %>%
  summarise(count = n()) %>%
  mutate(share = count / sum(count))
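If you also want the absolute number of observations next to each share, one option is to store the total in its own column. This is only a minimal sketch: `BES_demo` is a made-up stand-in for BES, rebuilt from the counts shown in the question, and the `total` column name is my own choice.

```r
library(dplyr)

# Hypothetical stand-in for BES, rebuilt from the counts in the question
BES_demo <- tibble(y08 = rep(c(1, 2, 3), times = c(261, 25, 1908)))

shares <- BES_demo %>%
  group_by(y08) %>%
  summarise(count = n()) %>%
  mutate(total = sum(count),      # absolute number of observations
         share = count / total)   # proportion of each answer

shares
```

Because summarise() drops the only grouping variable here, sum(count) runs over all three rows, so the shares add up to 1.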
CodePudding user response:
BES %>% count(y08) %>% mutate(share = n / sum(n))
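For the stratified sample itself, you may not need a share column at all. As a sketch, assuming dplyr 1.0.0 or later and that you want proportional allocation, `slice_sample()` can draw the same fraction from every group. `BES_demo` and the value of `sample_size` below are made up for illustration.

```r
library(dplyr)

set.seed(42)  # only to make the demo reproducible

# Hypothetical stand-in for BES, rebuilt from the counts in the question
BES_demo <- tibble(y08 = rep(c(1, 2, 3), times = c(261, 25, 1908)))
sample_size <- 200  # assumed target sample size

str_samp <- BES_demo %>%
  group_by(y08) %>%
  # draw the same fraction of rows from each answer category
  slice_sample(prop = sample_size / nrow(BES_demo)) %>%
  ungroup()

count(str_samp, y08)
```

The number of rows taken per group is rounded down, so the final sample can end up a few rows short of `sample_size`.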