How to obtain the share of observations in a dataset in R?

Time:02-24

I am new to R and have run into a problem. I am analysing a data frame, and the question I am working on has three possible answers. I now want to obtain the share of each answer in my data frame. This is my code so far:

BES %>%
  group_by(y08) %>%
  summarise(count = n())

and it yields

  y08                        count
  <dbl+lbl>                  <int>
1 1 [Yes: trade union]         261
2 2 [Yes: staff association]    25
3 3 [No]                      1908

How can I obtain the absolute number of observations (sum of my integers) and based on that the share of each option? I'd like to create a stratified sample based on this.

(For the stratified sample:

str_samp <-
  BES %>%
  mutate(strata = sample_size * share) %>%
  group_by(y08) %>%
  sample_n(strata) %>%
  ungroup()

This is my code at the moment. The sample size is defined, but I struggle with defining the share variable.)

Thank you for your help!

CodePudding user response:

To get the share/proportions just do this:

BES %>%
  group_by(y08) %>%
  summarise(count = n()) %>%
  mutate(share = count/sum(count))
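
If you also want the absolute number of observations as its own column, here is a minimal sketch. Note that the `BES` built here is a hypothetical toy stand-in (only the three counts are taken from the question), and the names `total` and `shares` are mine:

```r
library(dplyr)

# Hypothetical stand-in for BES: one row per respondent,
# with group sizes matching the counts in the question.
BES <- tibble(y08 = rep(c(1, 2, 3), times = c(261, 25, 1908)))

shares <- BES %>%
  group_by(y08) %>%
  summarise(count = n()) %>%
  mutate(total = sum(count),      # absolute number of observations
         share = count / total)   # proportion of each answer

shares
```

After `summarise()` the result has one row per answer and is no longer grouped, so `sum(count)` adds up all three counts and the shares sum to 1.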

CodePudding user response:

BES %>% count(y08) %>% mutate(share=n/sum(n))
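
For the stratified sample itself, one possible approach (a sketch, not necessarily what you need) is to skip the separate share column and sample the same fraction from every group with `slice_sample()`, which keeps the sample roughly proportional to the shares. The toy `BES`, the value of `sample_size`, and the name `frac` are assumptions for illustration:

```r
library(dplyr)

# Hypothetical stand-in for BES with the counts from the question.
BES <- tibble(y08 = rep(c(1, 2, 3), times = c(261, 25, 1908)))

sample_size <- 200                 # assumed target sample size
frac <- sample_size / nrow(BES)    # same sampling fraction for every stratum

set.seed(1)                        # only to make the draw reproducible
str_samp <- BES %>%
  group_by(y08) %>%
  slice_sample(prop = frac) %>%    # samples within each group
  ungroup()

count(str_samp, y08)
```

Because the per-group sizes are rounded down, the total can come out slightly below `sample_size`.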