Count number of individuals with a condition (dummy) paneled data

Time:03-29

Due to privacy issues, I can't share the original dataset or my original code. Therefore, I have created an example.

Suppose that I want to count how many individuals have obtained a degree in higher education. This means that I want to know for how many individuals HigherEducationDummy == 0. I am struggling with how to do this. In the example below, the correct answer would be 0. I have tried creating a table and using the count/unique functions, but I have no clue how to distinguish between individuals without summing all the '1's.

df <- data.frame(Individual = c("1", "1", "1", "1", "2", "2", "2", "3", "4", "4", "4", "4"),
                 Time = c("2011", "2012", "2013", "2014", "2011", "2012", "2012", "2017", "2014", "2015", "2016", "2017"),
                 HigherEducationDummy = c("1", "1", "1", "1", "0", "0", "0", "1", "0", "0", "0", "0"))

CodePudding user response:

Not sure why the answer would be 0, but based on the rest of the description, it seems you could summarize over the years for each individual.

library(dplyr)

df %>%
  group_by(Individual) %>%
  # TRUE for individuals whose dummy is never "1" in any year
  summarize(neverHE = !any(HigherEducationDummy == "1")) %>%
  pull(neverHE) %>%
  sum()

This tells you how many people never achieved higher education in any of the observed years. You could also replace sum with table to get a count for both categories.
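As a minimal sketch of the table variant, assuming the example data from the question: the summary below flags each individual who has the dummy equal to "1" in at least one year, then tabulates both groups at once. The names everHE, per_person and he_counts are made up for this illustration and do not come from the question.

```r
library(dplyr)

# Example data copied from the question.
df <- data.frame(
  Individual = c("1", "1", "1", "1", "2", "2", "2", "3", "4", "4", "4", "4"),
  Time = c("2011", "2012", "2013", "2014", "2011", "2012", "2012", "2017",
           "2014", "2015", "2016", "2017"),
  HigherEducationDummy = c("1", "1", "1", "1", "0", "0", "0", "1",
                           "0", "0", "0", "0")
)

# One row per individual: TRUE if the dummy is "1" in at least one year.
# everHE is a hypothetical column name chosen for this sketch.
per_person <- df %>%
  group_by(Individual) %>%
  summarize(everHE = any(HigherEducationDummy == "1"))

# Tabulate both categories: FALSE = never "1", TRUE = "1" at least once.
he_counts <- table(per_person$everHE)
he_counts
```

With this data, individuals 1 and 3 fall under TRUE and individuals 2 and 4 under FALSE, so each category has a count of 2.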

CodePudding user response:

In base R, we can do

with(df, sum(!unique(Individual) %in% 
       unique(Individual[HigherEducationDummy == "1"])))
[1] 2
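If it is unclear which side of the dummy should be counted, a possible base R alternative is to build one logical flag per individual with tapply and then count either group. This is only a sketch on the example data from the question; ever_he is a name invented here.

```r
# Example data copied from the question.
df <- data.frame(
  Individual = c("1", "1", "1", "1", "2", "2", "2", "3", "4", "4", "4", "4"),
  Time = c("2011", "2012", "2013", "2014", "2011", "2012", "2012", "2017",
           "2014", "2015", "2016", "2017"),
  HigherEducationDummy = c("1", "1", "1", "1", "0", "0", "0", "1",
                           "0", "0", "0", "0")
)

# One logical value per individual: TRUE if the dummy is "1" in any year.
# ever_he is a hypothetical name used only for this illustration.
ever_he <- tapply(df$HigherEducationDummy == "1", df$Individual, any)

n_with_he <- sum(ever_he)      # individuals with at least one "1"
n_without_he <- sum(!ever_he)  # individuals with no "1" at all
```

Here n_without_he matches the 2 returned above, and n_with_he counts the remaining individuals (1 and 3).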