Due to privacy issues, I can't share the original dataset or my original code. Therefore, I have created an example.
Suppose that I want to count how many individuals have obtained a degree in higher education. This means that I want to know for how many individuals the HEdummy (HigherEducationDummy in the example below) == 0. I am struggling with how to do this. In the example below, the correct answer would be 0. I have tried creating a table and using the count/unique functions, but I have no clue how to distinguish between individuals without summing all the '1's.
df <- data.frame(
  Individual = c("1", "1", "1", "1", "2", "2", "2", "3", "4", "4", "4", "4"),
  Time = c("2011", "2012", "2013", "2014", "2011", "2012", "2012", "2017", "2014", "2015", "2016", "2017"),
  HigherEducationDummy = c("1", "1", "1", "1", "0", "0", "0", "1", "0", "0", "0", "0")
)
CodePudding user response:
Not sure why the answer would be 0, but based on the rest of the description it seems you could summarize over the years for each individual:
library(dplyr)

df %>%
  group_by(Individual) %>%
  # TRUE if the individual never has HigherEducationDummy == "1"
  summarize(noHE = !any(HigherEducationDummy == "1")) %>%
  select(noHE) %>%
  sum()
This tells you how many people never achieved higher education in any of the observed years (2 in the example). You could also replace sum() with table() to get a count of both categories.
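If you want the counts for both groups in one go, something along these lines might work. This is only a sketch on the example data: the names everHE, per_person and he_counts are made up for illustration, and it assumes the dummy is stored as character, as in the question.

```r
library(dplyr)

# Example data from the question (Time is left out because it is not needed here)
df <- data.frame(
  Individual = c("1", "1", "1", "1", "2", "2", "2", "3", "4", "4", "4", "4"),
  HigherEducationDummy = c("1", "1", "1", "1", "0", "0", "0", "1", "0", "0", "0", "0")
)

# One row per individual: TRUE if the dummy is "1" in at least one year
per_person <- df %>%
  group_by(Individual) %>%
  summarize(everHE = any(HigherEducationDummy == "1"))

# Number of individuals in each category (with and without higher education)
he_counts <- per_person %>%
  count(everHE)

he_counts
```

Collapsing to one row per individual first is what keeps repeated years from being counted more than once.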
CodePudding user response:
In base R, we can do:
# Count individuals that never appear with HigherEducationDummy == "1"
with(df, sum(!unique(Individual) %in%
               unique(Individual[HigherEducationDummy == "1"])))
[1] 2
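Another base R option, in case you also want to see the flag per individual, could be tapply(). Again just a sketch on the example data; ever_he, n_with and n_without are names I picked for illustration.

```r
# Example data from the question (Time is left out because it is not needed here)
df <- data.frame(
  Individual = c("1", "1", "1", "1", "2", "2", "2", "3", "4", "4", "4", "4"),
  HigherEducationDummy = c("1", "1", "1", "1", "0", "0", "0", "1", "0", "0", "0", "0")
)

# Named logical vector: TRUE for each individual with the dummy "1" in at least one row
ever_he <- tapply(df$HigherEducationDummy == "1", df$Individual, any)

n_with <- sum(ever_he)      # individuals with higher education
n_without <- sum(!ever_he)  # individuals without higher education
```

Here ever_he has one element per individual, so both counts come from the same intermediate result.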