How to deselect based on condition (R)

Time:03-09

I have a dataset that looks at college enrollment. I'm trying to find the proportion of students enrolled in biology per institute. I first find the total enrollment (EFTOTLT) for each school using:

    # find sum of students by school
    total_enrollment <- school_data_unit_cip %>%
      group_by(UNITID) %>%
      summarise(Freq = sum(EFTOTLT))

This yields a tibble that's 2,207 x 2. Then I find the biology enrollment for each school using:

    # find total biology enrollment by school
    total_biol_enrollment <- school_data_unit_cip %>%
      group_by(UNITID) %>%
      filter(CIPCODE == "26") %>%
      summarise(Freq = sum(EFTOTLT))

Then I realize this yields a tibble that's 1,560 x 2. So there are obviously schools that don't offer biology or don't have biology students.

Is there a way to deselect schools from the first tibble that don't have the CIPCODE 26? Or I guess is there a way to remove schools from the first list that don't exist in the second list?

CodePudding user response:

Without sample data it's a guess, but ... assuming that each school may have more than one CIPCODE, and you want only schools that contain at least CIPCODE == "26", then perhaps

    school_data_unit_cip %>%
      group_by(UNITID) %>%
      # keep all rows of schools that have at least one CIPCODE "26" row
      filter("26" %in% CIPCODE)
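To make the idea concrete, here is a minimal sketch on made-up data. Only the column names (UNITID, CIPCODE, EFTOTLT) come from the question; the values are invented for illustration. It assumes dplyr is installed, groups by school, keeps the schools with at least one CIPCODE "26" row, and then totals enrollment per school.

```r
library(dplyr)

# Hypothetical toy data; only the column names come from the question.
school_data_unit_cip <- tibble(
  UNITID  = c(1, 1, 2, 2, 3),
  CIPCODE = c("26", "11", "11", "52", "26"),
  EFTOTLT = c(10, 30, 20, 40, 50)
)

# Within each school, the condition is a single TRUE/FALSE, so filter()
# keeps or drops all of that school's rows at once.
total_enrollment <- school_data_unit_cip %>%
  group_by(UNITID) %>%
  filter("26" %in% CIPCODE) %>%
  summarise(Freq = sum(EFTOTLT))

print(total_enrollment)
```

In this toy data, school 2 has no "26" row, so it is dropped before the totals are computed.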

CodePudding user response:

Updated after the remarks in the other answer.

I think you can filter them out if you group first, but I don't know for sure without the data:

    total_biol_enrollment <- school_data_unit_cip %>%
      group_by(UNITID) %>%
      # keep only schools that have at least one CIPCODE "26" row
      filter(any(CIPCODE == "26"))
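Another option, closer to the question's second phrasing (remove schools from the first tibble that don't exist in the second), may be a filtering join. The sketch below uses invented toy data with the question's column names. `semi_join()` keeps the rows of the first table that have a match in the second, and `inner_join()` is then one way to get the biology proportion per school. The names `schools_with_biol`, `biol_proportion`, and `prop_biol` are made up for this example.

```r
library(dplyr)

# Hypothetical toy data; only the column names come from the question.
school_data_unit_cip <- tibble(
  UNITID  = c(1, 1, 2, 2, 3),
  CIPCODE = c("26", "11", "11", "52", "26"),
  EFTOTLT = c(10, 30, 20, 40, 50)
)

# The two summaries from the question.
total_enrollment <- school_data_unit_cip %>%
  group_by(UNITID) %>%
  summarise(Freq = sum(EFTOTLT))

total_biol_enrollment <- school_data_unit_cip %>%
  filter(CIPCODE == "26") %>%
  group_by(UNITID) %>%
  summarise(Freq = sum(EFTOTLT))

# Keep only the schools in the first tibble that also appear in the second.
schools_with_biol <- semi_join(total_enrollment, total_biol_enrollment,
                               by = "UNITID")

# Join the two totals and compute the biology share per school.
biol_proportion <- inner_join(total_enrollment, total_biol_enrollment,
                              by = "UNITID",
                              suffix = c("_total", "_biol")) %>%
  mutate(prop_biol = Freq_biol / Freq_total)

print(biol_proportion)
```

If schools without biology should stay in the result with a proportion of 0, a `left_join()` followed by replacing the missing values would be the variant to try instead.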