I am using the data set Income_Democracy.dta and am trying to find the names of the countries that have an average dem_ind value greater than 0.95.
I figure I need to subset the countries, find the average, and return that as a new data set, but I can't figure out how to do it without the specific country names. I've fiddled with the which and subset functions, but I'm new to R and need help.
For a specific country, I know you can do
mean(subset(incdem$dem_ind, incdem$country =="Australia"))
but I'm unsure how to generalise.
CodePudding user response:
Group by 'country', get the mean of 'dem_ind', filter the rows where the 'Avg' column value is greater than 0.95, and pull the 'country' column as a vector:
library(dplyr)
incdem %>%
  group_by(country) %>%
  summarise(Avg = mean(dem_ind, na.rm = TRUE), .groups = 'drop') %>%
  filter(Avg > 0.95) %>%
  pull(country)
Another option, in base R, is
names(which(sapply(split(incdem$dem_ind, incdem$country), mean,
                   na.rm = TRUE) > 0.95))
If the average should instead fall within a range of values (here, between 0.2 and 0.8):
names(which(sapply(split(incdem$dem_ind, incdem$country), function(x) {
  avg <- mean(x, na.rm = TRUE)
  avg > 0.2 & avg < 0.8
})))
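To check the logic without the original file, here is a minimal sketch on a made-up data frame. The name `toy` and all of its values are assumptions for illustration only, not taken from Income_Democracy.dta. It uses `tapply`, a base R alternative to the `split`/`sapply` combination, to compute the per-country mean.

```r
# Toy stand-in for incdem; these values are invented for illustration.
toy <- data.frame(
  country = c("A", "A", "B", "B", "C", "C"),
  dem_ind = c(1.00, 0.98, 0.50, NA, 0.96, 0.90)
)

# Mean of dem_ind per country, ignoring missing values.
avg_by_country <- tapply(toy$dem_ind, toy$country, mean, na.rm = TRUE)

# Names of the countries whose average exceeds the threshold.
high <- names(which(avg_by_country > 0.95))
print(high)
```

Only country "A" passes here: "C" has one value above 0.95, but its average (0.93) is below the threshold. This is the difference between filtering on the group mean and filtering on individual rows.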