I have a dataframe of economics data, among others the infant mortality rate and the region of the world. I extracted outlier values per region.
income infant region oil
Saudi.Arabia 1530 650.0 Asia yes
Afganistan 75 400.0 Asia no
Libya 3010 300.0 Africa yes
Zambia 310 259.0 Africa no
Guinea 79 216.0 Africa no
Burma 73 200.0 Asia no
I have (1) the values of the outliers and (2) the name of the corresponding region for each value.
> out$out # values
[1] 300.0 170.0 650.0 400.0 44.8 43.3
> out$names[out$group] # name of region corresponding to each outlier
[1] "Africa" "Americas" "Asia" "Asia" "Europe" "Europe"
I now want to filter the dataframe for the rows corresponding to each value pair. E.g. I want the rowname for the row with an infant
value of 170 and region
being Americas. I cannot use the joint condition of the column values being in both lists, because there are other rows with the same values but different groups (as they are not outliers in their group).
I know that I could do this in two steps, but I wonder if there is a more elegant solution in R to do this.
Thanks a lot in advance!
CodePudding user response:
df <- data.frame(income = c(1530, 75, 3010, 310, 79, 73),
infant = c(650, 400, 300, 259, 216, 200),
region = c("Asia", "Asia", "Africa", "Africa", "Africa", "Asia"),
oil = c("yes", "no", "yes", "no", "no", "no"),
row.names = c("Saudi.Arabia", "Afganistan", "Libya",
"Zambia", "Guinea", "Burma"))
df
# income infant region oil
# Saudi.Arabia 1530 650 Asia yes
# Afganistan 75 400 Asia no
# Libya 3010 300 Africa yes
# Zambia 310 259 Africa no
# Guinea 79 216 Africa no
# Burma 73 200 Asia no
x <- c(300.0, 170.0, 650.0, 400.0, 44.8, 43.3)
y <- c("Africa", "Americas", "Asia", "Asia", "Europe", "Europe")
df[paste0(df$infant, df$region) %in% paste0(x, y),]
# income infant region oil
# Saudi.Arabia 1530 650 Asia yes
# Afganistan 75 400 Asia no
# Libya 3010 300 Africa yes