keep the order when filtering a data frame in R

Time:06-14

I have a dataset, df, as follows:

df = structure(list(filename = c("df1.txt", "df2.txt", "df3.txt", 
"df4.txt"), project = c("alpha", "beta", "gamma", "delta"), lat = c(55.1852777777778, 
52.6252777777778, 51.446272, 50.688111), long = c(7.15138888888889, 
3.96027777777778, 1.078051, -0.343189)), row.names = c(NA, 4L
), class = "data.frame")

I would like to select two filenames based on two given projects:

selected = c("delta", "gamma")

To get the filenames I'm using:

fname <- (df[df$project %in% selected,])$filename

But this returns fname as:

fname
[1] "df3.txt" "df4.txt"

The order differs from the order I passed in the selected object. I would like to get fname as:

fname
[1] "df4.txt" "df3.txt"

since df4 refers to delta and df3 to gamma. Using dplyr and filter, I get the same result. How can I get the output in the same order as the input?

Answer:

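Use match to get position indices for subsetting. %in% only returns a logical vector with one value per row of the data, so subsetting with it returns the values in the order in which they occur in the data (wherever the vector is TRUE), not in the order of selected. match(selected, project) instead returns, for each element of selected, the position of its first match in project: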

with(df, filename[match(selected, project)])
[1] "df4.txt" "df3.txt"
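
To make the difference concrete, here is a minimal sketch that prints the two intermediate vectors. It rebuilds only the filename and project columns of the question's df with data.frame(); the names in_mask and idx are mine, chosen for illustration.

```r
# Same filename/project columns as the question's df (lat/long omitted)
df <- data.frame(
  filename = c("df1.txt", "df2.txt", "df3.txt", "df4.txt"),
  project  = c("alpha", "beta", "gamma", "delta"),
  stringsAsFactors = FALSE
)
selected <- c("delta", "gamma")

# %in% gives one logical per row of df, so subsetting keeps df's row order
in_mask <- df$project %in% selected
print(in_mask)
# [1] FALSE FALSE  TRUE  TRUE

# match() gives one row position per element of selected, in selected's order
idx <- match(selected, df$project)
print(idx)
# [1] 4 3

fname <- df$filename[idx]
print(fname)
# [1] "df4.txt" "df3.txt"
```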

Or using slice in dplyr:

library(dplyr)
df %>% 
  slice(match(selected, project)) %>% 
  pull(filename)
[1] "df4.txt" "df3.txt"
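
One caveat with both approaches above: match only returns the first matching position, and it returns NA for a value of selected that is not in project. If either case can occur in your data, a possible variant is to filter with %in% first and then reorder the rows by their position in selected. The sketch below uses base R and made-up data (df2, selected2, the extra "df5.txt" row, and the "omega" project are hypothetical, not from the question).

```r
# Hypothetical data: "gamma" occurs twice, and "omega" is requested but absent
df2 <- data.frame(
  filename = c("df1.txt", "df2.txt", "df3.txt", "df4.txt", "df5.txt"),
  project  = c("alpha", "beta", "gamma", "delta", "gamma"),
  stringsAsFactors = FALSE
)
selected2 <- c("delta", "omega", "gamma")

# Plain match() yields c(4, NA, 3): the second "gamma" row is missed
# and the absent "omega" produces an NA
print(match(selected2, df2$project))

# Keep every matching row, then sort rows by each project's position in selected2
sub <- df2[df2$project %in% selected2, ]
sub <- sub[order(match(sub$project, selected2)), ]
fname2 <- sub$filename
print(fname2)
# [1] "df4.txt" "df3.txt" "df5.txt"
```

In dplyr the same idea would be roughly filter(project %in% selected2) followed by arrange(match(project, selected2)) and pull(filename).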