I have a data frame like this:
f <- data.frame(gene=c("gene1", "gene1", "gene2", "gene2", "gene2", "gene3","gene3", "gene3"),
distance = c(10, -5, 40, -60, 0, -150, 5, -200))
For each gene, I would like to keep only the row whose distance is closest to zero (but not zero itself), in order to get this result:
gene distance
gene1 -5
gene2 40
gene3 5
I've tried this
distances <- f %>%
group_by(gene) %>%
filter(distance == min(abs(distance)) & distance != 0) %>%
ungroup
But it does not work as expected. Any advice would be great!
CodePudding user response:
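Your filter compares the signed distance with min(abs(distance)), so a negative value such as -5 never equals 5, and the 0 in gene2 becomes the group minimum and is then excluded. Instead, take the absolute value on both sides and compute the minimum over the non-zero distances only, then keep the rows that match it.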
library(dplyr)
f %>%
group_by(gene) %>%
filter(abs(distance) == min(abs(distance[distance != 0]))) %>%
ungroup()
# gene distance
# <chr> <dbl>
#1 gene1 -5
#2 gene2 40
#3 gene3 5
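If you prefer to spell out the two steps separately, here is a possible alternative sketch: drop the zero rows first, then pick the row with the smallest absolute distance per gene. It assumes dplyr 1.0.0 or later (for slice_min()), and the name closest is just a placeholder chosen for this example.

```r
library(dplyr)

# Same example data as in the question
f <- data.frame(gene = c("gene1", "gene1", "gene2", "gene2", "gene2",
                         "gene3", "gene3", "gene3"),
                distance = c(10, -5, 40, -60, 0, -150, 5, -200))

closest <- f %>%
  filter(distance != 0) %>%            # discard exact zeros up front
  group_by(gene) %>%
  slice_min(abs(distance), n = 1) %>%  # smallest absolute distance per gene
  ungroup()

closest
```

By default slice_min() keeps ties, so a gene with both -5 and 5 would return two rows; passing with_ties = FALSE would keep just one. A gene whose distances are all zero disappears from the result, since its rows are filtered out before grouping.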