How to select the closest value to zero (positive and negative) in a data frame

Time:09-22

I have a data frame like this:

f <- data.frame(gene=c("gene1", "gene1", "gene2", "gene2", "gene2", "gene3","gene3", "gene3"), 
                distance = c(10, -5, 40, -60, 0, -150, 5, -200))

For each gene, I would like to select only the row whose distance is closest to zero (but not zero itself), to get this result:

gene   distance
gene1    -5
gene2    40
gene3     5

I've tried this:

distances <- f %>%
  group_by(gene) %>%
  filter(distance == min(abs(distance)) & distance != 0) %>%
  ungroup

But it does not work as expected. Any advice would be great!

Answer:

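The attempt fails for two reasons: it compares the signed distance with an absolute value, so negative rows can never match, and min(abs(distance)) still includes the zero. Instead, take the minimum of abs(distance) over the non-zero values only and keep the rows whose abs(distance) equals it. The zero rows drop out on their own, because 0 never equals that non-zero minimum.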

library(dplyr)

f %>%
  group_by(gene) %>%
  filter(abs(distance) == min(abs(distance[distance != 0]))) %>%
  ungroup()

#  gene  distance
#  <chr>    <dbl>
#1 gene1       -5
#2 gene2       40
#3 gene3        5
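
If dplyr is not available, the same idea can be sketched in base R. This is only an illustration under a few assumptions: the helper name closest_nonzero is hypothetical, the distance column is assumed to contain no NA values, and ties (for example -5 and 5 in the same gene) are all kept rather than reduced to one row.

```r
# Sample data from the question
f <- data.frame(
  gene = c("gene1", "gene1", "gene2", "gene2", "gene2", "gene3", "gene3", "gene3"),
  distance = c(10, -5, 40, -60, 0, -150, 5, -200)
)

# Hypothetical helper: per gene, keep the non-zero row(s) closest to zero.
# Assumes df has columns `gene` and `distance` with no NA in `distance`.
closest_nonzero <- function(df) {
  # Drop exact zeros first so they cannot become the group minimum
  nz <- df[df$distance != 0, ]
  # ave() returns, for every row, the minimum |distance| within that row's gene
  group_min <- ave(abs(nz$distance), nz$gene, FUN = min)
  # Keep rows whose |distance| equals their group's minimum (ties are kept)
  out <- nz[abs(nz$distance) == group_min, ]
  rownames(out) <- NULL  # reset row names after subsetting
  out
}

result <- closest_nonzero(f)
print(result)
```

Filtering the zeros out before computing the group minimum keeps the comparison simple; a gene that has only zero distances would then not appear in the result at all.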