How to find the closest row in data frame to my sample row in R?

Time:04-11

I have the data frame iris and a one-row data frame of sample values, my_row:

my_row <- structure(list(Sepal.Length = 4.65, Sepal.Width = 3.19, Petal.Length = 1.44, 
    Petal.Width = 0.3, Species = structure(1L, .Label = c("setosa", 
    "versicolor", "virginica"), class = "factor")), row.names = 1L, class = "data.frame")

> my_row
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1         4.65        3.19         1.44         0.3  setosa

> head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

How can I find the row in the iris data frame that is closest to my_row?

Answer:

One option is to filter 'iris' to the rows whose 'Species' matches the 'Species' in 'my_row', take the absolute difference between each numeric column and the corresponding value in 'my_row', and sum those differences per row with rowSums (this is the Manhattan distance). The sum is stored in a 'new' column, and slice_min keeps the row with the smallest value:

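library(dplyr)

iris %>%
  # keep only rows of the same species as my_row
  filter(Species == my_row$Species) %>%
  # per-row sum of absolute differences across the numeric columns
  mutate(new = rowSums(across(where(is.numeric),
                              ~ abs(.x - my_row[[cur_column()]])))) %>%
  # keep the row(s) with the smallest distance (ties are all returned)
  slice_min(order_by = new, n = 1)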

Output:

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species new
1          4.6         3.2          1.4         0.2  setosa 0.2
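
For comparison, the same steps can be sketched in base R without dplyr. This is only an illustration under a few assumptions: 'my_row' is rebuilt here so the block runs on its own, and the names 'num_cols', 'candidates', 'diffs', 'dist', and 'closest' are made up for this example and do not appear in the answer above.

```r
# my_row rebuilt from the values in the question so the block is self-contained
my_row <- data.frame(
  Sepal.Length = 4.65, Sepal.Width = 3.19,
  Petal.Length = 1.44, Petal.Width = 0.3,
  Species = factor("setosa", levels = levels(iris$Species))
)

# Names of the numeric columns that take part in the distance
num_cols <- names(iris)[sapply(iris, is.numeric)]

# Keep only rows of the same species as my_row
candidates <- iris[iris$Species == my_row$Species, ]

# Subtract my_row's value from each column (sweep over columns),
# then sum the absolute differences per row: the Manhattan distance
diffs <- sweep(as.matrix(candidates[num_cols]), 2, unlist(my_row[num_cols]))
candidates$dist <- rowSums(abs(diffs))

# which.min() returns the first minimum, so a tie resolves to the earliest row
closest <- candidates[which.min(candidates$dist), ]
closest
```

Unlike slice_min, which returns every tied row by default, which.min keeps only the first one. Replacing rowSums(abs(diffs)) with sqrt(rowSums(diffs^2)) would rank the rows by Euclidean distance instead.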