Home > Software design >  Find the Maximum Value with respect to another within two data frames (VLOOKUP which returns Max Val
Find the Maximum Value with respect to another within two data frames (VLOOKUP which returns Max Val

Time:11-04

I am trying to find the do a function which is similar to a vlookup in excel but which returns the maximum value and the other values in the same row. The data frame looks like this:

enter image description here enter image description here

The data frame which I am dealing with are given below:

dput(Book3)
structure(list(Item = c("ABA", "ABB", "ABC", "ABD", "ABE", "ABF"
)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-6L))

dput(Book4)
structure(list(Item = c("ABA", "ABB", "ABC", "ABD", "ABE", "ABF", 
"ABA", "ABB", "ABC", "ABD", "ABE", "ABF", "ABA", "ABB", "ABC", 
"ABD", "ABE", "ABF"), Max1 = c(12, 68, 27, 17, 74, 76, 78, 93, 
94, 98, 46, 90, 5, 58, 67, 64, 34, 97), Additional1 = c(40, 66, 
100, 33, 66, 19, 8, 70, 21, 93, 48, 34, 44, 89, 74, 20, 0, 47
), Additional2 = c(39, 31, 85, 58, 0, 2, 57, 28, 31, 32, 15, 
22, 93, 41, 57, 81, 95, 46)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -18L))

The Expected output for this is given below:

enter image description here

CodePudding user response:

You are looking for slice_max:

library(dplyr)
Book4 %>% 
  group_by(Item) %>% 
  slice_max(Max1)

#   Item   Max1 Additional1 Additional2
# 1 ABA      78           8          57
# 2 ABB      93          70          28
# 3 ABC      94          21          31
# 4 ABD      98          93          32
# 5 ABE      74          66           0
# 6 ABF      97          47          46

CodePudding user response:

Using base R

subset(Book4, Max1 == ave(Max1, Item, FUN = max))

-output

# A tibble: 6 × 4
  Item   Max1 Additional1 Additional2
  <chr> <dbl>       <dbl>       <dbl>
1 ABE      74          66           0
2 ABA      78           8          57
3 ABB      93          70          28
4 ABC      94          21          31
5 ABD      98          93          32
6 ABF      97          47          46

CodePudding user response:

An alternative base solution that is more resilient to floating-point precision problems (c.f., Why are these numbers not equal?, https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f). It also allows two behavior options if there are duplicate max-values:

  1. if you want all of them, use ties.method = "min";
  2. if you want the first (or just one) of them, then ties.method = "first".
Book4[ave(Book4$Max1, Book4$Item, FUN = function(z) rank(-z, ties.method = "first")) == 1,]
# # A tibble: 6 x 4
#   Item   Max1 Additional1 Additional2
#   <chr> <dbl>       <dbl>       <dbl>
# 1 ABE      74          66           0
# 2 ABA      78           8          57
# 3 ABB      93          70          28
# 4 ABC      94          21          31
# 5 ABD      98          93          32
# 6 ABF      97          47          46

CodePudding user response:

Using R base aggregate max merge

   > merge(Book4, aggregate(Max1~Item, data = Book4, max), by = c("Item", "Max1"))
  Item Max1 Additional1 Additional2
1  ABA   78           8          57
2  ABB   93          70          28
3  ABC   94          21          31
4  ABD   98          93          32
5  ABE   74          66           0
6  ABF   97          47          46
  • Related