I am trying to find the do a function which is similar to a vlookup in excel but which returns the maximum value and the other values in the same row. The data frame looks like this:
The data frame which I am dealing with are given below:
dput(Book3)
structure(list(Item = c("ABA", "ABB", "ABC", "ABD", "ABE", "ABF"
)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-6L))
dput(Book4)
structure(list(Item = c("ABA", "ABB", "ABC", "ABD", "ABE", "ABF",
"ABA", "ABB", "ABC", "ABD", "ABE", "ABF", "ABA", "ABB", "ABC",
"ABD", "ABE", "ABF"), Max1 = c(12, 68, 27, 17, 74, 76, 78, 93,
94, 98, 46, 90, 5, 58, 67, 64, 34, 97), Additional1 = c(40, 66,
100, 33, 66, 19, 8, 70, 21, 93, 48, 34, 44, 89, 74, 20, 0, 47
), Additional2 = c(39, 31, 85, 58, 0, 2, 57, 28, 31, 32, 15,
22, 93, 41, 57, 81, 95, 46)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -18L))
The Expected output for this is given below:
CodePudding user response:
You are looking for slice_max
:
library(dplyr)
Book4 %>%
group_by(Item) %>%
slice_max(Max1)
# Item Max1 Additional1 Additional2
# 1 ABA 78 8 57
# 2 ABB 93 70 28
# 3 ABC 94 21 31
# 4 ABD 98 93 32
# 5 ABE 74 66 0
# 6 ABF 97 47 46
CodePudding user response:
Using base R
subset(Book4, Max1 == ave(Max1, Item, FUN = max))
-output
# A tibble: 6 × 4
Item Max1 Additional1 Additional2
<chr> <dbl> <dbl> <dbl>
1 ABE 74 66 0
2 ABA 78 8 57
3 ABB 93 70 28
4 ABC 94 21 31
5 ABD 98 93 32
6 ABF 97 47 46
CodePudding user response:
An alternative base solution that is more resilient to floating-point precision problems (c.f., Why are these numbers not equal?, https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f). It also allows two behavior options if there are duplicate max-values:
- if you want all of them, use
ties.method = "min"
; - if you want the first (or just one) of them, then
ties.method = "first"
.
Book4[ave(Book4$Max1, Book4$Item, FUN = function(z) rank(-z, ties.method = "first")) == 1,]
# # A tibble: 6 x 4
# Item Max1 Additional1 Additional2
# <chr> <dbl> <dbl> <dbl>
# 1 ABE 74 66 0
# 2 ABA 78 8 57
# 3 ABB 93 70 28
# 4 ABC 94 21 31
# 5 ABD 98 93 32
# 6 ABF 97 47 46
CodePudding user response:
Using R base aggregate
max
merge
> merge(Book4, aggregate(Max1~Item, data = Book4, max), by = c("Item", "Max1"))
Item Max1 Additional1 Additional2
1 ABA 78 8 57
2 ABB 93 70 28
3 ABC 94 21 31
4 ABD 98 93 32
5 ABE 74 66 0
6 ABF 97 47 46