So I have something like this:
data.frame(content = c("a","a","b","b","c","c"),
eje = c("politics","sports","education","sports","health","politics"),
value = c(3,2,1,2,1,1))
I'd like to group by content and keep, within each group, the rows whose eje has the highest value, keeping all tied rows when there is a tie.
So for the sample I'd end up with:
data.frame(content = c("a","b","c","c"),
eje = c("politics","sports","health","politics"),
value = c(3,2,1,1))
In SQL I'd do something like RANK() OVER (PARTITION BY content ORDER BY value DESC) and then keep only the rows where the resulting rank column equals 1.
CodePudding user response:
d = data.frame(content = c("a","a","b","b","c","c"),
eje = c("politics","sports","education","sports","health","politics"),
value = c(3,2,1,2,1,1))
library(dplyr)
d %>%
  group_by(content) %>%
  slice_max(value)  # with_ties = TRUE by default, so tied rows are kept
# # A tibble: 4 × 3
# # Groups:   content [3]
#   content eje      value
#   <chr>   <chr>    <dbl>
# 1 a       politics     3
# 2 b       sports       2
# 3 c       health       1
# 4 c       politics     1
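If you want something closer to the SQL RANK idea from the question, here is a rough sketch using dplyr's min_rank() and desc(). The names d and top are just placeholders I picked; min_rank() gives tied values the same (lowest) rank, so filtering on rank 1 should keep ties as well.

```r
library(dplyr)

# Sample data from the question (the name `d` is only a placeholder)
d <- data.frame(content = c("a","a","b","b","c","c"),
                eje = c("politics","sports","education","sports","health","politics"),
                value = c(3,2,1,2,1,1))

top <- d %>%
  group_by(content) %>%
  # rank within each group, highest value first; tied values share rank 1
  filter(min_rank(desc(value)) == 1) %>%
  ungroup()

top
```

This spells out the ranking step explicitly, which may help if you later need the top 2 or 3 ranks rather than only the maximum.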
CodePudding user response:
A data.table option:
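library(data.table)
df <- data.frame(content = c("a","a","b","b","c","c"),
                 eje = c("politics","sports","education","sports","health","politics"),
                 value = c(3,2,1,2,1,1))
dt <- data.table(df)
dt[dt[, .I[value == max(value)], by=content]$V1]  # .I = row numbers of each group's max rows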
Output:
   content      eje value
1:       a politics     3
2:       b   sports     2
3:       c   health     1
4:       c politics     1
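For comparison, the same filter can probably be done without any packages. This is only a sketch in base R, assuming the data frame is called df (my naming, as is top_rows): ave() repeats each group's maximum along the rows, and the comparison keeps every row that matches it, ties included.

```r
# Sample data from the question (the name `df` is an assumption)
df <- data.frame(content = c("a","a","b","b","c","c"),
                 eje = c("politics","sports","education","sports","health","politics"),
                 value = c(3,2,1,2,1,1))

# ave() returns, for every row, the max of `value` within that row's content group
group_max <- ave(df$value, df$content, FUN = max)

# keep the rows whose value equals their group's maximum (ties are kept)
top_rows <- df[df$value == group_max, ]

top_rows
```

Note that the original row names (1, 4, 5, 6) are kept; rownames(top_rows) <- NULL resets them if that matters.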