How to PartitionBy (SQL) to rank on RStudio

Time:05-13

So I have something like this:

data.frame(content = c("a","a","b","b","c","c"),
           eje = c("politics","sports","education","sports","health","politics"),
           value = c(3,2,1,2,1,1))

I'd like to group by content and, within each group, keep the rows of eje with the highest value, keeping every tied row when there is a tie.

So for the sample data I'd end up with:

data.frame(content = c("a","b","c","c"),
           eje = c("politics","sports","health","politics"),
           value = c(3,2,1,1))

In SQL I'd do something like RANK() OVER (PARTITION BY content ORDER BY value DESC) and then keep only the rows where the new rank column equals 1.

CodePudding user response:

d = data.frame(content = c("a","a","b","b","c","c"),
           eje = c("politics","sports","education","sports","health","politics"),
           value = c(3,2,1,2,1,1))

library(dplyr)
d %>%
  group_by(content) %>%
  slice_max(value)  # with_ties = TRUE is the default, so tied rows are all kept
# # A tibble: 4 × 3
# # Groups:   content [3]
#   content eje      value
#   <chr>   <chr>    <dbl>
# 1 a       politics     3
# 2 b       sports       2
# 3 c       health       1
# 4 c       politics     1
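
If you want something that reads more like the SQL in the question, here is a minimal sketch that builds the rank column explicitly and then filters on it. It assumes dplyr's min_rank(), which gives tied rows the same rank, as SQL's RANK() does; the column name rnk and the object names ranked and top are just illustrative.

```r
library(dplyr)

# Sample data from the question
d <- data.frame(content = c("a","a","b","b","c","c"),
                eje = c("politics","sports","education","sports","health","politics"),
                value = c(3,2,1,2,1,1))

ranked <- d %>%
  group_by(content) %>%                     # PARTITION BY content
  mutate(rnk = min_rank(desc(value))) %>%   # RANK() ... ORDER BY value DESC
  ungroup()

# Keep only the top-ranked rows; ties share rank 1, so both are kept
top <- ranked %>% filter(rnk == 1)
print(top)
```

Keeping the rnk column around is handy if you later want the top 2 or top 3 per group instead of only the first.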

CodePudding user response:

data.table option:

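library(data.table)
dt <- data.table(content = c("a","a","b","b","c","c"),
                 eje = c("politics","sports","education","sports","health","politics"),
                 value = c(3,2,1,2,1,1))
# .I holds row numbers; collect, per content, the rows where value is the group maximum
dt[dt[, .I[value == max(value)], by = content]$V1]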

Output:

   content      eje value
1:       a politics     3
2:       b   sports     2
3:       c   health     1
4:       c politics     1
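
For comparison, the same rank-then-filter idea can be sketched in base R without extra packages. This is only one possible approach: it assumes ave() to apply rank() within each content group, and the column name rnk is made up for the example.

```r
# Sample data from the question
d <- data.frame(content = c("a","a","b","b","c","c"),
                eje = c("politics","sports","education","sports","health","politics"),
                value = c(3,2,1,2,1,1))

# Rank value within each content group, highest first (hence the minus sign).
# ties.method = "min" gives tied rows the same rank, like SQL's RANK().
d$rnk <- ave(-d$value, d$content,
             FUN = function(v) rank(v, ties.method = "min"))

# Keep only the top-ranked rows of each group
top <- d[d$rnk == 1, ]
print(top)
```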