Unable to take the top N highest values from a dataframe

Time:07-13

I've got a dataframe of the following pattern:

Date        Type Value
2020-02-09  BUS  12
2020-02-09  ARS  226394
2020-02-09  ZED  27566
2020-02-09  YED  217098
2020-02-09  DKK  208463
2020-02-09  GBR  9320
2020-02-09  INY  156607
2020-02-09  CHI  19790
2020-02-09  IDR  24541
2020-02-09  KRW  1074419
2020-02-09  WOK  17250
2020-02-09  STR  12249
2020-02-09  HUF  43651
2020-02-09  HAD  45121

I'm trying to create a subset that will be made up of the top 7 highest values, but for whatever reason, it won't allow me to do so. Data type for each column is: Date: chr, Type: chr, Value: int.

I've tried (using the data.table and dplyr libraries):

new_df <- as.data.table(df)[order(Type, -Value), head(.SD, 7), by = Type]

And:

new_df <- df %>% group_by(Type) %>% slice_max(order_by = Value, n = 7)

And:

new_df <- df %>% group_by(Type) %>% slice_max(Value, n=7, with_ties = FALSE)

Does anyone have any idea where I'm going wrong?

CodePudding user response:

Using base R syntax, for example:

vector <- 1:10
lettrs <- letters[1:10]
df <- data.frame("Value"=vector, "Type"=lettrs)
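# Sort by Value in descending order (sorting by Type first would keep
# the rows in a..j order and return the 7 lowest values instead)
new_df <- df[order(-df$Value), ]
# Keep the first 7 rows, i.e. the 7 highest values
new_df <- new_df[1:7, ]
   Value Type
10    10    j
9      9    i
8      8    h
7      7    g
6      6    f
5      5    e
4      4    d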

CodePudding user response:

Try this,

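# Row indices sorted by Value, largest first (named idx so it does not mask order())
idx <- order(df$Value, decreasing = TRUE)
# Keep the rows holding the 7 highest values
new_df <- df[idx[1:7], ]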
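
One likely reason the attempts in the question return every row is that they group by Type: in the sample, each Type appears only once, so a "top 7 per Type" keeps all 14 rows. Below is a minimal base R sketch of that observation. It assumes the goal is the 7 highest values per Date (the question does not say which grouping is wanted), and the helper names sorted, rank_in_date and top7 are made up for illustration.

```r
# Sample data copied from the question (a single Date, 14 distinct Types).
df <- data.frame(
  Date  = "2020-02-09",
  Type  = c("BUS", "ARS", "ZED", "YED", "DKK", "GBR", "INY",
            "CHI", "IDR", "KRW", "WOK", "STR", "HUF", "HAD"),
  Value = c(12L, 226394L, 27566L, 217098L, 208463L, 9320L, 156607L,
            19790L, 24541L, 1074419L, 17250L, 12249L, 43651L, 45121L),
  stringsAsFactors = FALSE
)

# Every Type occurs exactly once, so taking the top 7 within each Type
# cannot drop any rows.
rows_per_type <- table(df$Type)
all(rows_per_type == 1)

# Assumption: the ranking should happen within each Date instead.
# Sort by Date, then by descending Value, and number the rows per Date.
sorted <- df[order(df$Date, -df$Value), ]
rank_in_date <- ave(seq_len(nrow(sorted)), sorted$Date, FUN = seq_along)

# Keep the first 7 rows of each Date group.
top7 <- sorted[rank_in_date <= 7, ]
top7
```

With several dates in the data, the same code keeps up to 7 rows for each Date; with the single date shown here it is equivalent to taking the 7 highest values overall.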