Find the nearest bigger number in R

Time:11-30

I have a dataset like this:

row  num Group
  1 3     B
  2 6     A
  3 12    A
  4 15    B
  5 16    A
  6 18    A
  7 20    B
  8 25    A
  9 27    B
 10 29    B
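
For anyone who wants to reproduce this, the table above could be entered as a data frame along these lines. It is only a sketch: the name f is the one used further down in the question, and the column types (integer row, numeric num, character Group) are an assumption.

```r
# Rebuild the example table; `f` is the name used later in the question.
f <- data.frame(
  row   = 1:10,
  num   = c(3, 6, 12, 15, 16, 18, 20, 25, 27, 29),
  Group = c("B", "A", "A", "B", "A", "A", "B", "A", "B", "B")
)

# The num values that belong to group A, in their original order
group_a_values <- f$num[f$Group == "A"]
group_a_values
```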

In R, I would like to compare an input number with the values in num, and I would like to find the location of the closest bigger value in Group A only.

For example, if the input number is 8, then the closest, bigger value in group A should be 12, and I would like to get its location which should be 3. If the input is 18, then the value returned should be 18, and the location should be 6. If the input is 20, then the value returned should be 25, and the location should be 8.

I tried which.min, but for some reason index 1 is always returned regardless of my input number.

#called the dataset f

which.min(f$num[f$Group=="A"][f$num[f$Group=="A"]>=8])

I would like to stick with base R if possible. I would appreciate any thoughts on what I did wrong and how to fix it.

Thank you.

CodePudding user response:

For the nearest bigger number alone, you could write a function like this:

nearest_bigger_num <- function(num, vec) {
  # position(s) of the smallest value in vec that is at least num
  which(vec == min(vec[vec >= num]))
}

nearest_bigger_num(8, f$num)
## 3

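However, in your case you also need to take the group into account.

nearest_bigger_num_in_group <- function(num, df, group) {
  df <- df[df$Group == group, ]  # keep only rows of the requested group
  df <- df[df$num >= num, ]      # keep only values at least as large as num
  df$row[which.min(df$num)]      # row number of the smallest remaining value
}

nearest_bigger_num_in_group(8, f, "A")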
## 3
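
As a quick check against the three examples in the question, here is a self-contained sketch. It assumes the data frame f and the column names row, num and Group from the question. The helper name closest_in_group is made up for illustration, and returning NA when no value qualifies is my own choice, not something the question asks for.

```r
# Question's data, rebuilt so the snippet runs on its own
f <- data.frame(
  row   = 1:10,
  num   = c(3, 6, 12, 15, 16, 18, 20, 25, 27, 29),
  Group = c("B", "A", "A", "B", "A", "A", "B", "A", "B", "B")
)

# Hypothetical helper: row of the smallest value >= x within one group
closest_in_group <- function(x, df, group) {
  ok <- df$Group == group & df$num >= x
  if (!any(ok)) return(NA_integer_)  # assumption: NA when nothing qualifies
  candidates <- df[ok, ]
  candidates$row[which.min(candidates$num)]
}

closest_in_group(8, f, "A")   # 3
closest_in_group(18, f, "A")  # 6
closest_in_group(20, f, "A")  # 8
closest_in_group(30, f, "A")  # NA
```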

CodePudding user response:

Here are two ways to do it:

dplyr

library(dplyr)

df <-
  data.frame(
    row = 1:10,
    num = cumsum(rep(3,10)),
    group = c("B","A","A","B","A","A","B","A","B","B")
  )

df %>%
  filter(group == "A", num >= 8) %>%
  slice_min(order_by = num)

  row num group
1   3   9     A

Base R

candidates <- df[df$group == "A" & df$num >= 8, ]
candidates[which.min(candidates$num), ]

  row num group
3   3   9     A

CodePudding user response:

Use ifelse() to replace elements that don’t meet your conditions with NA, then use which.min() on the resulting vector:

which.min(ifelse(f$Group == "A" & f$num >= 8, f$num, NA))
# 3

which.min(ifelse(f$Group == "A" & f$num >= 18, f$num, NA))
# 6

Part of the reason your solution doesn’t work is that by subsetting, you change the element positions, so the position returned by which.min() doesn’t correspond to your original vector. By replacing with NAs instead, you preserve the original positions.
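
To see the position shift concretely, here is a small sketch using the question's data (f is rebuilt here so the snippet runs on its own). The variable names a_num, pos_in_subset, idx and pos_in_original are only illustrative. The last two lines show another possible way to map back to the original positions, by keeping the indices from which(), as an alternative to the NA approach.

```r
f <- data.frame(
  row   = 1:10,
  num   = c(3, 6, 12, 15, 16, 18, 20, 25, 27, 29),
  Group = c("B", "A", "A", "B", "A", "A", "B", "A", "B", "B")
)

# The attempt from the question: the result is a position inside the
# filtered vector c(12, 16, 18, 25), not inside f$num
a_num <- f$num[f$Group == "A"]
pos_in_subset <- which.min(a_num[a_num >= 8])
pos_in_subset    # 1

# Keep the original positions, then pick the one with the smallest value
idx <- which(f$Group == "A" & f$num >= 8)   # 3 5 6 8
pos_in_original <- idx[which.min(f$num[idx])]
pos_in_original  # 3
```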
