Keep the row if the specific column is the minimum value of that row

I cannot share the dataset, but I will explain it as best I can. The dataset has 50 columns, 48 of which are date-times in Y/m/d h:m:s format. The data also contains many NA values, and these must not be removed.

Let's say there is a column B. I want to remove every row in which the value of B is not the earliest value in that row.

How can I do this in R? For example, the original would be like this:

df <- data.frame(
  A = c(11,19,17,6,13),
  B = c(18,9,5,16,12),
  C = c(14,15,8,87,16))

   A  B  C
1 11 18 14
2 19  9 15
3 17  5  8
4  6 16 87
5 13 12 16

but I want this:

   A  B  C
2 19  9 15
3 17  5  8
5 13 12 16

CodePudding user response：

You could use apply() to find the minimum for each row.

df |> subset(B == apply(df, 1, min, na.rm = TRUE))

#    A  B  C
# 2 19  9 15
# 3 17  5  8
# 5 13 12 16

The tidyverse equivalent is

library(tidyverse)

df %>% filter(B == pmap(across(A:C), min, na.rm = TRUE))
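Since the real data holds date-times with many NA values, a vectorised variant built on pmin() may also be worth trying. apply() converts the data frame to a matrix first, which for date-time columns means a character matrix, whereas pmin() works on the columns directly. The following is only a minimal sketch: it assumes the columns are POSIXct, the sample values are made up, and the names df2, row_min and kept are hypothetical.

```r
# Hypothetical sample data: three date-time columns with some NA values
df2 <- data.frame(
  A = as.POSIXct(c("2022-01-03 10:00:00", NA, "2022-01-05 08:30:00"), tz = "UTC"),
  B = as.POSIXct(c("2022-01-02 09:00:00", "2022-01-04 12:00:00", NA), tz = "UTC"),
  C = as.POSIXct(c(NA, "2022-01-01 00:00:00", "2022-01-06 07:00:00"), tz = "UTC")
)

# Row-wise minimum across all columns, ignoring NA values
row_min <- do.call(pmin, c(df2, na.rm = TRUE))

# Keep the rows where B is present and equals the row minimum.
# Rows where B itself is NA are dropped here; adjust if they should be kept.
kept <- df2[!is.na(df2$B) & df2$B == row_min, ]
```

With this sample only the first row is kept: B is the earliest value there, the second row has an earlier value in C, and the third row has no B at all. To restrict the comparison to the date-time columns, pass only those columns to pmin() instead of the whole data frame.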

CodePudding user response：

If you are willing to use data.table, you could do the following.

library(data.table)

df <- as.data.table(df)

df[B < A & B < C]

#     A  B  C
# 1: 19  9 15
# 2: 17  5  8
# 3: 13 12 16
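Note that writing out B < A & B < C drops any row where A or C is NA, and it would need one comparison per column for the 48 date-time columns in the question. A possible way around both points in data.table is to compute the row minimum with pmin() over .SD. This is a minimal sketch under assumptions: the sample repeats the question's values with one NA added, and the names dt, row_min and res are hypothetical.

```r
library(data.table)

# Hypothetical sample: the question's values, with an NA placed in column A
dt <- data.table(
  A = c(11, NA, 17, 6, 13),
  B = c(18, 9, 5, 16, 12),
  C = c(14, 15, 8, 87, 16)
)

# Row-wise minimum over every column, ignoring NA values
dt[, row_min := do.call(pmin, c(.SD, na.rm = TRUE))]

# Keep rows where B equals that minimum, then drop the helper column
res <- dt[!is.na(B) & B == row_min][, row_min := NULL]
```

Here the row with the NA in A is still kept, because B is the smallest of the remaining values. If only some columns should take part in the comparison, .SDcols can be used to select them.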