Home > database >  Create a column that returns the min/max of certain rows
Create a column that returns the min/max of certain rows

Time:08-26

I have data like these:

col1  col2   col3  col4   col5
1      3      1     7      3
4      2      8     2      5
3      1      5     1      4

I want to add columns that show the minimum and maximum by row, but only for certain columns (2 - 4, for example):

col1  col2   col3  col4   col5   min   max
1      3      1     7      3      1    7
1      2      8     2      5      2    8
9      1      5     1      0      1    5

I know I could use select to subset those rows and then calculate the min/max and use cbind to merge with the original data, but I feel like there is a better approach. Thanks!

Data
df <- structure(list(col1 = c(1L, 4L, 3L), col2 = 3:1, col3 = c(1L, 8L, 5L),
  col4 = c(7L, 2L, 1L), col5 = c(3L, 5L, 4L)), 

class = "data.frame", row.names = c(NA, -3L))

CodePudding user response:

We could use pmin/pmax after selecting the columns

df$min <- do.call(pmin, c(df[2:4], na.rm = TRUE))
df$max <- do.call(pmax, c(df[2:4], na.rm = TRUE))

-output

> df
  col1 col2 col3 col4 col5 min max
1    1    3    1    7    3   1   7
2    4    2    8    2    5   2   8
3    3    1    5    1    4   1   5

Or using tidyverse, we can do

library(dplyr)
df %>%
    mutate(min = exec(pmin, !!! rlang::syms(names(.)[2:4]), na.rm = TRUE),
     max = exec(pmax, !!! rlang::syms(names(.)[2:4]), na.rm  =TRUE))

-output

   col1 col2 col3 col4 col5 min max
1    1    3    1    7    3   1   7
2    4    2    8    2    5   2   8
3    3    1    5    1    4   1   5

data

df <- structure(list(col1 = c(1L, 4L, 3L), col2 = 3:1, col3 = c(1L, 
8L, 5L), col4 = c(7L, 2L, 1L), col5 = c(3L, 5L, 4L)),
 class = "data.frame", row.names = c(NA, 
-3L))

CodePudding user response:

you can create a trigger in the database that will do this work when inserting or changing data

CodePudding user response:

With dplyr, you could use across pmin/pmax:

library(dplyr)

df %>%
  mutate(min = do.call(pmin, c(across(col2:col4), na.rm = TRUE)),
         max = do.call(pmax, c(across(col2:col4), na.rm = TRUE)))

# # A tibble: 3 × 7
#    col1  col2  col3  col4  col5   min   max
#   <int> <int> <int> <int> <int> <int> <int>
# 1     1     3     1     7     3     1     7
# 2     4     2     8     2     5     2     8
# 3     3     1     5     1     4     1     5

or c_across min/max:

df %>%
  rowwise() %>%
  mutate(min = min(c_across(col2:col4), na.rm = TRUE),
         max = max(c_across(col2:col4), na.rm = TRUE)) %>%
  ungroup()

CodePudding user response:

because you tagged the question with dplyr here is a dplyr solution

library(dplyr)
mt2 <- mtcars %>%
      mutate(pmax  = pmax(cyl,carb),
             pmin  = pmin(cyl,carb))

CodePudding user response:

Here is one with rowwise() combined with c_across():

library(dplyr)

df %>% 
  rowwise() %>% 
  mutate(min = min(c_across(col1:col5)),
         max = max(c_across(col1:col5)))

   col1  col2  col3  col4  col5   min   max
  <int> <int> <int> <int> <int> <int> <int>
1     1     3     1     7     3     1     7
2     4     2     8     2     5     2     8
3     3     1     5     1     4     1     5
  • Related