Home > Net >  Compare values of a row with a value and return the smaller value
Compare values of a row with a value and return the smaller value

Time:03-31

I have a dataframe with several rows and column and I want to check for each row if the row values are higher than the value in the last column of the row. My data looks as follows:

        date      ABI.BR      NOVN.S       ROG.S      NESN.S   NOVOb.CO    HRMS.PA      TTEF.PA     OREP.PA    SASY.PA    LVMH.PA     MAX 
1 31.12.2001  7342987743 21072681660 19737660401 57324018814 3199830778 1226900000 105895000000 13740400000 6.4880e 09 1.2229e 10 163033294916
2 31.12.2002  6280730128 17748921146 20316428627 61501808862 3350848339 1242300000  92108991318 14288000000 7.4480e 09 1.2693e 10 144604417185
3 31.12.2003  6324404232 19755193920 20002601887 56367998444 3510991763 1230000000  93961037997 14029100000 8.0480e 09 1.1962e 10 122995279989
4 31.12.2004  7850834622 20947725570 19134043533 54889985325 3905732023 1331400000  87346032957 13641300000 1.5733e 10 1.2481e 10 123515812436
5 31.12.2005 12233279545 26451714210 22833829836 58587603997 4527870421 1427400000 122854000000 14532500000 2.8513e 10 1.3910e 10 171566971202
6 31.12.2006 12660668341 26602920050 26148301223 61238063838 5198638861 1514900000 126235000000 15790100000 2.9489e 10 1.5306e 10 173975809100

My aim is now to check if the values in each row of the columns 2:10 are higher than the corresponding value in column 11 (i.e. MAX). If in a row a value from a column is higher than the value in the MAX column I want to overright the actual value in the column with the value from the MAX column.

For example if I have the following data:

a b c MAX
2 3 4 3
4 5 6 6
6 7 9 8

I would like to have in the corrected data the following output

a b c MAX
2 3 **3** 3
4 5 6 6
6 7 **8** 8

Thank you very much for your help!

CodePudding user response:

Using your short example and the library dplyr, you could do:

Reprex

  • Data
df <- structure(list(a = c(2L, 4L, 6L), b = c(3L, 5L, 7L), c = c(4L, 
6L, 9L), MAX = c(3L, 6L, 8L)), class = "data.frame", row.names = c(NA, 
-3L))
  • Code
library(dplyr)

df %>% transmute(across(.cols = 1:3, ~ ifelse(.x > MAX, MAX, .x)))
  • Output
#>   a b c
#> 1 2 3 3
#> 2 4 5 6
#> 3 6 7 8

Created on 2022-03-31 by the reprex package (v2.0.1)


EDIT: With your real data and following @langtang's comment to keep first and last columns, you could do:

Reprex

  • Data
structure(list(date = c("31.12.2001", "31.12.2002", "31.12.2003", 
"31.12.2004", "31.12.2005", "31.12.2006"), ABI.BR = c(7342987743, 
6280730128, 6324404232, 7850834622, 12233279545, 12660668341), 
    NOVN.S = c(21072681660, 17748921146, 19755193920, 20947725570, 
    26451714210, 26602920050), ROG.S = c(19737660401, 20316428627, 
    20002601887, 19134043533, 22833829836, 26148301223), NESN.S = c(57324018814, 
    61501808862, 56367998444, 54889985325, 58587603997, 61238063838
    ), NOVOb.CO = c(3199830778, 3350848339, 3510991763, 3905732023, 
    4527870421, 5198638861), HRMS.PA = c(1226900000L, 1242300000L, 
    1230000000L, 1331400000L, 1427400000L, 1514900000L), TTEF.PA = c(1.05895e 11, 
    92108991318, 93961037997, 87346032957, 1.22854e 11, 1.26235e 11
    ), OREP.PA = c(13740400000, 1.4288e 10, 14029100000, 13641300000, 
    14532500000, 15790100000), SASY.PA = c(6.488e 09, 7.448e 09, 
    8.048e 09, 1.5733e 10, 2.8513e 10, 2.9489e 10), LVMH.PA = c(1.2229e 10, 
    1.2693e 10, 1.1962e 10, 1.2481e 10, 1.391e 10, 1.5306e 10
    ), MAX = c(163033294916, 144604417185, 122995279989, 123515812436, 
    171566971202, 173975809100)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6"))
  • Code
library(dplyr)

df %>% mutate(across(.cols = 2:10, ~ ifelse(.x > MAX, MAX, .x)))
  • Output
#>         date      ABI.BR      NOVN.S       ROG.S      NESN.S   NOVOb.CO
#> 1 31.12.2001  7342987743 21072681660 19737660401 57324018814 3199830778
#> 2 31.12.2002  6280730128 17748921146 20316428627 61501808862 3350848339
#> 3 31.12.2003  6324404232 19755193920 20002601887 56367998444 3510991763
#> 4 31.12.2004  7850834622 20947725570 19134043533 54889985325 3905732023
#> 5 31.12.2005 12233279545 26451714210 22833829836 58587603997 4527870421
#> 6 31.12.2006 12660668341 26602920050 26148301223 61238063838 5198638861
#>      HRMS.PA      TTEF.PA     OREP.PA    SASY.PA    LVMH.PA          MAX
#> 1 1226900000 105895000000 13740400000 6.4880e 09 1.2229e 10 163033294916
#> 2 1242300000  92108991318 14288000000 7.4480e 09 1.2693e 10 144604417185
#> 3 1230000000  93961037997 14029100000 8.0480e 09 1.1962e 10 122995279989
#> 4 1331400000  87346032957 13641300000 1.5733e 10 1.2481e 10 123515812436
#> 5 1427400000 122854000000 14532500000 2.8513e 10 1.3910e 10 171566971202
#> 6 1514900000 126235000000 15790100000 2.9489e 10 1.5306e 10 173975809100

Created on 2022-03-31 by the reprex package (v2.0.1)

CodePudding user response:

cols = c('a', 'b', 'c')
df[cols] = lapply(df[cols], \(x) ifelse(x > df$MAX, df$MAX, x))

#   a b c MAX
# 1 2 3 3   3
# 2 4 5 6   6
# 3 6 7 8   8
  • Related