I have a dataframe
with several rows and column and I want to check for each row if the row values are higher than the value in the last column of the row. My data looks as follows:
date ABI.BR NOVN.S ROG.S NESN.S NOVOb.CO HRMS.PA TTEF.PA OREP.PA SASY.PA LVMH.PA MAX
1 31.12.2001 7342987743 21072681660 19737660401 57324018814 3199830778 1226900000 105895000000 13740400000 6.4880e 09 1.2229e 10 163033294916
2 31.12.2002 6280730128 17748921146 20316428627 61501808862 3350848339 1242300000 92108991318 14288000000 7.4480e 09 1.2693e 10 144604417185
3 31.12.2003 6324404232 19755193920 20002601887 56367998444 3510991763 1230000000 93961037997 14029100000 8.0480e 09 1.1962e 10 122995279989
4 31.12.2004 7850834622 20947725570 19134043533 54889985325 3905732023 1331400000 87346032957 13641300000 1.5733e 10 1.2481e 10 123515812436
5 31.12.2005 12233279545 26451714210 22833829836 58587603997 4527870421 1427400000 122854000000 14532500000 2.8513e 10 1.3910e 10 171566971202
6 31.12.2006 12660668341 26602920050 26148301223 61238063838 5198638861 1514900000 126235000000 15790100000 2.9489e 10 1.5306e 10 173975809100
My aim is now to check if the values in each row of the columns 2:10
are higher than the corresponding value in column 11
(i.e. MAX
). If in a row a value from a column is higher than the value in the MAX
column I want to overright the actual value in the column with the value from the MAX
column.
For example if I have the following data:
a b c MAX
2 3 4 3
4 5 6 6
6 7 9 8
I would like to have in the corrected data the following output
a b c MAX
2 3 **3** 3
4 5 6 6
6 7 **8** 8
Thank you very much for your help!
CodePudding user response:
Using your short example and the library dplyr
, you could do:
Reprex
- Data
df <- structure(list(a = c(2L, 4L, 6L), b = c(3L, 5L, 7L), c = c(4L,
6L, 9L), MAX = c(3L, 6L, 8L)), class = "data.frame", row.names = c(NA,
-3L))
- Code
library(dplyr)
df %>% transmute(across(.cols = 1:3, ~ ifelse(.x > MAX, MAX, .x)))
- Output
#> a b c
#> 1 2 3 3
#> 2 4 5 6
#> 3 6 7 8
Created on 2022-03-31 by the reprex package (v2.0.1)
EDIT: With your real data and following @langtang's comment to keep first and last columns, you could do:
Reprex
- Data
structure(list(date = c("31.12.2001", "31.12.2002", "31.12.2003",
"31.12.2004", "31.12.2005", "31.12.2006"), ABI.BR = c(7342987743,
6280730128, 6324404232, 7850834622, 12233279545, 12660668341),
NOVN.S = c(21072681660, 17748921146, 19755193920, 20947725570,
26451714210, 26602920050), ROG.S = c(19737660401, 20316428627,
20002601887, 19134043533, 22833829836, 26148301223), NESN.S = c(57324018814,
61501808862, 56367998444, 54889985325, 58587603997, 61238063838
), NOVOb.CO = c(3199830778, 3350848339, 3510991763, 3905732023,
4527870421, 5198638861), HRMS.PA = c(1226900000L, 1242300000L,
1230000000L, 1331400000L, 1427400000L, 1514900000L), TTEF.PA = c(1.05895e 11,
92108991318, 93961037997, 87346032957, 1.22854e 11, 1.26235e 11
), OREP.PA = c(13740400000, 1.4288e 10, 14029100000, 13641300000,
14532500000, 15790100000), SASY.PA = c(6.488e 09, 7.448e 09,
8.048e 09, 1.5733e 10, 2.8513e 10, 2.9489e 10), LVMH.PA = c(1.2229e 10,
1.2693e 10, 1.1962e 10, 1.2481e 10, 1.391e 10, 1.5306e 10
), MAX = c(163033294916, 144604417185, 122995279989, 123515812436,
171566971202, 173975809100)), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))
- Code
library(dplyr)
df %>% mutate(across(.cols = 2:10, ~ ifelse(.x > MAX, MAX, .x)))
- Output
#> date ABI.BR NOVN.S ROG.S NESN.S NOVOb.CO
#> 1 31.12.2001 7342987743 21072681660 19737660401 57324018814 3199830778
#> 2 31.12.2002 6280730128 17748921146 20316428627 61501808862 3350848339
#> 3 31.12.2003 6324404232 19755193920 20002601887 56367998444 3510991763
#> 4 31.12.2004 7850834622 20947725570 19134043533 54889985325 3905732023
#> 5 31.12.2005 12233279545 26451714210 22833829836 58587603997 4527870421
#> 6 31.12.2006 12660668341 26602920050 26148301223 61238063838 5198638861
#> HRMS.PA TTEF.PA OREP.PA SASY.PA LVMH.PA MAX
#> 1 1226900000 105895000000 13740400000 6.4880e 09 1.2229e 10 163033294916
#> 2 1242300000 92108991318 14288000000 7.4480e 09 1.2693e 10 144604417185
#> 3 1230000000 93961037997 14029100000 8.0480e 09 1.1962e 10 122995279989
#> 4 1331400000 87346032957 13641300000 1.5733e 10 1.2481e 10 123515812436
#> 5 1427400000 122854000000 14532500000 2.8513e 10 1.3910e 10 171566971202
#> 6 1514900000 126235000000 15790100000 2.9489e 10 1.5306e 10 173975809100
Created on 2022-03-31 by the reprex package (v2.0.1)
CodePudding user response:
cols = c('a', 'b', 'c')
df[cols] = lapply(df[cols], \(x) ifelse(x > df$MAX, df$MAX, x))
# a b c MAX
# 1 2 3 3 3
# 2 4 5 6 6
# 3 6 7 8 8