How to sum values in a row that are less/equal to the last value (for multiple rows)?

Time:03-19

I'm trying to calculate the percentile rank for 2021, relative to the 1981-2021 climatology for June/July/August average temperatures. My data looks like below but much larger:

1981 1982 1983 2021
22 34 40 42
25 36 32 33

For each row, I need to count the values that are lower than or equal to the 2021 value. For the top row the result would be 3, as all three earlier values are less than 42, while for the second row it would be 2.

I have written some code which tries to do this, but the problem I'm having is getting the function to use the 2021 value for each row and not just for one value.

L <- apply(Temperature_df[, 3:50], 1, function(x) {
    sum(x <= Temperature_df[, 50])  # column 50 is always 2021
})

# The function works if I compare each row to a single 2021 value,
# e.g. sum(x <= Temperature_df[1, 50]), but I need to compare the values
# of each row against the 2021 value for that row.

Any ideas would be greatly appreciated.

Answer:

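A simple rowSums will do it. Pay attention to the way the 2021 column is extracted, with double brackets [[. This returns a plain vector rather than a one-column data frame, so the comparison is recycled down each column and every row is compared against its own 2021 value.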

Temperature_df <- read.table(text = "
1981    1982    1983    2021
22  34  40  42
25  36  32  33
", header = TRUE, check.names = FALSE)

col2021 <- 4
Temperature_df[-col2021] <= Temperature_df[[col2021]]
#>      1981  1982 1983
#> [1,] TRUE  TRUE TRUE
#> [2,] TRUE FALSE TRUE

rowSums(Temperature_df[-col2021] <= Temperature_df[[col2021]])
#> [1] 3 2

Created on 2022-03-18 by the reprex package (v2.0.1)
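
If you would rather keep the apply() approach from the question, a minimal sketch is to take the reference value from inside each row instead of from the whole 2021 column. The helper name count_le_last is hypothetical, and the sketch assumes the 2021 column is the last column of whatever is passed to apply(). The percentage at the end is only one possible percentile-rank convention (share of earlier years at or below 2021), not necessarily the definition you need.

```r
# Toy data from the question; the real data frame is much larger.
Temperature_df <- read.table(text = "
1981 1982 1983 2021
22 34 40 42
25 36 32 33
", header = TRUE, check.names = FALSE)

# Hypothetical helper: count the values in one row that are <= that row's
# last element (assumed here to be the 2021 column).
count_le_last <- function(x) {
  ref <- x[length(x)]          # this row's own 2021 value
  sum(x[-length(x)] <= ref)    # compare the earlier years only
}

# apply() with MARGIN = 1 passes each row to the helper as a vector.
counts <- apply(Temperature_df, 1, count_le_last)

# Assumed convention: percentage of earlier years at or below the 2021 value.
pct_rank <- 100 * counts / (ncol(Temperature_df) - 1)
```

On the toy data, counts matches the rowSums result above. For the real data you would subset the year columns first, for example Temperature_df[, 3:50], so that 2021 is still the last column passed in.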
