For function over multiple rows (i+1)?

Time:03-30

New to R, my apologies if there is an easy answer that I don't know of.

I have a data frame with 127,124 observations and 5 variables:

head(SortedDF)

       number Retention.time..min. Charge      m.z Group
102864   6947             12.58028      5 375.0021 Pro
68971   60641             23.36693      2 375.1373 Pro
75001  104156             24.54187      3 375.1540 Pro
87435  146322             22.69630      3 375.1540 Pro
82658   88256             22.32042      3 375.1541 Pro
113553  97971             14.54600      3 375.1566 Pro
...

I want to compare every row with the row underneath it (so basically row i vs. row i+1) and see if they match. After reading about for loops and if-else statements, I came up with this code:

for (i in 1:dim(SortedDF)) 
  if(abs(m.z[i]-m.z[i+1])<0.01 | abs(Retention.time..min.[i]-Retention.time..min.[i+1])<1 | (Charge[i]=Charge[i+1]) | Group[i]!=Group[i+1]) 
    print("Match")
  else
    print("No match")

However, this code does not work: it only prints out the first [1], and I'm not sure if i+1 is a thing. Is there any way to solve this without using i+1?

CodePudding user response:

library(tidyverse)

data <- tibble(x = c(1, 1, 2), y = "a")
data
#> # A tibble: 3 × 2
#>       x y    
#>   <dbl> <chr>
#> 1     1 a    
#> 2     1 a    
#> 3     2 a

same_rows <-
  data %>%
  # consider all columns
  unite(col = "all") %>%
  transmute(same_as_next_row = all == lead(all))

data %>%
  bind_cols(same_rows)
#> # A tibble: 3 × 3
#>       x y     same_as_next_row
#>   <dbl> <chr> <lgl>           
#> 1     1 a     TRUE            
#> 2     1 a     FALSE           
#> 3     2 a     NA

Created on 2022-03-30 by the reprex package (v2.0.0)
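
If you would rather not load the tidyverse, the same "compare with the next row" idea can be sketched in base R. This is only an illustration: next_value is a hypothetical helper name (not a base R function) that shifts a vector up by one position and pads the end with NA, which is roughly what lead() does in the example above.

```r
# Toy data mirroring the example above
data <- data.frame(x = c(1, 1, 2), y = "a")

# Hypothetical helper: shift a vector up by one and pad the end with NA
next_value <- function(v) c(v[-1], NA)

# Collapse each row to a single string, then compare it with the following row
row_key <- do.call(paste, c(data, sep = "_"))
data$same_as_next_row <- row_key == next_value(row_key)
data
```

As in the tidyverse version, the last row gets NA because there is no row below it to compare against.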

library(tidyverse)

data <- tibble::tribble(
  ~id, ~number, ~Retention.time..min., ~Charge, ~m.z, ~Group,
  102864, 6947, 12.58028, 5, 375.0021, "Pro",
  68971, 60641, 23.36693, 2, 375.1373, "Pro",
  75001, 104156, 24.54187, 3, 375.1540, "Pro",
  87435, 146322, 22.69630, 3, 375.1540, "Pro",
  82658, 88256, 22.32042, 3, 375.1541, "Pro",
  113553, 97971, 14.54600, 3, 375.1566, "Pro"
)

data %>%
  mutate(
    matches_with_next_row = (abs(m.z - lead(m.z)) < 0.01) |
      (abs(Retention.time..min. - lead(Retention.time..min.)) < 1)
  )
#> # A tibble: 6 × 7
#>       id number Retention.time..min. Charge   m.z Group matches_with_next_row
#>    <dbl>  <dbl>                <dbl>  <dbl> <dbl> <chr> <lgl>                
#> 1 102864   6947                 12.6      5  375. Pro   FALSE                
#> 2  68971  60641                 23.4      2  375. Pro   FALSE                
#> 3  75001 104156                 24.5      3  375. Pro   TRUE                 
#> 4  87435 146322                 22.7      3  375. Pro   TRUE                 
#> 5  82658  88256                 22.3      3  375. Pro   TRUE                 
#> 6 113553  97971                 14.5      3  375. Pro   NA

Created on 2022-03-30 by the reprex package (v2.0.0)
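
To answer the "is i+1 a thing" part: yes, indexing with [i + 1] works in a plain loop too. Below is a minimal sketch of the loop from the question, assuming the four conditions really are meant to be combined with "or" exactly as written (if you meant "and", swap || for &&). The changes are: the loop stops at the second-to-last row, the columns are accessed through SortedDF$, and the Charge comparison uses == instead of =. Only the six rows shown in the question are used as sample data.

```r
# First rows of the data shown in the question
SortedDF <- data.frame(
  number = c(6947, 60641, 104156, 146322, 88256, 97971),
  Retention.time..min. = c(12.58028, 23.36693, 24.54187, 22.69630, 22.32042, 14.54600),
  Charge = c(5, 2, 3, 3, 3, 3),
  m.z = c(375.0021, 375.1373, 375.1540, 375.1540, 375.1541, 375.1566),
  Group = "Pro"
)

n <- nrow(SortedDF)
result <- rep(NA_character_, n)  # the last row has no row below it, so it stays NA

# Stop at n - 1 so that i + 1 never runs past the last row
for (i in seq_len(n - 1)) {
  is_match <-
    abs(SortedDF$m.z[i] - SortedDF$m.z[i + 1]) < 0.01 ||
    abs(SortedDF$Retention.time..min.[i] - SortedDF$Retention.time..min.[i + 1]) < 1 ||
    SortedDF$Charge[i] == SortedDF$Charge[i + 1] ||   # == compares, = would assign
    SortedDF$Group[i] != SortedDF$Group[i + 1]
  result[i] <- if (is_match) "Match" else "No match"
}
result
```

On the full data frame the vectorised lead() version above will likely be much faster than a loop, so treat this mainly as an explanation of what was going wrong.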
