New to R, my apologies if there is an easy answer that I don't know of.
I have a data frame with 127.124 observations and 5 variables:
head(SortedDF)
       number Retention.time..min. Charge      m.z Group
102864   6947             12.58028      5 375.0021   Pro
68971   60641             23.36693      2 375.1373   Pro
75001  104156             24.54187      3 375.1540   Pro
87435  146322             22.69630      3 375.1540   Pro
82658   88256             22.32042      3 375.1541   Pro
113553  97971             14.54600      3 375.1566   Pro
...
I want to compare every row with the row underneath it (so basically row i vs. row i + 1) and see if they match. After reading about for loops and if-else statements, I came up with this code:
for (i in 1:dim(SortedDF))
if(abs(m.z[i]-m.z[i + 1])<0.01 | abs(Retention.time..min.[i]-Retention.time..min.[i + 1])<1 | (Charge[i]=Charge[i + 1]) | Group[i]!=Group[i + 1])
print("Match")
else
print("No match")
However, this code does not work: it only prints out the first function [1], and I'm not sure if i + 1 is a thing. Is there any way to solve this without using i + 1?
Answer:
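For reference, `i + 1` is valid R; the loop in the question trips over other things. `dim()` returns two numbers (rows and columns), so `nrow()` is the safer loop bound; the columns need a `SortedDF$` prefix unless the data frame is attached; `=` inside the condition assigns instead of comparing; and the loop has to stop one row before the end, because the last row has nothing underneath it. A minimal sketch of a repaired loop follows. It keeps the question's "or" logic as written (whether that is the intended rule is an assumption), and the small `SortedDF` here is a stand-in built from the rows shown in the question.

```r
# Stand-in for SortedDF, using the six rows printed in the question
SortedDF <- data.frame(
  Retention.time..min. = c(12.58028, 23.36693, 24.54187, 22.69630, 22.32042, 14.54600),
  Charge = c(5, 2, 3, 3, 3, 3),
  m.z = c(375.0021, 375.1373, 375.1540, 375.1540, 375.1541, 375.1566),
  Group = "Pro"
)

n <- nrow(SortedDF)          # nrow() gives a single number; dim() gives c(rows, cols)
result <- character(n - 1)   # one result per pair of neighbouring rows

for (i in seq_len(n - 1)) {  # stop at n - 1 so that i + 1 never runs past the last row
  is_match <-
    abs(SortedDF$m.z[i] - SortedDF$m.z[i + 1]) < 0.01 ||
    abs(SortedDF$Retention.time..min.[i] - SortedDF$Retention.time..min.[i + 1]) < 1 ||
    SortedDF$Charge[i] == SortedDF$Charge[i + 1] ||   # `==` compares, `=` would assign
    SortedDF$Group[i] != SortedDF$Group[i + 1]
  result[i] <- if (is_match) "Match" else "No match"
}

print(result)
```

On 127.124 rows a loop like this works but is slow to read and write; the vectorised approach below does the same comparison in one step.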
library(tidyverse)
data <- tibble(x = c(1, 1, 2), y = "a")
data
#> # A tibble: 3 × 2
#> x y
#> <dbl> <chr>
#> 1 1 a
#> 2 1 a
#> 3 2 a
same_rows <-
data %>%
# consider all columns
unite(col = "all") %>%
transmute(same_as_next_row = all == lead(all))
data %>%
bind_cols(same_rows)
#> # A tibble: 3 × 3
#> x y same_as_next_row
#> <dbl> <chr> <lgl>
#> 1 1 a TRUE
#> 2 1 a FALSE
#> 3 2 a NA
Created on 2022-03-30 by the reprex package (v2.0.0)
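The key piece above is `lead()`, which shifts a column up by one position so that each row can be compared with the one underneath it without any index arithmetic. As a rough illustration of what that shift does, here is a base R stand-in (the name `next_x` is made up for this sketch; it is not part of dplyr):

```r
x <- c(1, 1, 2)

# Base R equivalent of dplyr::lead(x) with the default offset of 1:
# drop the first element and pad the end with NA
next_x <- c(x[-1], NA)

next_x        # 1 2 NA
x == next_x   # TRUE FALSE NA
```

The last element is `NA` because the final row has no following row to compare against, which is why the last row of the output above shows `NA`.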
library(tidyverse)
data <- tibble::tribble(
~id, ~number, ~Retention.time..min., ~Charge, ~m.z, ~Group,
102864, 6947, 12.58028, 5, 375.0021, "Pro",
68971, 60641, 23.36693, 2, 375.1373, "Pro",
75001, 104156, 24.54187, 3, 375.1540, "Pro",
87435, 146322, 22.69630, 3, 375.1540, "Pro",
82658, 88256, 22.32042, 3, 375.1541, "Pro",
113553, 97971, 14.54600, 3, 375.1566, "Pro"
)
data %>%
mutate(
matches_with_next_row = (abs(m.z - lead(m.z)) < 0.01) |
(abs(Retention.time..min. - lead(Retention.time..min.)) < 1)
)
#> # A tibble: 6 × 7
#> id number Retention.time..min. Charge m.z Group matches_with_next_row
#> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <lgl>
#> 1 102864 6947 12.6 5 375. Pro FALSE
#> 2 68971 60641 23.4 2 375. Pro FALSE
#> 3 75001 104156 24.5 3 375. Pro TRUE
#> 4 87435 146322 22.7 3 375. Pro TRUE
#> 5 82658 88256 22.3 3 375. Pro TRUE
#> 6 113553 97971 14.5 3 375. Pro NA
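The example above checks only m.z and retention time, joined with "or". If a match should instead require all criteria at once, the conditions can be joined with `&`. The sketch below is one possible reading, not necessarily what the question intends: it assumes a match needs close m.z, close retention time and equal charge. It uses base R only, and `next_row()` is a hypothetical helper defined here, not a function from any package.

```r
# Same six rows as in the example above
data <- data.frame(
  Retention.time..min. = c(12.58028, 23.36693, 24.54187, 22.69630, 22.32042, 14.54600),
  Charge = c(5, 2, 3, 3, 3, 3),
  m.z = c(375.0021, 375.1373, 375.1540, 375.1540, 375.1541, 375.1566),
  Group = "Pro"
)

# Hypothetical helper: the value from the next row, NA for the last row
next_row <- function(x) c(x[-1], NA)

# Assumption: all three conditions must hold for a match.
# A Group condition could be added in the same way, e.g. `& Group != next_row(Group)`.
data$matches_with_next_row <- with(data,
  abs(m.z - next_row(m.z)) < 0.01 &
  abs(Retention.time..min. - next_row(Retention.time..min.)) < 1 &
  Charge == next_row(Charge)
)

data$matches_with_next_row
```

With these rows, only the fourth row matches the one below it; the third row fails because the retention times differ by more than 1 minute.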