I am doing some spectral analysis and want to isolate spectral peaks that are within +/-1 m/z of each other. I wrote an R script that combines all of my spectral information into one data frame containing all possible peptides, but the data frame is very large, and I am looking for a way to streamline finding tandem peaks. I have no experience with R or any other programming language, so any help would be appreciated.
I tried adding a column that holds the difference in m/z between adjacent rows and then filtering for rows where that difference is 1, but this causes me to miss the first/last peak in the tandem spectra.
Here is an example of what a portion of my data frame looks like, where the diff column is computed with
mutate(diff = Mz_Round - lag(Mz_Round))
and the result is then filtered to only include rows where diff == 1.
    precursorMz Mz_Round HW Intensity Reg Intensity diff
136    256.6814   251.15         2108          2305   NA
137    256.6814   255.18         6491          3910   NA
138    256.6814   255.68         2292          1114   NA
139    256.6814   260.20        43010         23230   NA
140    256.6814   261.20         9452          6388    1
141    256.6814   262.19         6440          3487   NA
For this specific example, I want to extract rows 139 and 140 because they are within 1 m/z unit of each other. If I filter solely on rows that have a diff value of 1, row 139 is missing.
Answer:
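# Example data: the Mz_Round values from the question
data <-
  data.frame(
    Mz_Round = c(
      251.15,
      255.18,
      255.68,
      260.20,
      261.20,
      262.19
    )
  )

data |>
  dplyr::mutate(
    # Gap to the previous peak and gap to the next peak
    diff_1 = Mz_Round - dplyr::lag(Mz_Round),
    diff_2 = dplyr::lead(Mz_Round) - Mz_Round
  ) |>
  # Keep a row if either neighbour is 1 away, so both ends of a pair survive.
  # near() compares with a tiny tolerance, which is safer than == for decimals.
  dplyr::filter(
    dplyr::near(diff_1, 1) | dplyr::near(diff_2, 1)
  )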
#>   Mz_Round diff_1 diff_2
#> 1    260.2   4.52   1.00
#> 2    261.2   1.00   0.99
Created on 2022-11-08 with reprex v2.0.2
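The filter above keeps only gaps of 1. If peaks spaced approximately 1 m/z apart should also count, one option is to compare each gap against 1 with an explicit tolerance. The sketch below is only an illustration in base R: `find_spaced_peaks()`, `spacing`, and `tol` are hypothetical names, the tolerance values are assumptions to tune for your data, and the m/z values are assumed to be sorted in ascending order.

```r
# Hypothetical helper: TRUE for every peak whose gap to the previous or the
# next peak equals `spacing`, give or take `tol`. Assumes `mz` is sorted.
find_spaced_peaks <- function(mz, spacing = 1, tol = 0.005) {
  gap_prev <- c(NA, diff(mz))  # distance to the previous peak
  gap_next <- c(diff(mz), NA)  # distance to the next peak
  hit_prev <- !is.na(gap_prev) & abs(gap_prev - spacing) <= tol
  hit_next <- !is.na(gap_next) & abs(gap_next - spacing) <= tol
  hit_prev | hit_next
}

mz <- c(251.15, 255.18, 255.68, 260.20, 261.20, 262.19)

strict <- mz[find_spaced_peaks(mz)]              # gap of 1.00 only
loose  <- mz[find_spaced_peaks(mz, tol = 0.02)]  # a gap of 0.99 also counts

print(strict)  # 260.2 261.2
print(loose)   # 260.20 261.20 262.19
```

To apply this to a data frame, index its rows with the logical vector, for example `data[find_spaced_peaks(data$Mz_Round), ]`. If the data frame holds several precursors, the helper would need to be run separately for each precursorMz value so that gaps are not measured across different spectra.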