I have an Excel formula that I am trying to convert to R:
QTL_interval = IF(OR(D3-D2>1000000,C3<>C2),E2+1,E2)
D3 -> Maximum Position
D2 -> Minimum position
C3 -> Chromosome Number
C2 -> Chromosome Number
CodePudding user response:
I simulated the table based on the linked image:
df <- data.frame("CHROM" = c("1A", "4A", "5A", "5A", "5A"),
"POS" = c(560469514, 2846288687, 568234305, 568464107, 568465833),
"QTL interval" = c(0, 0, 0, 0, 0))
Proposed Solution:
for (i in 1:nrow(df)) {
  # Compare row i with row i + 1; add 1 if the gap exceeds 1 Mb or the chromosome changes
  df$QTL.interval[i] <- ifelse((df$POS[i + 1] - df$POS[i] > 1000000) | (df$CHROM[i + 1] != df$CHROM[i]), df$QTL.interval[i] + 1, df$QTL.interval[i])
}
The last row will end up being NA, since the last iteration has no row i + 1 to compare against:
CHROM POS QTL.interval
1 1A 560469514 1
2 4A 2846288687 1
3 5A 568234305 0
4 5A 568464107 0
5 5A 568465833 NA
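Note that the loop compares each row with the next one and does not carry the counter forward, whereas the Excel formula compares each row with the previous one and builds on the previous result (E2). A minimal base R sketch of that cumulative behaviour might look like the following. It assumes the first row starts at 0; `start_value` is a hypothetical name standing in for whatever is in the first cell of the Excel column.

```r
df <- data.frame(
  CHROM = c("1A", "4A", "5A", "5A", "5A"),
  POS   = c(560469514, 2846288687, 568234305, 568464107, 568465833)
)

# For rows 2..n: TRUE if the position jumps by more than 1 Mb from the previous row
gap <- diff(df$POS) > 1000000
# For rows 2..n: TRUE if the chromosome differs from the previous row
new_chrom <- df$CHROM[-1] != df$CHROM[-nrow(df)]

# Assumption: the first row's value (Excel's first E cell) is 0
start_value <- 0

# Row 1 keeps start_value; every later row adds the number of breaks seen so far
df$QTL_interval <- start_value + c(0, cumsum(gap | new_chrom))
```

With the simulated table this gives 0, 1, 2, 2, 2 and no NA in the last row, because every comparison looks backwards rather than forwards.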
CodePudding user response:
Here is an answer using the tidyverse.
library(tidyverse)
df <- tibble("CHROM" = c("1A", "4A", "5A", "5A", "5A"),
"POS" = c(560469514, 2846288687, 568234305, 568464107, 568465833))
df %>%
mutate(QTL_interval = cumsum(POS - lag(POS) > 1000000 | CHROM != lag(CHROM) | row_number() == 1))
The | is R's OR operator. I added a third condition to check whether you're at the first row. Otherwise the condition for the first row is NA, because lag() has no previous value there, and the cumulative sum doesn't work.
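To see why the first-row check matters, here is a small base R sketch using the first three positions from the table. It uses `c(NA, diff(pos))` as a stand-in for `POS - lag(POS)` so that it runs without loading any packages; the variable names are only illustrative.

```r
pos <- c(560469514, 2846288687, 568234305)

# Difference from the previous row; the first element is NA, as with POS - lag(POS)
step <- c(NA, diff(pos))

# Without the guard, the leading NA propagates through the whole cumulative sum
without_guard <- cumsum(step > 1000000)

# With the guard, NA | TRUE evaluates to TRUE, so row 1 starts interval 1
with_guard <- cumsum(step > 1000000 | seq_along(pos) == 1)

print(without_guard)  # NA NA NA
print(with_guard)     # 1 2 2
```

This works because `NA | TRUE` is `TRUE` in R: the result is known regardless of the missing value.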