I have standard output from the BLAST tool that looks like the table below. To make the table easier to read, I have reduced the number of columns.
query | subject | startinsubject | endinsubject |
---|---|---|---|
1 | SRR | 50 | 100 |
1 | SRR | 500 | 450 |
I need to create another column, called "strand". When the query is forward, as in the first row (startinsubject is less than endinsubject), the new column should contain F. When the query is in reverse, as in the second row (startinsubject is greater than endinsubject), the new column should contain R.
I would like to get a new table like the one below. Could anyone help me? Many thanks.
query | subject | startinsubject | endinsubject | strand |
---|---|---|---|---|
1 | SRR | 50 | 100 | F |
1 | SRR | 500 | 450 | R |
CodePudding user response:
One option is ifelse. You can use the following code:
df <- data.frame(query = c(1, 1),
                 subject = c("SRR", "SRR"),
                 startinsubject = c(50, 500),
                 endinsubject = c(100, 450))
library(dplyr)
df %>%
  mutate(strand = ifelse(startinsubject > endinsubject, "R", "F"))
Output:
query subject startinsubject endinsubject strand
1 1 SRR 50 100 F
2 1 SRR 500 450 R
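If you would rather not load dplyr, the same comparison can be sketched in base R. This is only a minimal sketch: it assumes the data frame is called df and has the same column names as the example above, and it assigns the result back to the data frame so the new column is kept.

```r
# Same example data as in the answer above
df <- data.frame(query = c(1, 1),
                 subject = c("SRR", "SRR"),
                 startinsubject = c(50, 500),
                 endinsubject = c(100, 450))

# Base R equivalent: "R" when start > end, otherwise "F"
df$strand <- ifelse(df$startinsubject > df$endinsubject, "R", "F")
df
```

Note that, as in the dplyr version, a row where startinsubject equals endinsubject would be labelled F.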
CodePudding user response:
We may use either ifelse/case_when, or simply convert the logical to a numeric index for replacement:
library(dplyr)
df1 <- df1 %>%
  mutate(strand = c("R", "F")[1 + (startinsubject < endinsubject)])
-output
df1
query subject startinsubject endinsubject strand
1 1 SRR 50 100 F
2 1 SRR 500 450 R
data
df1 <- structure(list(query = c(1L, 1L), subject = c("SRR", "SRR"),
startinsubject = c(50L, 500L), endinsubject = c(100L, 450L
)), class = "data.frame", row.names = c(NA, -2L))
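For completeness, the case_when alternative mentioned above might look something like the sketch below. It rebuilds the same example data as df1 and stores the result in df2, a name chosen here only for illustration. Returning NA when the two coordinates are equal is an assumption made for this sketch, not something specified in the question.

```r
library(dplyr)

# Same example data as above
df1 <- data.frame(query = c(1L, 1L),
                  subject = c("SRR", "SRR"),
                  startinsubject = c(50L, 500L),
                  endinsubject = c(100L, 450L))

# df2 is a hypothetical name for the result
df2 <- df1 %>%
  mutate(strand = case_when(
    startinsubject < endinsubject ~ "F",  # forward hit
    startinsubject > endinsubject ~ "R",  # reverse hit
    TRUE ~ NA_character_                  # assumption: equal coordinates left undetermined
  ))
df2
```

Compared with ifelse, case_when makes each condition explicit, which may be easier to extend if more cases are needed later.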