I'm trying to make the complementary sequence of a dna chain stored in a vector. It's supposed to change the "A" for the "T" and the "C" for the "G" and vice versa, the thing is, I need this to happen to the first vector and print the complementary sequence correctly. This is what I tried but got stucked:
pilot_sequence <- c("C","G","A","T","C","C","T","A","T")
complement_sequence_display <- function(pilot_sequence){
complement_chain_Incom <- gsub("A", "T", pilot_sequence)
complement_chain <- paste(complement_chain_Incom, collapse = "")
cat("Complement sequence: ", complement_chain, "\n")
}
complement_chain_Incom <- gsub("A","T", pilot_sequence)
complement_chain <- paste(complement_chain_Incom, collapse= "")
complement_sequence_display(pilot_sequence)
I got as answer: CGTTCCTTT,just the second and penultimate T are correct, how do I solve to the rest of letters ?
the pilot_sequence vector is character type and the functions displays no execution errors.
CodePudding user response:
You can do this with purrr::map
:
pilot_sequence |> purrr::map_chr(~case_when(
.x == "T" ~ "A",
.x == "G" ~ "C",
.x == "A" ~ "T",
.x == "C" ~ "G"
))
#> [1] "G" "C" "T" "A" "G" "G" "A" "T" "A"
CodePudding user response:
This is a ideal use case for chartr
function:
chartr("ATGC","TACG",pilot_sequence)
output:
[1] "G" "C" "T" "A" "G" "G" "A" "T" "A"
CodePudding user response:
You can use recode
from dplyr
library(dplyr)
recode(pilot_sequence, "C" = "G", "G" = "C", "A" = "T", "T" = "A")
Or in base R, create a named vector and use match
to match the values location in the named vector and then call name
to get the names
pilot_sequence <- c("C","G","A","T","C","C","T","A","T")
values = c("G" = "C", "C" = "G", "A" = "T", "T" = "A")
names(values[match(pilot_sequence, values)])
"G" "C" "T" "A" "G" "G" "A" "T" "A"