Complementary sequence using gsub

I'm trying to build the complementary sequence of a DNA chain stored in a vector. It should replace "A" with "T" and "C" with "G", and vice versa. I need this applied to the vector below, with the complementary sequence printed correctly. This is what I tried before I got stuck:

pilot_sequence <- c("C","G","A","T","C","C","T","A","T")

complement_sequence_display <- function(pilot_sequence){
  complement_chain_Incom <- gsub("A", "T", pilot_sequence)
  complement_chain <- paste(complement_chain_Incom, collapse = "")
  cat("Complement sequence: ", complement_chain, "\n")
} 
complement_chain_Incom <- gsub("A","T", pilot_sequence)
complement_chain <- paste(complement_chain_Incom, collapse= "")
complement_sequence_display(pilot_sequence)

The output I got is CGTTCCTTT. Only the T's that replaced an A (the third and the penultimate letters) are correct. How do I handle the rest of the letters?

The pilot_sequence vector is of character type, and the function runs without any execution errors.

CodePudding user response：

You can do this with purrr::map_chr and dplyr::case_when:

pilot_sequence |> purrr::map_chr(~dplyr::case_when(
  .x == "T" ~ "A",
  .x == "G" ~ "C",
  .x == "A" ~ "T",
  .x == "C" ~ "G"
))
#> [1] "G" "C" "T" "A" "G" "G" "A" "T" "A"

CodePudding user response：

This is an ideal use case for the chartr function:

chartr("ATGC","TACG",pilot_sequence)

output:

[1] "G" "C" "T" "A" "G" "G" "A" "T" "A"
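
As a minimal sketch of why chartr fits here: chaining gsub calls cannot swap two letters, because the second call also rewrites the letters the first call just produced, whereas chartr translates every character in one pass. The version below reuses the asker's function name; the invisible() return value is an addition for illustration, not part of the original code.

```r
pilot_sequence <- c("C","G","A","T","C","C","T","A","T")

# Chained gsub() calls undo each other: after A -> T, the second call
# cannot tell original T's from freshly converted ones.
chained <- gsub("T", "A", gsub("A", "T", pilot_sequence))
# Every A and T ends up as "A", so this is not the complement.

# chartr() maps each character of "ATGC" to the character at the same
# position in "TACG" in a single pass, so nothing is converted twice.
complement_sequence_display <- function(pilot_sequence) {
  complement_chain <- paste(chartr("ATGC", "TACG", pilot_sequence), collapse = "")
  cat("Complement sequence: ", complement_chain, "\n")
  # Assumption: also return the string invisibly so callers can reuse it.
  invisible(complement_chain)
}

result <- complement_sequence_display(pilot_sequence)
```

Here result holds "GCTAGGATA", the collapsed form of the vector shown above.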

CodePudding user response：

You can use recode from dplyr

library(dplyr)

recode(pilot_sequence, "C" = "G", "G" = "C", "A" = "T", "T" = "A")

Or in base R, create a named vector, use match to find the position of each element of pilot_sequence among the vector's values, and then call names to get the corresponding names:

pilot_sequence <- c("C","G","A","T","C","C","T","A","T")

values = c("G" = "C", "C" = "G", "A" = "T", "T" = "A")

names(values[match(pilot_sequence, values)])

[1] "G" "C" "T" "A" "G" "G" "A" "T" "A"
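
A possibly simpler base R variant of the same idea is to index the named vector directly by the sequence, with the names as the original bases and the values as their complements. This is only a sketch; complement_map is a hypothetical name, not something from the original answer.

```r
pilot_sequence <- c("C","G","A","T","C","C","T","A","T")

# Hypothetical lookup table: names are the original bases,
# values are their complements.
complement_map <- c(A = "T", T = "A", C = "G", G = "C")

# Character indexing picks one value per base; unname() drops the
# base labels that the lookup carries along.
complement <- unname(complement_map[pilot_sequence])
complement
```

Any letter that is not a name in complement_map comes back as NA, which makes unexpected characters in the input easy to spot.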