I have the following function, which replaces the ? characters in a seed pattern with the replacement string bb_seq.
library(tidyverse)
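replace_bb_with_str <- function(seed_pattern = NULL, bb_seq = NULL) {
  sp <- seed_pattern
  # Locate every run of "?" in the seed pattern
  gr <- gregexpr("\\?+", sp)
  # Cumulative match lengths give the end positions within bb_seq
  csml <- lapply(gr, function(m) cumsum(attr(m, "match.length")))
  # Replace each run with the corresponding slice of bb_seq
  regmatches(sp, gr) <- lapply(csml, function(cs) substring(bb_seq, c(1, cs[1]), cs))
  sp
}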
It works well on a single input:
plist <- c(
  "??????????DRHRTRHLAK??????????",
  "????????????????????TRCYHIDPHH",
  "FKDHKHIDVK????????????????????TRCYHIDPHH",
  "FKDHKHIDVK????????????????????"
)
replace_bb_with_str(seed_pattern = plist[1], bb_seq = "ndqeegillkkkkfpssyvv")
# [1] "ndqeegillkDRHRTRHLAKkkkkfpssyvv"
But when I run it inside dplyr::mutate():
expand.grid(seed_pattern = plist, bb_seq = "ndqeegillkkkkfpssyvv") %>%
  rowwise() %>%
  mutate(nseq = replace_bb_with_str(seed_pattern = seed_pattern, bb_seq = bb_seq))
I get this error:
Error in `mutate()`:
! Problem while computing `nseq = replace_bb_with_str(seed_pattern =
seed_pattern, bb_seq = bb_seq)`.
ℹ The error occurred in row 1.
Caused by error in `nchar()`:
! 'nchar()' requires a character vector
How can I resolve this issue?
Answer:
expand.grid() coerces character vectors to factors by default (stringsAsFactors = TRUE), and factors don’t play nicely with your function: nchar() refuses them, hence the error. tidyr::expand_grid() preserves input types, so your function works fine:
library(tidyr)
expand_grid(seed_pattern = plist, bb_seq = "ndqeegillkkkkfpssyvv") %>%
  rowwise() %>%
  mutate(nseq = replace_bb_with_str(seed_pattern = seed_pattern, bb_seq = bb_seq))
# A tibble: 4 × 3
# Rowwise:
  seed_pattern                             bb_seq               nseq
  <chr>                                    <chr>                <chr>
1 ??????????DRHRTRHLAK??????????           ndqeegillkkkkfpssyvv ndqeegillkDRHRT…
2 ????????????????????TRCYHIDPHH           ndqeegillkkkkfpssyvv ndqeegillkkkkfp…
3 FKDHKHIDVK????????????????????TRCYHIDPHH ndqeegillkkkkfpssyvv FKDHKHIDVKndqee…
4 FKDHKHIDVK????????????????????           ndqeegillkkkkfpssyvv FKDHKHIDVKndqee
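If you'd rather stay in base R, a possible alternative is the stringsAsFactors argument of expand.grid(). The sketch below uses a shortened plist and the illustrative names grid_default and grid_chr (not from your code) to show the coercion and the nchar() failure side by side:

```r
plist <- c(
  "??????????DRHRTRHLAK??????????",
  "????????????????????TRCYHIDPHH"
)

# Default behaviour: character inputs are turned into factors
grid_default <- expand.grid(seed_pattern = plist, bb_seq = "ndqeegillkkkkfpssyvv")
sapply(grid_default, class)

# Asking expand.grid() to keep character vectors as they are
grid_chr <- expand.grid(
  seed_pattern = plist,
  bb_seq = "ndqeegillkkkkfpssyvv",
  stringsAsFactors = FALSE
)
sapply(grid_chr, class)

# nchar() rejects a factor but accepts the character column
res_factor <- try(nchar(grid_default$bb_seq), silent = TRUE)
res_chr <- nchar(grid_chr$bb_seq)
```

With stringsAsFactors = FALSE the columns stay character, so the rowwise() and mutate() call from the question should run without switching to tidyr.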
Note that, at least with your example data, there’s actually no need for expand_grid() at all; data.frame() or tibble() would do. The same goes for rowwise(): you’d get the same output without it.
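Another option, if you can't control how the data frame is built, is to coerce the inputs inside the function. This is only a sketch: replace_bb_with_str_safe is a hypothetical name, and it assumes the pattern in your function is meant to match runs of ? (that is, "\\?+").

```r
# Hypothetical variant of the function that tolerates factor input
replace_bb_with_str_safe <- function(seed_pattern = NULL, bb_seq = NULL) {
  # Coerce up front so factors behave like plain strings
  sp <- as.character(seed_pattern)
  bb_seq <- as.character(bb_seq)
  # Locate every run of "?" (assumed pattern)
  gr <- gregexpr("\\?+", sp)
  # Cumulative match lengths give the end positions within bb_seq
  csml <- lapply(gr, function(m) cumsum(attr(m, "match.length")))
  # Same slicing logic as the original function
  regmatches(sp, gr) <- lapply(csml, function(cs) substring(bb_seq, c(1, cs[1]), cs))
  sp
}

# Factor inputs, as produced by expand.grid() with its defaults
out <- replace_bb_with_str_safe(
  seed_pattern = factor("??????????DRHRTRHLAK??????????"),
  bb_seq = factor("ndqeegillkkkkfpssyvv")
)
out
# [1] "ndqeegillkDRHRTRHLAKkkkkfpssyvv"
```

The result matches the single-input call from the question, so the function then behaves the same whether it receives strings or factors.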