I am trying to create a data frame in which I insert a piece of text into a string at a position given by another column, but I want the position to count only letters, not digits or other characters. Hopefully the example tables make this clearer.
My initial data looks like this:
String | Insert_pos |
---|---|
PEPTIDE | 3 |
PE[ 10]TIDE | 3 |
I use the following code:
library(stringi)
stri_sub(df$String, df$Insert_pos + 1, df$Insert_pos - 1) <- "[ 20]"
It only half works: when it inserts the new characters, I want it to count only the letters and not the characters of a tag that is already there, as shown below.
What I get | What I want |
---|---|
PEP[ 20]TIDE | PEP[ 20]TIDE |
PE[[ 20] 10]TIDE | PE[ 10]T[ 20]IDE |
I think the way to do it would be to tell it to count only letters, but I can't find how to specify this in stringi, and I am not sure it is possible.
Any help would be great,
Thanks!
CodePudding user response:
You can change the value of your df$Insert_pos column to the next possible position:
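library(stringi)

df <- data.frame(
  String = c("PEPTIDE", "PE[ 10]TIDE"),
  Insert_pos = c(3, 3)
)

# If the character at Insert_pos belongs to a tag, skip past the tag.
# The shift of 5 assumes a single five-character tag such as "[ 10]".
df$Insert_pos <- ifelse(
  stri_sub(df$String, df$Insert_pos, df$Insert_pos) %in% c("[", "]", " ", 0:9),
  df$Insert_pos + 5,
  df$Insert_pos)

# Insert the new tag after the (possibly shifted) position.
stri_sub(df$String, df$Insert_pos + 1, df$Insert_pos - 1) <- "[ 20]"
df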
This then gives you:
String Insert_pos
1 PEP[ 20]TIDE 3
2 PE[ 10]T[ 20]IDE 8
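The shift above only handles one five-character tag sitting exactly at Insert_pos. As a rough sketch of a more general variant, the character index of the n-th letter can be looked up directly with base R's gregexpr(). The helper insert_at_letter() and the Result column are hypothetical names used for illustration, and the sketch assumes every string has at least Insert_pos letters:

```r
df <- data.frame(
  String = c("PEPTIDE", "PE[ 10]TIDE"),
  Insert_pos = c(3, 3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: insert `tag` directly after the n-th letter of `string`.
insert_at_letter <- function(string, n, tag = "[ 20]") {
  # Character index of every letter in the string
  letters_at <- gregexpr("[[:alpha:]]", string)[[1]]
  # Character index of the n-th letter (NA if there are fewer than n letters)
  p <- letters_at[n]
  paste0(substr(string, 1, p), tag, substr(string, p + 1, nchar(string)))
}

# Apply row by row, since each row has its own Insert_pos
df$Result <- mapply(insert_at_letter, df$String, df$Insert_pos,
                    USE.NAMES = FALSE)
df$Result
#[1] "PEP[ 20]TIDE"     "PE[ 10]T[ 20]IDE"
```

Because the position is found by counting letters only, it does not matter how long the existing tag is or how many tags come before the insertion point.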
CodePudding user response:
You can use sub:
sub("(([[:alpha:]][^[:alpha:]]*){3})", "\\1[ 20]", df$String)
#[1] "PEP[ 20]TIDE" "PE[ 10]T[ 20]IDE"
In this case, matching only upper-case letters also works:
sub("(([A-Z][^A-Z]*){3})", "\\1[ 20]", df$String)
sub("(([[:upper:]][^[:upper:]]*){3})", "\\1[ 20]", df$String)
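The patterns above hard-code the count 3. If Insert_pos differs from row to row, one possible sketch is to build the pattern per row with sprintf() and apply it with mapply(). The helper insert_after_nth_letter() and the Result column are hypothetical names, and the sketch assumes the tag contains no backslashes (which sub() would treat specially in the replacement):

```r
df <- data.frame(
  String = c("PEPTIDE", "PE[ 10]TIDE"),
  Insert_pos = c(3, 3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: same regex idea as above, with the repeat count
# taken from `n` instead of being fixed at 3.
insert_after_nth_letter <- function(string, n, tag = "[ 20]") {
  # One "unit" is a letter followed by any run of non-letters; match n units
  pattern <- sprintf("(([[:alpha:]][^[:alpha:]]*){%d})", as.integer(n))
  sub(pattern, paste0("\\1", tag), string)
}

# Apply row by row, since each row has its own Insert_pos
df$Result <- mapply(insert_after_nth_letter, df$String, df$Insert_pos,
                    USE.NAMES = FALSE)
df$Result
#[1] "PEP[ 20]TIDE"     "PE[ 10]T[ 20]IDE"
```

Note that the non-letter run after the n-th letter is included in the match, so if a tag already follows that letter, the new tag is placed after the existing one.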