R replace string with character after a sequence

Time:11-05

I'm trying to replace a string with a comma (,), but only when the string appears right after a digit.

Here is an example -

text1 = "PaulWilliamsnéle110187auCaire"
text2 = "StaceyMauranéele190991auMaroc"

When I try using the str_replace_all function from stringr, it replaces every instance of 'au' in the texts.

str_replace_all(text1,"au",",")
str_replace_all(text2,"au",",")

The above functions give the following outputs

P,lWilliamsnéle110187,Caire

StaceyM,ranéele190991,Maroc

However, I'd only like the "au" that follows the final digit to be replaced, not the one that appears earlier in the text.

So ideally, the desired output would be -

PaulWilliamsnéle110187,Caire

StaceyMauranéele190991,Maroc

But I'm unable to figure out how to put this condition into the function, so that it only replaces the "au" following the final digits in both texts.

Any help would be appreciated

CodePudding user response:

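We can use sub with capture groups. The greedy .* runs up to the last digits that are followed by au, and the replacement keeps the text on either side while swapping au for a comma.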

 string <- c(text1, text2)
 sub("(.*\\d+)(au)(.*)", "\\1,\\3", string)
[1] "PaulWilliamsnéle110187,Caire" "StaceyMauranéele190991,Maroc"
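The role of the leading .* is easier to see on an input that contains more than one digit followed by "au". The sketch below uses a made-up string (x is hypothetical, not from the question) and compares the pattern with and without the greedy prefix.

```r
# Hypothetical input with two digit+"au" occurrences (not from the question).
x <- "a1aub2auc"

# Without a leading ".*", sub() rewrites the first digit+"au" it finds.
first_hit <- sub("(\\d)au", "\\1,", x)

# With a greedy ".*" in front, the first group grabs as much text as it can,
# so the digit+"au" that gets rewritten is the last one in the string.
last_hit <- sub("(.*\\d)au(.*)", "\\1,\\2", x)

print(first_hit)  # "a1,b2auc"
print(last_hit)   # "a1aub2,c"
```

For the two texts in the question there is only one digit followed by "au", so both forms give the same result there.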

CodePudding user response:

You could use a lookbehind, i.e. (?<=\\d)au, which matches au only when it is preceded by a digit. Lookbehind needs perl = TRUE in base R:

sub("(?<=\\d)au", ",", string, perl = TRUE)
[1] "PaulWilliamsnéle110187,Caire" "StaceyMauranéele190991,Maroc"

You could also match the digit together with au, (\\d)au, capturing the digit, and then replace the whole match with the captured digit followed by a comma:

sub("(\\d)au", "\\1,", string)
[1] "PaulWilliamsnéle110187,Caire" "StaceyMauranéele190991,Maroc"
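
As a quick sanity check, the three approaches from this thread can be run side by side on the question's inputs. This is only a sketch: the variable names greedy, lookbehind and captured are made up for illustration, and the "é" is written as an escape so the snippet does not depend on file encoding.

```r
# The question's inputs; "é" is written as \u00e9 so the snippet does not
# depend on the encoding of the script file.
string <- c("PaulWilliamsn\u00e9le110187auCaire",
            "StaceyMauran\u00e9ele190991auMaroc")

# 1. Greedy prefix with capture groups.
greedy <- sub("(.*\\d)au(.*)", "\\1,\\2", string)

# 2. Lookbehind (requires perl = TRUE in base R).
lookbehind <- sub("(?<=\\d)au", ",", string, perl = TRUE)

# 3. Capture the digit and put it back in front of the comma.
captured <- sub("(\\d)au", "\\1,", string)

# All three should agree on this data.
stopifnot(identical(greedy, lookbehind), identical(lookbehind, captured))
print(captured)
```

Since the question used stringr, the lookbehind pattern should also work there without any extra flag, e.g. str_replace(string, "(?<=\\d)au", ","), because stringr's regex engine supports lookbehind by default.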