Why does group capturing return everything in str_replace in R

Time:09-26

Context

a = 'g_pm10year1126.81 - 139.90'

I have a character vector a. I want to extract the content after year1 in the string a ("126.81 - 139.90").

By using str_extract(a, "(?<=year1).*"), I successfully extracted the content I wanted.

After that, I tried to use group capturing in the str_replace function, but it returned the whole string a.

Question

My question is why str_replace(a, "(?<=year1)(.*)", '\\1') returns "g_pm10year1126.81 - 139.90".

As I understand it, it should return "126.81 - 139.90".

Reproducible code:

library(stringr)

a = 'g_pm10year1126.81 - 139.90'

> str_extract(a, "(?<=year1).*")
[1] "126.81 - 139.90"

> str_replace(a, "(?<=year1)(.*)", '\\1')
[1] "g_pm10year1126.81 - 139.90"

Answer

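The issue is that str_replace only substitutes the part of the string that the pattern matches. The lookbehind (?<=year1) is zero-width, so the match is just the text after year1, which is also exactly what group 1 captures. You are therefore replacing the captured group with itself: nothing changes, and you end up with your input string.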
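One way to see this is to make the replaced span visible. The sketch below reuses the string from the question; the variable names `loc` and `marked` and the angle-bracket marker are only for illustration.

```r
library(stringr)

a <- 'g_pm10year1126.81 - 139.90'

# Where does the pattern actually match? The lookbehind consumes nothing,
# so the match starts right after "year1" and runs to the end of the string.
loc <- str_locate(a, "(?<=year1)(.*)")
loc

# Wrap the captured group in a visible marker: only the matched span is
# swapped out, and everything before it is left untouched.
marked <- str_replace(a, "(?<=year1)(.*)", "<\\1>")
marked
#> [1] "g_pm10year1<126.81 - 139.90>"
```

The prefix "g_pm10year1" is never part of the match, so no replacement string can remove it.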

To achieve your desired result using str_replace, the pattern also has to match the part before the captured group, so that it is dropped by the replacement. For example, you could do:

library(stringr)

a = 'g_pm10year1126.81 - 139.90'

str_replace(a, "^.*?(?<=year1)(.*)", '\\1')
#> [1] "126.81 - 139.90"
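Once the prefix is part of the match, the lookbehind is no longer strictly needed. As a sketch of some equivalent alternatives (assuming, as in the example string, that "year1" occurs only once), any of the following should give the same result; the `res_*` names are only for illustration.

```r
library(stringr)

a <- 'g_pm10year1126.81 - 139.90'

# Remove everything up to and including "year1"
res_remove <- str_remove(a, "^.*year1")

# Match the prefix plus the rest, and keep only the captured rest
res_replace <- str_replace(a, "^.*year1(.*)", "\\1")

# Extract the capture group directly: column 1 of the result is the
# full match, column 2 is the first group
res_match <- str_match(a, "year1(.*)")[, 2]

res_remove
#> [1] "126.81 - 139.90"
```

str_extract with the lookbehind, as in the question, remains the most direct option when the goal is extraction rather than replacement.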