Conditional REGEX in R to select text in front of or behind a specific character --- "%" i

Time:01-09

I have a character vector of addresses that was formed by merging the contents of two different vectors. In each observation a "%" separates the left part (1) from the right part (2). The data looks like this:

ShippingStreet <- c("123 Main St%234 Center Street", "%555 Folsom Street",
                    "59 Hyde Street%")

I want to keep the data on the left side of % even if there is something on the right, and on the right side if there is nothing on the left.

So output should look like this:

123 Main St
555 Folsom Street
59 Hyde Street

I wrote a conditional regex as follows and used it in gsub, but it is not doing what I thought it should do.

pattrn_pct <- "/(?(?=%)..(%.*$)|(^.*%))/gm"  # looks for % and then selects behind the % to drop if there is something in front of the %, or after the % if nothing in front ...

gsub(pattrn_pct, "", ShippingStreet, perl=T)  # replace selection with ""

CodePudding user response:

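We can use str_extract() from the stringr package here with the regex pattern [^%]+, which matches the first run of characters that are not "%":

library(stringr)
str_extract(ShippingStreet, "[^%]+")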

[1] "123 Main St"       "555 Folsom Street" "59 Hyde Street"

Data:

ShippingStreet <- c("123 Main St%234 Center Street", "%555 Folsom Street",
                    "59 Hyde Street%")

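If stringr is not available, the same extraction can probably be done in base R with regexpr() and regmatches(). This is only a minimal sketch and not part of the answer above. The name first_part is illustrative, and the text after the "%" in the first element is an assumed reconstruction of the question's data.

```r
# Sample data; the text after "%" in the first element is an assumption.
ShippingStreet <- c("123 Main St%234 Center Street", "%555 Folsom Street",
                    "59 Hyde Street%")

# regexpr() locates the first run of non-"%" characters in each element;
# regmatches() then pulls out the matched text.
first_part <- regmatches(ShippingStreet, regexpr("[^%]+", ShippingStreet))
print(first_part)
```

One difference to keep in mind: regmatches() drops elements that have no match at all (for example "" or "%"), so the result can be shorter than the input, whereas str_extract() returns NA for those elements.
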
CodePudding user response:

Using sub in base R

sub("^%?([^%]+).*", "\\1", ShippingStreet)
[1] "123 Main St"       "555 Folsom Street" "59 Hyde Street"   
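
For comparison, the same result can likely be reached by deleting text instead of capturing it, which is closer to the gsub() call in the question. This is a sketch under the same assumptions about the data, not something taken from either answer. The pattern and the name cleaned are illustrative.

```r
# Sample data; the text after "%" in the first element is an assumption.
ShippingStreet <- c("123 Main St%234 Center Street", "%555 Folsom Street",
                    "59 Hyde Street%")

# Two alternatives, both replaced with "":
#   ^%     a "%" at the very start of the string
#   %.*$   any other "%" together with everything that follows it
cleaned <- gsub("^%|%.*$", "", ShippingStreet, perl = TRUE)
print(cleaned)
```

No conditional construct is needed here: a leading "%" is consumed by the first alternative, so the second alternative only fires on a "%" that has text in front of it.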