Home > Software design >  Regular expression to remove a portion of text from each entry in commas separated list
Regular expression to remove a portion of text from each entry in commas separated list

Time:01-18

I have a string of comma separated values, that I want to trim down for display purpose.

The string is a comma separated list of values of varying lengths and number of list entries.

Each entry in the list is formatted as a five character pattern in the format "##-NX" followed by some text.

e.g., "01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc..."

Is there an regular expression function I can use to remove the text after the 5 character prefix portion of each entry in the list, returning "01-NX, 02-NX, 09-NX, 12-NX,..."?

I am a novice with regular expressions and I haven't been able figure out how to code the pattern.

CodePudding user response:

I think what you need is

regexp_replace(regexp_replace(mystring, '(\d{2}-NX)(.*?)(,)', '\1\3'), '(\d{2}.*NX).*', '\1')

The inner REGEXP_REPLACE looks for a pattern like nn-NX (two numeric characters followed by "-NX") and any number of characters up to the next comma, then replaces it with the first and third term, dropping the "any number of characters" part.

The outer REGEXP_REPLACE looks for a pattern like two numeric characters followed by any number of characters up to the last NX, and keeps that part of the string.

Here is the Oracle code I used for testing:

with a as (
  select '01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc.' as myString
  from dual
)
select mystring
, regexp_replace(regexp_replace(mystring, '(\d{2}-NX)(.*?)(,)', '\1\3'), '(\d{2}.*NX).*', '\1') as output
from a
  • Related