Home > Back-end >  Regex to replace everything after last occurrence of substring in postgres
Regex to replace everything after last occurrence of substring in postgres

Time:03-18

I have certain format of a values in column, I want replace everything after last occurrence of specific substring. For e.g.

  • When I have a string like 110\u001F122\u001F2344, I want a to replace everything after last occurrence of \u001F with 'NOTHING'
  • When I have a string like 0100\u001F79033\u001F000000\u001F0000\u001FA044101\u001F0112400\u001FBA, I want a to replace everything after last occurrence of \u001F with 'NOTHING'

so for above 2 cases, the output should be -

  • 110\u001F122\u001FNOTHING
  • 0100\u001F79033\u001F000000\u001F0000\u001FA044101\u001F0112400\u001FNOTHING

CodePudding user response:

The following regex should do that:

(.*\\u001F).*$

It matches the last \u001F because it's anchored at the end of the string. The group (...) there matches everything before the last \u001F (and should be kept). With a reference to that part of the string in the replacement, we keep it in the output:

with data (input) as (
  values 
    ('0100\u001F79033\u001F000000\u001F0000\u001FA044101'), 
    ('110\u001F122\u001F2344')
)
select input, regexp_replace(input, '(.*\\u001F).*$', '\1NOTHING')
from data

returns:

input                                              | regexp_replace                                    
--------------------------------------------------- ---------------------------------------------------
0100\u001F79033\u001F000000\u001F0000\u001FA044101 | 0100\u001F79033\u001F000000\u001F0000\u001FNOTHING
110\u001F122\u001F2344                             | 110\u001F122\u001FNOTHING                         
  • Related