I have a utm value column, 'source_medium' with values that appears in my dataset as:
0.9899.com/(none)
0.9239.com/(referral)
0.77569.com/(cpc)
0.9899.com/(organic)
0.54966/(email)
I want to get the (none) or (referral) or (organic) part separately as the case may be but every time I try I get this figure in the resulting column :
[...] (none)
I have tried different approaches to remove the [...] like:
substring(split_part(source_medium,'/',2) from 1 for 9) as utm_medium
The above only works for the '0.9899' part when I use the function like this:
split_part(source_medium,'/',1) as utm_source,
It doesn't work well when I try to separate the [...] from the utm_source (none). I'm using PostgreSQL
CodePudding user response:
Use regexp_replace
:
select regexp_replace(source_medium, '.*\((.*)\).*', '\1')
Match the whole string, capturing your target, replacing with the capture.
CodePudding user response:
I updated the DBFIDDLE (from the one given in the comments)
This shows the second example given is not correct in all situations.
One example (for all see the dbfiddle):
select
source_medium,
replace(replace(split_part(source_medium,'/',2),'(',''),')','') as utm_medium
from tbl;
output:
source_medium | utm_medium |
---|---|
0.9899.com/(none) | none |
0.9239.com/(referral) | referral |
0.77569.com/(cpc) | cpc |
0.9899.com/(organic) | organic |
0.54966/(email) |