I am new to regex and need to search a string field in Impala for multiple matches to this exact sequence of characters: ~FC*
since the ~
and *
are not letters or digits, I am unsure on how to search for these in this specific order and not just for any of these single characters occuring.
This is my code so far, have tried both of these [~FC*]
or ^~FC*$
This is a test string, it has 2 occurrences:
N4*CITY*STATE*2155446*2120~FC*C*IND*30*MC*blah blah fjdgfeufh*27*0*****Y~FC*Z*IND*39*MC*jhlkfhfudfgsdkufgkusgfn*23*0*****Y~
CodePudding user response:
.*(~FC\*).* or .*(\~FC\*).*
.* - Zero or more characters
.*(~FC\*).* - Means Search for ~FC*
if 1st one does not work, please try second one, it might work if tilde symbol is reserved in regex.
CodePudding user response:
You can use a simple SQL like below. This will work only on hardcoded string.
select (length(mycol)- length (replace(mycol,'~FC*','')))/length('~FC*') as occurance_str
Here is the SQL i tested ok
select
(length('N4*CITY*STATE*2155446*2120~FC*C*IND*30*MC*blah blah fjdgfeufh*27*0*****Y~FC*Z*IND*39*MC*jhlkfhfudfgsdkufgkusgfn*23*0*****Y~')
- length(replace('N4*CITY*STATE*2155446*2120~FC*C*IND*30*MC*blah blah fjdgfeufh*27*0*****Y~FC*Z*IND*39*MC*jhlkfhfudfgsdkufgkusgfn*23*0*****Y~','~FC*',''))
)/length('~FC*') as occurance_str