Home > other >  Mysql: extract a string from field between delimiters (backwards)
Mysql: extract a string from field between delimiters (backwards)

Time:08-20

I have a Column 'ACCOUNT_NUMBER' from a table 'BankingActivity' which contains data as follow :

example:

ManualBanking-BankDeposit-350-1006590343--INTERNAL_A

or

MyPayCard-MyPayDeposit-620-989228234--TL

I need to extract the number '1006590343' or '989228234'

Initially i execute the following query:

select substr(  `BankingActivity`.`ACCOUNT_NUMBER`,(
    locate( '--', `BankingActivity`.`ACCOUNT_NUMBER` ) - 9  ),9 ) * 1 
from BankingActivity

Which works fine if the length of the string does not exceed 9 digits. Over 9 digits, I obviously have issues and can not get the full string.

How can i look backwards for the delimiter '--' and then extract the value between the '--' delimiter and the previous '-' delimiter?

I tried with some Regex but I am not familiar enough with it to get a correct result.

CodePudding user response:

If I have understood your question correctly then you can try something like this -

select SUBSTRING_INDEX(SUBSTRING_INDEX('ManualBanking-BankDeposit-350-1006590343--INTERNAL_A', '-' ,-3), '--', 1);

select SUBSTRING_INDEX(SUBSTRING_INDEX('MyPayCard-MyPayDeposit-620-989228234--TL', '-' ,-3), '--', 1);

CodePudding user response:

Try

SELECT regexp_substr(
         regexp_substr(acct, '-\\d --'), '\\d ') 
  FROM (
    SELECT 'ManualBanking-BankDeposit-350-1006590343--INTERNAL_A' as acct 
    UNION
    SELECT 'MyPayCard-MyPayDeposit-620-989228234--TL'
  ) accounts;

The inner regexp_substr extracts a substring that begins with a dash followed by 1 or more digits and ends with two dashes. That would be e. g. '-1006590343--'. From this, the outer regexp_substr extracts all consecutive digits, that is '1006590343'.

More detailed information about regular expressions in MySQL can be found in the documentation.

CodePudding user response:

This is probably a job for SUBSTRING_INDEX().

Check it out. Fiddle here.

SET @s = 'ManualBanking-BankDeposit-350-1006590343--INTERNAL_A';
SELECT SUBSTRING_INDEX(@s, '-', -3);

This splits your string on '-'. It takes everything after the third '-' delimiter from the end, and gives you back 1006590343--INTERNAL_A.

Then we use SUBSTRING_INDEX() again on that.

SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(@s, '-', -3), '-', 1);

Lo and behold, this gets us 1006590343.

But. This is a brittle way to do it. MySQL's string processing isn't easy to program in detailed ways. This solution doesn't take into account things like missing dashes at the end of the string. Garbage in, garbage out. Use a host language like C# / php / nodejs / Java etc to do this kind of string analysis if you want it to be super-robust for real world data.

  • Related