Home > Software engineering >  Extract a substring from a string that the substring is at different position using SQL Server
Extract a substring from a string that the substring is at different position using SQL Server

Time:04-26

I having a tough stuff on SQL server that have been working on but cannot solve.

I want to extract a numeric substring (and put to a new field) from the original string that the substring is at different positions of the original string. The substring starts with a number and the lenght of the substring is 4. If the lenght of the substring less than 4 then returning NULL. The reason behind this is that people entry data in different ways but a norm is that there is a numeric string containing 4 digits.

This is data on table TB_SAMPLE

Product Sample 
A26     #S #3455 Bag1   
A26     Bag 1 #S 3455 Flush 32
MWP1    #T 3457 Bag3
MIP2    Test #S3457 Bag2 342
AMP2    Test#S #3458 Test546
CMP3    Test S 5658 Test
CMP2    Sample2 E 552 Sample2
CMP3    Sample3 E 553 Sample3

and wanting result as below

Product   Sample                     Sample_Extract
A26       #S #3455 Bag1              3455
A26       Bag 1 #S 3455 Flush 32     3455
MWP1      #T 3457 Bag3               3457 
MIP2      Test #S3457 Bag2 342       3457
AMP2      Test#S #3458 Test546       3458
CMP3      Test S 5658 Test           5658
CMP2      Sample2 E 552 Sample2      NULL
CMP3      Sample3 E 553 Sample3      NULL

I have tried using built-in functions such as SUBSTRING, CHARINDEX, LOCATE, LEN... as well as created user functions but did not help

If you will help me and want to create the table that contains the above data, below are scripts

CREATE TABLE TB_SAMPLE
    (
    Product nvarchar(50),
    Sample nvarchar(30)
    )

INSERT INTO tb_sample VALUES('A26','#S #3455 Bag1')
INSERT INTO tb_sample VALUES('A26','Bag 1 #S 3455 Flush 32')
INSERT INTO tb_sample VALUES('MWP1','#T 3457 Bag3')
INSERT INTO tb_sample VALUES('MIP2','Test #S3457 Bag2 342')
INSERT INTO tb_sample VALUES('AMP2','Test#S #3458 Test546')
INSERT INTO tb_sample VALUES('CMP3','Test S 5658 Test')
INSERT INTO tb_sample VALUES('CMP2','Sample2 E 552 Sample2')
INSERT INTO tb_sample VALUES('CMP3','Sample3 E 553 Sample3')

Any help is great appriceated! Thanks

CodePudding user response:

Get the position of the pattern, and then use SUBSTRING:

SUBSTRING(Sample,NULLIF(PATINDEX('%[0-9][0-9][0-9][0-9]%',Sample),0),4)

The NULLIF is there to return NULL in the event there isn't a 4 digit number.

  • Related