Home > front end >  SQL Server, TSQL regex_exp to extract date which can have variable length
SQL Server, TSQL regex_exp to extract date which can have variable length

Time:04-13

maybe someone has similar problem and can help or advice something, how resolve this in SQL Server, language: TSQL

I have some string in variable:

declare @importedFileName varchar(max)='C:\FileSystem\Import\2022\4\12\Document.Import.NewCustomer.j4898u9sdfsdf4345jh3b4h.docx'
    

I need extract only this piece:

2022\4\12

which can have different number of signs, for example:

2022\12\12 - length = 10
2022\1\1   - length = 8
2022\4\12  - length = 9

I did it, like that:

declare @extractDate varchar(max) = substring(@importedFileName,PATINDEX('%[0-9]%[0-9]%[0-9]%',@importedFileName),10)
declare @extract int = len(@extractDate)- PATINDEX('%[0-9]%',reverse(@extractDate)) 2
declare @fileName varchar(max) =replace(substring(@extractDate,0,@extract),'\','_')
select @fileName

it works, but it looks, very unprofessional, even terrible

I wanted one regex_exp, something like: %[0-9]%[0-9]%[0-9] - but it doesn't work properly

Any suggestions?

CodePudding user response:

Here's how I'd do it.

DECLARE @importedFile VARCHAR(MAX) =
  'C:\FileSystem\Import\2022\04\04\Document.Import.NewCustomer.j4898u9sdfsdf4345jh3b4h.docx';

SELECT      TheDate = SUBSTRING(FrontOf.String,1,LEN(FrontOf.String)-Endof.String)
FROM        (VALUES(PATINDEX('%2[0-9][0-9][0-9]%',@importedFile)))  AS dt(Yr)
CROSS APPLY (VALUES(SUBSTRING(@importedFile,dt.Yr,10)))             AS FrontOf(String)
CROSS APPLY (VALUES(PATINDEX('%[0-9]%',REVERSE(FrontOf.String))-1)) AS Endof(String);

With a table:

DECLARE @imports TABLE(importFile VARCHAR(MAX));
INSERT @imports VALUES
('C:\FileSystem\Import\2022\04\04\Document.Import.NewCustomer.j4898u9sdfsdf4345jh3b4h.docx'),
('C:\FileSystem\Import\2022\6\14\??Document.Import.NewCustomer.5tttyadf4345jh3basff.docx'),
('C:\FileSystem\Import\2022\7\7\...Document.Import.NewCustomer.t09afgsv05jh3basff.docx');

SELECT
  Original  = i.importFile,
  NewString = f.TheDate
FROM  @imports AS i
CROSS APPLY
(
  SELECT      TheDate = SUBSTRING(FrontOf.String,1,LEN(FrontOf.String)-Endof.String)
  FROM        (VALUES(PATINDEX('%2[0-9][0-9][0-9]%',i.importFile)))  AS dt(Yr)
  CROSS APPLY (VALUES(SUBSTRING(i.importFile,dt.Yr,10)))             AS FrontOf(String)
  CROSS APPLY (VALUES(PATINDEX('%[0-9]%',REVERSE(FrontOf.String))-1)) AS Endof(String)
) AS f;

Results:

Original                                                                        NewString
------------------------------------------------------------------------------- ----------
C:\FileSystem\Import\2022\04\04\Document.Import.NewCustomer.j4898u9sdf3b4h.docx 2022\04\04
C:\FileSystem\Import\2022\6\14\??Document.Import.NewCustomer.5tttyadf4ff.docx   2022\6\14
C:\FileSystem\Import\2022\7\7\...Document.Import.NewCustomer.t09afgsv0.docx     2022\7\7

CodePudding user response:

Since you effectively only have four patterns to check (assuming no years prior to 1000 or after 9999...), PATINDEX can do it:

DECLARE @importDate DATE;
SELECT @importDate = CONVERT(DATE, 
    MAX(
        REPLACE(
            SUBSTRING(@importedFileName, NULLIF(PATINDEX(p, @importedFileName), 0)   1, l), 
            '\', '/'
        )
    ), 111
)
FROM (VALUES 
   ('%\[0-9][0-9][0-9][0-9]\[0-9][0-9]\[0-9][0-9]\%', 10),
   ('%\[0-9][0-9][0-9][0-9]\[0-9][0-9]\[0-9]\%', 9),
   ('%\[0-9][0-9][0-9][0-9]\[0-9]\[0-9][0-9]\%', 9),
   ('%\[0-9][0-9][0-9][0-9]\[0-9]\[0-9]\%', 8)
) _(p, l);

SELECT @importDate;

Of course this still doesn't look very nice, which is why you don't do string manipulation in T-SQL if you can avoid it. (Converting to a DATE wasn't asked for, but it's useful to show, and it's easy to remove if you don't need it.)

  • Related