I'm trying to extract date from a text field. I'm using PADINDEX to do this. My results are very inconsistent. Is there anyway I can do this through SQL. I don't have an application to do this for me. I am trying to get this for a report that is needed.
The text field has dates that are not always listed as MM/DD/YYYY sometimes its listed as M/DD/YYYY. Also,
Here is the query I am using:
select
substring(ar.finding_text,patindex('%[0-9]%/[0-9][0-9]/[0-9][0-9][0-9][0-9]%',ar.finding_text),10)
FROM [ARKPPDB].[PowerPath].[dbo].[accession_2] a
LEFT OUTER JOIN acc_results ar on a.id = ar.acc_id
CodePudding user response:
You could add a pattern "priority" like this:
DECLARE @Patterns TABLE(Pattern VARCHAR(50),PatternLength INT)
INSERT INTO @Patterns(Pattern, PatternLength)
VALUES('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9] AM%',19),
('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9] PM%',19),
('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9] AM%',18),
('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9] PM%',18),
('%[0-9][0-9]/[0-9]/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9] AM%',18),
('%[0-9][0-9]/[0-9]/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9] PM%',18),
('%[0-9][0-9]/[0-9]/[0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9] AM%',17),
('%[0-9][0-9]/[0-9]/[0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9] PM%',17),
('%[0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9] AM%',18),
('%[0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9] PM%',18),
('%[0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9] AM%',17),
('%[0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9] PM%',17),
('%[0-9]/[0-9]/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9] AM%',17),
('%[0-9]/[0-9]/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9] PM%',17),
('%[0-9]/[0-9]/[0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9] AM%',16),
('%[0-9]/[0-9]/[0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9] PM%',16)
SELECT IIF(pat1.Pattern is null, null,substring(replace(UCASE(ar.finding_text),' AT ',' '),patindex(pat1.Pattern,replace(UCASE(ar.finding_text),' AT ',' ')),pat1.PatternLength))
FROM [ARKPPDB].[PowerPath].[dbo].[accession_2] a
LEFT OUTER JOIN acc_results ar on a.id = ar.acc_id
LEFT OUTER JOIN @Patterns pat1 on patindex(pat1.Pattern,ar.finding_text) > 0
LEFT OUTER JOIN @Patterns pat2 on patindex(pat2.Pattern,ar.finding_text) > 0 and pat2.PatternLength > pat1.PatternLength
WHERE pat2.Pattern IS NULL
So, some example text:
Physician/Physician’s office called on 2/1/2022 at 3:27 PM Central
This should return '2/1/2022 3:27 PM' which you should be able to convert to a DATETIME. I did not bother with the Central. Every example you showed included Central, so you can probably just assume it is Central time.
CodePudding user response:
I got a little more information about this post from another thread including a hint of what the data actually looks like. Here's the same code part of the response I created including the test data if someone else wants to play.
DROP TABLE IF EXISTS #TestTable;
GO
--===== Create and populate the test table from the data provided.
-- This is NOT a part of the solution. We're just creating test data here.
SELECT *
INTO #TestTable
FROM (VALUES
('Physician/Physician’s office called on 2/1/2022 at 3:27 PM Central.')
,('Physician/Physician’s office called on 2/1/2022 at 3:34 PM Central.')
,('Physician/Physician’s office called on 2/1/2022 at 2:47 PM Central.')
,('Physician/Physician’s office called on 2/1/2022 at 4:17 PM Central.')
,('Physician/Physician’s office called on 2/1/2022 at 2:52 PM Central.')
,('Physician/Physician’s office called on 2/1/2022 at 2:51 PM Central.')
,('Physician/Physician’s office called on 2/1/2022 at 4:17 PM Central.')
,('Physician/Physician’s office called on 2/1/2022 at 4:34 PM Central.')
,('Physician/Physician’s office called on 2/1/2022 at 11:49 PM Eastern.')
,('Physician/Physician’s office called on 2/1/2022 at 11:27 AM Eastern.')
,('Physician/Physician’s office called on 2/1/2022 at 11:34 AM Eastern.')
,('Physician/Physician’s office called on 2/1/2022 at 10:47 AM Central.')
,('Physician/Physician’s office called on 2/1/2022 at 9:17 AM Mountain.')
,('Physician/Physician’s office called on 2/1/2022 at 10:52 AM Central.')
,('Physician/Physician’s office called on 2/1/2022 at 10:51 AM Central.')
,('Physician/Physician’s office called on 2/1/2022 at 9:17 AM Pacific.')
,('Physician/Physician’s office called on 2/1/2022 at 9:34 AM Pacific.')
,('Physician/Physician’s office called on 2/1/2022 at 11:49 AM Somewhere over the rainbow.')
)v(SomeString)
;
--===== Let's see what we've got
SELECT * FROM #TestTable
;
--===== Solve the given problem.
-- So long as the following pattern is true in the source data, this should always work.
-- WhoCalled by the string ' called on ' to determine the caller followed by...
-- by a date string followed by the string ' at ' followed by a meridian time (Has AM/PM) for the CallDT followed by...
-- a time zone name string for the TimeZone.
-- The result is available as a DATETIME2(0) with no decimal seconds.
SELECT SomeString
,WhoCalled = LEFT(SomeString,ca1.CalledOn-1)
,CallDT = CONVERT(DATETIME2(0),REPLACE(SUBSTRING(SomeString,ca1.CalledOn 10,ca2.Meridian-CalledOn-9),'at',''))
,TimeZone = TRIM(' .' FROM SUBSTRING(SomeString,ca2.Meridian 2,500))
FROM #TestTable
CROSS APPLY (VALUES(CHARINDEX(' called on ',SomeString)))ca1(CalledOn)
CROSS APPLY (VALUES(PATINDEX('% [AP]M %',SomeString) 2))ca2(Meridian)
;
GO