I have a text string in the following format:
09/05/2021: La persona seleccionada se encuentra en seguridad 05/09/2021: Se envia un nuevo grupo de candidatos 07/11/2021: El candidato contratado no inicia
So I want to separate each comment from the string by each date. How do i do that?
The result should be the following:
DATE | COMMENT |
---|---|
09/05/2021 | La persona seleccionada se encuentra en seguridad |
05/09/2021 | Se envia un nuevo grupo de candidatos |
07/11/2021 | El candidato contratado no inicia |
CodePudding user response:
I don't like solutions that rely on the order of output from SPLIT, because the manual states that it is not guaranteed to be the order of input.. In practice I believe that it uses .NET's split under the hood, and probably will output in order..
Caveat emptor!
declare @t VARCHAR(200) = '09/05/2021: La persona seleccionada se encuentra en seguridad 05/09/2021: Se envia un nuevo grupo de candidatos 07/11/2021: El candidato contratado no inicia';
WITH x AS (
SELECT
RIGHT(LAG(value) OVER(order by (select null)), 10) as d,
SUBSTRING(value, 2, LEN(value) - CASE WHEN value LIKE '%__/__/____' THEN 11 ELSE 1 END) as t
FROM
string_split(@t, ':')
)
SELECT d, t FROM x WHERE d IS NOT NULL
This splits on :
and then retrieves the date off the end of the "previous" value for pairing with the text from the value of the current. It examines whether there is a date on the end before deciding whether to substring up to the string end, or up to 10 chars back from the string end
If you want to do this safer in SQLS, you might do better to use a recursive CTE and SUBSTRING and CHARINDEX to look for the positions of ': '
progressively, and then cut from 10 chars back, up to the next index.
That looks like this:
declare @t VARCHAR(200) = '09/05/2021: La persona seleccionada se encuentra en seguridad 05/09/2021: Se envia un nuevo grupo de candidatos 07/11/2021: El candidato contratado no inicia';
WITH x(s, sp, p) as (
SELECT s, 1, CHARINDEX(': ', s) FROM (SELECT @t AS s) xx
UNION ALL
SELECT s, p 1, CHARINDEX(': ', s, p 1)
FROM X
WHERE p > 0
)
SELECT
SUBSTRING(s, sp - 11, 10) d,
SUBSTRING(s, sp 1, CASE WHEN p > 0 THEN p - sp - 12 ELSE LEN(s) END) c
FROM x
WHERE sp > 1
The recursive CTE repeatedly uses CHARINDEX to find ': '
and effectively gives a starting position (sp) and ending position (p) each time it goes round. By using the position from last time as the starting position this time, the string is searched incrementally. From these an adjustment of index to factor the date can be used, and similarly for the comment. Do a select * FROM x
if you want to see more info
It would be much safer to do this in the front end language, for example C# does guarantee split order, this logic is similar:
var a = string.Split(": ");
for(int i = 1; i < a.Length - 1; i )
Console.WriteLine( (a[i-1][^10..], a[i][..^11]) );
Console.WriteLine( (a[^2][^10..], a[^1]) );
This variation splits on ": "
, something SQLS cannot do, which reduces the need to trim extraneous spaces from the start. After that a loop processes all but the last line (it has no date on the end so must be treated differently) starting from 1 (so it can safely reference a previous element).
a[i-1][^10..]
means arraya
, element string ati-1
, substring 10 from end to 0 from end i.e. last 10 charsa[i][..^11]
means arraya
, elementi
, substring from start to 11 chars back from end i.e. all except last 11 charsa[^2][^10..]
means arraya
, element 2 from end i.e. last but one, last 10 charsa[^1]
means arraya
, last element of