I have a column that holds the data in the following format:
Field Name
123_456_ABC_DEF
12_34_456_XYZ_PQR
LMN_OPQ_123_456
In each case I require, the last two block of data i.e.
ABC_DEF
XYZ_PQR
123_456
Is there a way to use charindex()
in manner where it counts '_'
from the right side of the string?
CodePudding user response:
You can use:
STRING_SPLIT
to split your underscore-separated-values into single valuesROW_NUMBER
to identify a ranking in your values based on their position in the original string, and filter in last two valuesSTRING_AGG
to reaggregate your remaining (needed) values
WITH cte AS (
SELECT Field_Name,
value AS extracted_string,
ROW_NUMBER() OVER(PARTITION BY Field_Name ORDER BY CHARINDEX(value, Field_Name) DESC) AS rn
FROM tab
CROSS APPLY STRING_SPLIT(Field_Name, '_')
)
SELECT Field_Name,
STRING_AGG(extracted_string, '_') WITHIN GROUP(ORDER BY rn DESC)
FROM cte
WHERE rn <= 2
GROUP BY Field_Name
Check the demo here.
CodePudding user response:
Please try the following solution.
It is based on tokenization via XML and XQuery.
Notable points:
CROSS APPLY
is tokenizing input as XML.- The XPath predicate
[position() ge (last()-1)]
gives us last two tokens.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (tokens VARCHAR(256));
INSERT @tbl VALUES
('123_456_ABC_DEF'),
('12_34_456_XYZ_PQR'),
('LMN_OPQ_123_456');
-- DDL and sample data population, end
DECLARE @separator CHAR(1) = '_';
SELECT t.*
, REPLACE(c.query('data(/root/r[position() ge (last()-1)])').value('.', 'VARCHAR(256)')
, SPACE(1), @separator) AS Result
FROM @tbl AS t
CROSS APPLY (SELECT TRY_CAST('<root><r><![CDATA['
REPLACE(tokens, @separator, ']]></r><r><![CDATA[')
']]></r></root>' AS XML)) AS t1(c);
Output
tokens | Result |
---|---|
123_456_ABC_DEF | ABC_DEF |
12_34_456_XYZ_PQR | XYZ_PQR |
LMN_OPQ_123_456 | 123_456 |