I have ploughed through the massive amount of queries on here relating to SUBSTRING
and CHARINDEX
but I cannot find one that answers my query.
I am extracting a single part of a long piece of text but it is returning 48 characters too much. E.g. in the first row of the results the length should be 44 characters rather than the 92 characters it is giving. I split out the various parts of the query into the last 3 columns to get the starting position and the length but when I put it together it doesn't work correctly. What am I doing wrong?
My code: (Updated to make consumable I hope)
IF OBJECT_ID('tempdb.dbo.#WQuery', 'U') IS NOT NULL
DROP TABLE #WQuery
CREATE TABLE #WQuery
(
IncidentID VARCHAR(100),
Description VARCHAR(250),
)
INSERT INTO #WQuery
(IncidentID, Description)
VALUES
('B209BBA0-9039-ED11-81AD-0050569FE3BD','<b>Please indicate your company''s export status:</b><br />New to export (less than 1 years experience)<br /><br /><b>What is the destination market for your goods/services?</b><br />United Kingdom<br /><br />'),
('13A75070-0F38-ED11-81AD-0050569FE3BD','<b>Please indicate your company''s export status:</b><br />Novice exporter (1-3 years experience)<br /><br /><b>What is the destination market for your goods/services?</b><br />Kazakhstan <br /><br />'),
('D2926CA8-EB28-ED11-81AC-0050569FE3BD','<b>Please indicate your company''s export status:</b><br />New to export (less than 1 years experience)<br /><br /><b>What is the destination market for your goods/services?</b><br />Zambia (central Africa)<br/'),
('7226B826-DF24-ED11-81AC-0050569FE3BD','<b>Please indicate your company''s export status:</b><br />Experienced exporter (3 years experience in multiple markets)<br /><br /><b>What is the destination market for your goods/services?</b><br />Spain<br'),
('E636692C-3C22-ED11-81AC-0050569FE3BD','<b>Please indicate your company''s export status:</b><br />New to export (less than 1 years experience)<br /><br /><b>What is the destination market for your goods/services?</b><br />India<br /><br />'),
('C13937A0-EF16-ED11-81AC-0050569FE3BD','<b>Please indicate your company''s export status:</b><br />New to export (less than 1 years experience)<br /><br /><b>What is the destination market for your goods/services?</b><br />Rotterdam<br /><br /><b>')
;
select
i.incidentid
, i.description
, SUBSTRING(i.description,
CHARINDEX('export status:</b><br />', i.description) LEN('export status:</b><br />'),
CHARINDEX('<br /><br /><b>What is the destination market',i.description) - (CHARINDEX('export status:</b><br />', i.description)-LEN('export status:</b><br />')) ) As ExportStatus
--substring(string,start,length)--
--length is the number of characters to extract - must be positive--
--split the substring above to test the values--
,CHARINDEX('export status:</b><br />', i.description) LEN('export status:</b><br />') as start
,CHARINDEX('<br /><br /><b>What is the destination market',i.description) as secondQStart
,CHARINDEX('<br /><br /><b>What is the destination market',i.description) - CHARINDEX('export status:</b><br />', i.description)-LEN('export status:</b><br />') as length
--charindex(substring,string,start)--
from #WQuery i
DROP TABLE #WQuery;
The expected results for column 3 (export status) should be:
However I am getting this:
When I added the code from the column 'Length' into the parameter of the SUBSTRING query it gave an error.
Msg 537, Level 16, State 3, Line 3
Invalid length parameter passed to the LEFT or SUBSTRING function.
When I added the brackets around that part of the SUBSTRING the query ran but as you can see it is not returning the same number of characters and the 'length' in the last column.
I am struggling to understand why the length parameter in the SUBSTRING query is not working the same as the same code in the 'length' column. And I also do not understand why the code needs brackets in the parameter of SUBSTRING query to work or throws the error.
CodePudding user response:
Your text is almost valid XHTML. You just need to fix up the ends of the lines in some cases, I assume this is a copy-paste error.
For example, <br />Spain<br
should be <br />Spain<br />
, and Rotterdam<br /><br /><b>
should be Rotterdam<br /><br /><b />
.
Now you can just use XQuery to get the right piece of text, which appears to be the first text not enclosed by any node.
SELECT ExportStatus = CAST(wq.Description AS xml).value('(/text())[1]', 'nvarchar(max)')
FROM #WQuery wq