I have the following data in a single element in a database table:
Port Jefferson Station, NY 11776-2598
Northport, NY 11768
Garden City, NY 11530
I am trying to break each line up into a new table which contains a city field, a state field and a zip code field.
I am running a script with the following code to pick up the City value:
Create function [dbo].[fnBuildCityState](@AddressCity varchar(40))
RETURNS @NameTable TABLE
(
City VARCHAR(255),
State VARCHAR(255),
Zip VARCHAR(255)
)
as
BEGIN
-- Index and word variables.
DECLARE @AccountNameEndIndex INT;
DECLARE @FirstWordEndIndex INT;
DECLARE @SecondWordEndIndex INT;
DECLARE @ThirdWordEndIndex INT;
DECLARE @FirstWord NVARCHAR(8);
DECLARE @SecondWord NVARCHAR(50);
DECLARE @ThirdWord NVARCHAR(50);
SET @AddressCity = RTRIM(ISNULL(@AddressCity,'No Address, City ZipCode'))
if (LEN(@AddressCity)<3)
BEGIN
SET @AddressCity = 'No Address, City ZipCode'
END
-- Get the last index.
SET @AccountNameEndIndex = LEN(@AddressCity) - 1;
-- Get the first word.
SET @FirstWordEndIndex = CHARINDEX(',', @AddressCity, 0);
SET @FirstWord = SUBSTRING(@AddressCity, 0, @FirstWordEndIndex);
INSERT INTO @NameTable (City, State, Zip)
VALUES (@FirstWord, @SecondWord, @ThirdWord)
RETURN
END
When I run the code, and look at the resulting table, I see this output:
City State Zip
Port Jef NULL NULL
Northpor NULL NULL
Garden C NULL NULL
The line of code which should return the index to the location of the comma in the input is
SET @FirstWordEndIndex = CHARINDEX(',', @AddressCity, 0);
It appears to always return a value of 8 regardless of where the comma is in the string.
Any idea why?
CodePudding user response:
The specific answer to your question is because you defined @FirstWord
as NVARCHAR(8)
so its only going to contain up to 8 chars.
But the bigger question is why are you trying trying to do that RBAR? SQL is set-based for a reason... you can build a query to do everything you need in one hit e.g.
SELECT b.City, C.[State], c.ZipCode
FROM (
VALUES
('Port Jefferson Station, NY 11776-2598'),
('Northport, NY 11768'),
('Garden City, NY 11530')
) x ([Address])
CROSS APPLY (
VALUES
(CHARINDEX(',', x.[Address], 0))
) a (FirstComma)
CROSS APPLY (
VALUES
(SUBSTRING(x.[Address],1,a.FirstComma-1)
, SUBSTRING(x.[Address],a.FirstComma 2,LEN(x.[Address])))
) b (City,SecondString)
CROSS APPLY (
VALUES
(SUBSTRING(b.SecondString,1,2)
, SUBSTRING(b.SecondString,4,LEN(b.SecondString)))
) c ([State],ZipCode);
Returns:
City | State | ZipCode |
---|---|---|
Port Jefferson Station | NY | 11776-2598 |
Northport | NY | 11768 |
Garden City | NY | 11530 |
This is simplistic and doesn't cover your edge cases such as no address, or a state with a length other then 2 characters. But thats all stuff you can easily sort out.
The point of using using CROSS APPLY
is to avoid having to repeat the same calculations in multiple places - which a query without CROSS APPLY
would need to do.
CodePudding user response:
Please try the following solution.
It is based on tokenization instead of string parsing.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, tokens VARCHAR(100));
INSERT @tbl (tokens) VALUES
('Port Jefferson Station, NY 11776-2598'),
('Northport, NY 11768'),
('Garden City, NY 11530');
-- DDL and sample data population, end
DECLARE @separator CHAR(1) = ',';
SELECT t.*
, city = c.value('(/root/r[1]/text())[1]', 'VARCHAR(50)')
, state = LEFT(TRIM(c.value('(/root/r[2]/text())[1]', 'VARCHAR(50)')), 2)
, zip = SUBSTRING(TRIM(c.value('(/root/r[2]/text())[1]', 'VARCHAR(50)')), 3, 50)
FROM @tbl AS t
CROSS APPLY (SELECT TRY_CAST('<root><r><![CDATA['
REPLACE(tokens, @separator, ']]></r><r><![CDATA[')
']]></r></root>' AS XML)) AS t1(c);
Output
ID | tokens | city | state | zip |
---|---|---|---|---|
1 | Port Jefferson Station, NY 11776-2598 | Port Jefferson Station | NY | 11776-2598 |
2 | Northport, NY 11768 | Northport | NY | 11768 |
3 | Garden City, NY 11530 | Garden City | NY | 11530 |
CodePudding user response:
I tend to lean towards JSON when parsing strings
Example
Select City = JSON_VALUE(JS1,'$[0]')
,State = JSON_VALUE(JS2,'$[0]')
,ZIP = JSON_VALUE(JS2,'$[1]')
From YourTable A
Cross Apply (values ('["' replace(Address,',','","') '"]') ) B(JS1)
Cross Apply (values ('["' replace(ltrim(JSON_VALUE(JS1,'$[1]')),' ','","') '"]') ) C(JS2)
Results
City State ZIP
Port Jefferson Station NY 11776-2598
Northport NY 11768
Garden City NY 11530
If you wanted to use as a TVF
Declare @AddressCity varchar(max) = 'Port Jefferson Station, NY 11776-2598'
Select City = JSON_VALUE(JS1,'$[0]')
,State = JSON_VALUE(JS2,'$[0]')
,ZIP = JSON_VALUE(JS2,'$[1]')
From (values ('["' replace(@AddressCity,',','","') '"]') ) A(JS1)
Cross Apply (values ('["' replace(ltrim(JSON_VALUE(JS1,'$[1]')),' ','","') '"]') ) B(JS2)