Home > Enterprise >  SQL Server : how to split string in one column and assign values to another columns (no extra rows!)
SQL Server : how to split string in one column and assign values to another columns (no extra rows!)

Time:11-26

In one table I have a full address which I have to split into the city and postcode to store in other columns.

Here is sample address:

address= 'Marco Polo street 8a, 44000 Vienna'

Some rows may contain multiple addresses but I have to keep only the last portion after the last comma (44000 Vienna).

So the final result should be:

 post_office = 44000 
 city = Vienna

Tried so far: https://stackoverflow.com/a/13430958/2119941

with nested substrings, like this:

city = SUBSTRING(SUBSTRING(address, CHARINDEX(' ', address, 0)   1, LEN(address)), CHARINDEX(',', address, 0)   1, LEN(address))

and

post_office = SUBSTRING(SUBSTRING(address, CHARINDEX(',', address, 0)   1, LEN(address)), 0, CHARINDEX(' ', address, 0))

But this only partially works since cities can also have multiple words like:

city = '44300 Vienna Old Town'

The rule for my data is: find the last comma, take all numbers after the comma and define that as post_office and then, after the first space after post_office until rest of the string it should be defined as the city.

I'm running Microsoft SQL Server 2017 (v14.0.3391.2 - X64)

CodePudding user response:

Well, I guess the comments below your question have already gone in detail into why the way the data is stored in is a bad idea and in what ways what you attempt is likely to fail if your assumptions about the addresses are not 100% accurate.

That being said, you might be in the process of trying to remedy that and if you are sure that the rule you defined always fits, you can do what you want.

To achieve that, use the REVERSE() function twice to find the last occurrence of a substring. This way, you can separate everything starting at the last comma. Use the TRIM() function on that to remove surrounding whitespace that may occur, e.g. if there are spaces directly following the comma.

If you then are sure there are only numeric characters and spaces in the postcode and the postcode follows immediately, you can use PATINDEX together with LEFT() and RIGHT() to find the first non-numeric character, similar to this SO post.

Full solution: For the input data

address
Marco Polo street 8a, 44000 Vienna
Marco Polo street 8a, 44000 Vienna Old Town
Marco Polo street 8a,44000 Vienna
Marco Polo street 8a, Marco Polo street 8b, 44 000 Vienna Old Town
Marco Polo street 8a, 44 00 0 Vienna

the statement

with tmp as (
select TRIM(REVERSE(SUBSTRING(REVERSE(address),0,CHARINDEX(',',REVERSE(address))))) lastpart
  from addresstable
  ) 
  select LEFT(lastpart, PATINDEX('%[^0-9 ]%', lastpart) -1) as postcode,
         RIGHT(lastpart, len(lastpart) - PATINDEX('%[^0-9 ]%', lastpart)   1) as city
  from tmp

produces the result

postcode city
44000 Vienna
44000 Vienna Old Town
44000 Vienna
44 000 Vienna Old Town
44 00 0 Vienna

Notes:

  • if you don't want to allow spaces, remove them from the expression in PATINDEX(), i.e. '%[^0-9]%'
  • if you want your postcode to be of type int, use TRY_CONVERT() together with REPLACE() and NULLIF(). See this SO answer and dbfiddle below...

Alternatively, if you are sure that whatever string follows after the last comma until the next whitespace positively is the postcode and you want to allow alphabetic characters in the postcode, your attempt from before already does it. This will however break if your postcode has spaces.

with tmp as (
select TRIM(REVERSE(SUBSTRING(REVERSE(address),0,CHARINDEX(',',REVERSE(address))))) lastpart
  from addresstable
  ) 
  select SUBSTRING(lastpart,0,CHARINDEX(' ',lastpart,0) 1) as postcode,
         SUBSTRING(lastpart,CHARINDEX(' ',lastpart,0) 1,LEN(lastpart)) as city
  from tmp

See this db<>fiddle for a comparison of the two, including a few examples of addresses that will not work with one of the two.

CodePudding user response:

address= 'Marco Polo street 8a, 44000 Vienna' ;   Var 
 newadd=Split.address(,);  Var pincode =split.newadd( );   pincode[0]// is pincode 44000      pincode[1]// vienna
  • Related