Home > Software design >  Separate a field based on character position
Separate a field based on character position

Time:10-28

I have a field called Name that I am breaking out into first, last, and middle initial. My query is separating it into first and last, but first name contains the middle initial. Some names have a middle initial, some have it and a period, some have neither, some have an entire word, & others are the name of a group where this would not apply. Here's my query.

SELECT Name,
   CASE WHEN Name LIKE '%,%' THEN REPLACE(SUBSTRING(Name, 1, (CHARINDEX(',', Name))), ',', '') ELSE Name END AS LastName,
   CASE WHEN Name LIKE '%,%' THEN SUBSTRING(Name, (CHARINDEX(',', Name)   2), LEN(Name)) ELSE Name END AS FirstName
FROM Customer

How could I:

  1. change my substring on the first name column to only go up to the space in that field
  2. only put the first letter of the midddle name into MI field

Current result

enter image description here

Desired result

enter image description here

CodePudding user response:

OK I'm sorry, this is a real hack job.

My usual approach for these is to do the processing in a temp table - you update rows that fit certain criteria using one method; then use another method if the first didn't catch them; and repeat until you have none left (or only manually modifiable ones).

For this case, I have done the first step as you in a similar fashion as you did in your question (though I used LEFT and STUFF to find the name components - I find they have fewer issues than LEN and sometimes SUBSTRING).

However, I used those results in a CTE as input into the next step of the processing: if the first name had a space, take the LEFT component of that as the first name, and then LEFT(1) of the remaining part to get the initial.

In this case, the CTE is operating similar to what I would have done with a temp table.

Here is a db<>fiddle with data.

Note that there is some handling of extra spaces here (with LTRIM and RTRIM) but it's nowhere near perfect. In my data below I also added extra lines for Cat, Sylvester The with spacing changes as examples.

CREATE TABLE #Names (FullName nvarchar(100));
INSERT INTO #Names (FullName) VALUES
('Duck, Daffy'),
('Bunny, Bugs F.'),
('Cartoon Network'),
('Doo, Scooby D'),
('Cat, Sylvester The'),
('Cat,Sylvester The'),
('Cat,Sylvester     The');

WITH NameSplit AS
        (SELECT     FullName,
                    LTRIM(RTRIM(CASE WHEN FullName LIKE '%,%' THEN LEFT(FullName, CHARINDEX(',',FullName)-1) ELSE FullName END)) AS LastName,
                    LTRIM(RTRIM(CASE WHEN FullName LIKE '%,%' THEN STUFF(FullName, 1, CHARINDEX(',',FullName), '') ELSE FullName END)) AS NameRest
            FROM    #Names
        )
    SELECT      FullName,
                LastName,
                CASE WHEN FullName LIKE '%,%' AND NameRest LIKE '% %' THEN LEFT(NameRest, CHARINDEX(' ', NameRest)-1) ELSE NameRest END AS FirstName,
                CASE WHEN FullName LIKE '%,%' AND NameRest LIKE '% %' THEN LEFT(LTRIM(STUFF(NameRest, 1, CHARINDEX(' ', NameRest), '')), 1) ELSE '' END AS MiddleInitial
    FROM        NameSplit

Results

FullName               LastName         FirstName        MiddleInitial
Duck, Daffy            Duck             Daffy    
Bunny, Bugs F.         Bunny            Bugs             F
Cartoon Network        Cartoon Network  Cartoon Network 
Doo, Scooby D          Doo              Scooby           D
Cat, Sylvester The     Cat              Sylvester        T
Cat,Sylvester The      Cat              Sylvester        T
Cat,Sylvester     The  Cat              Sylvester        T

CodePudding user response:

My go to approach for these types of clean up is to use union all and compartmentalize the parsing logic for individual select by utilizing the where clause. This approach isn't too different than the answer you accepted but I am leaving it in there in case you find it helpful

with cte as

(select *, 
       trim(left(fullname,charindex(',',fullname ',')-1)) as fname,
       trim(stuff(fullname,1,charindex(',',fullname),'')) as lname
 from your_table)

select *, left(trim(stuff(lname,1,charindex(' ',lname),'')),1) as mid_init  
from cte
where fullname like'%,%' and lname like '% %' --when it looks like we have a middle name
union all
select *,null
from cte  
where fullname not like'%,%' or lname not like '% %' -- no middle name
  • Related