Home > OS >  Remove part of a string after a character until a charcter and repeat until the string end
Remove part of a string after a character until a charcter and repeat until the string end

Time:05-12

I am trying to filter part of a string until it reaches a specific character and repeat until the end of string.

Data loooks like

095930
CF0010 EN
060983 PS
086588 GG;086326 GG
900010;
CF0002;;CF0018;

Output should be

095930
CF0010
060983
086588;086326
900010;
CF0002;;CF0018;

I tried the following:

SUBSTRING(column, 1, CHARINDEX(' ',column  ' ', 1)-1)

But this removes everything after the which is not what I need.

CodePudding user response:

Removing everything after and including a character in a value is quite simple, just use LEFT and CHARINDEX:

LEFT(DS.Item,CHARINDEX(' ',DS.Item   ' ')-1)

The real problem you have is a little more complex:

  1. You have a denormalised design, making this far more difficult.
  2. You want to retain the denormalised design in the results, making this difficult again
  3. You are using a version of SQL Server that is very close to end of support that has no in built support for string splitting and aggregation.

Saying that, unless you were using Azure SQL Database (or somehow had a copy of SQL Server 2022) I wouldn't suggest STRING_SPLIT, as it doesn't provide an ordinal position parameter. Though STRING_AGG would make things far easier; and you could use a JSON splitter too.

Instead I use DelimitedSplit8K_LEAD here, and then "Ye Olde FOR XML PATHe (and STUFF)" for the string aggregation. This gives this clunky solution:

SELECT *
INTO dbo.YourTable
FROM (VALUES('095930'),
            ('CF0010 EN'),
            ('060983 PS'),
            ('086588 GG;086326 GG'),
            ('900010;'),
            ('CF0002;;CF0018;'))V(YourString);

GO

SELECT YT.YourString,
       STUFF((SELECT ';'   LEFT(DS.Item,CHARINDEX(' ',DS.Item   ' ')-1)
              FROM dbo.DelimitedSplit8K_LEAD(YT.YourString,';') DS
              ORDER BY DS.ItemNumber
              FOR XML PATH(''),TYPE).value('(./text())[1]','varchar(8000)'),1,1,'') AS NewString
FROM dbo.YourTable YT; 
     

GO
DROP TABLE dbo.YourTable;

If you were using Azure SQL Database (or SQL Server 2022) then the answer would be much simpler:

SELECT YT.YourString,
       STRING_AGG(LEFT(SS.Value,CHARINDEX(' ',SS.Value   ' ')-1),';') WITHIN GROUP (ORDER BY SS.Ordinal) AS NewString
FROM dbo.YourTable YT
     CROSS APPLY STRING_SPLIT(YT.YourString,';',1) SS
GROUP BY YT.YourString; --Assuming YourString has a unique value

CodePudding user response:

EDIT: I made a mistake and answered this pertaining to SSIS and answered as if you were loading this data originally. I am leaving it though because SSIS is a complimentary tool of SQL Server.

To do what you ask...

Script Component:

var firstSplit = Row.[your Column].Split(';');

List<string> formatedCols = new List<string>();
foreach(var col in firstSplit)
    formatedCols.Add(col.Split(' ')[0]);

Row.[Output Column Name] = string.Join(";", formatedCols.ToArray());

I would recommend a child table though with a 1 to many relationship to parent.

  • Related