How to group values belonging to SHIP_ID on the same row for SQL Server-CodePudding

I tried everything but still can't group values that belong to SHIP_ID on the same row.

How can I have all values belonging to the same SHIP_ID on the same row?

For example this is my original table:

DISTANCE_CODE	SHIP_ID	SHIP_COUNTRY	Unique_key
	1245_COD		54
		Tunisia	58
1C254			60
	1100_COD		82
		Oman	85
E9663			91
J9668			93
R9664			96
S9669			98
T9662			101
79663			106
E1661			108
	1245_CR		110
		Chile	115
99Z254			116
55X33			121

Expected results:

DISTANCE_CODE	SHIP_ID	SHIP_COUNTRY
1C254	1245_COD	Tunisia
E9663	1100_COD	Oman
J9668	1100_COD	Oman
R9664	1100_COD	Oman
S9669	1100_COD	Oman
T9662	1100_COD	Oman
79663	1100_COD	Oman
E1661	1100_COD	Oman
99Z254	1245_CR	Chile
55X33	1245_CR	Chile

Thank you in advance!

CodePudding user response：

I really hope that this dataset is being fixed, however here is some code that produces the output you are looking for. I do not know if your examples had the same number of columns as your dataset, but in my example the table in question was called Table_1

I assumed that UNIQUE_KEY was always increasing.

-- Start by obtaining all of the SHIP_IDs
WITH Ships AS (
    SELECT * 
    FROM Table_1 
    WHERE SHIP_ID IS NOT NULL

-- Compute the unique key of the next ship id for this set.
), NextShips AS (
    SELECT 
      Ships.UNIQUE_KEY as ShipKey, 
      MIN(NextShip.UNIQUE_KEY) as NextShipKey 
    FROM Ships
    LEFT JOIN Ships as NextShip ON NextShip.UNIQUE_KEY > Ships.UNIQUE_KEY
    GROUP BY Ships.UNIQUE_KEY

-- Due to the calculation above, the last ship id will always have NULL as the next ship id.
), NullFix AS (
    SELECT 
      ShipKey,
      CASE WHEN NextShipKey IS NULL THEN (SELECT TOP 1 UNIQUE_KEY FROM Table_1 ORDER BY UNIQUE_KEY DESC) ELSE NextShipKey END as NextShipKey
    FROM NextShips
)

-- Join the table onto itself, once for each column, being very mindful of which columns come from where.
SELECT 
  DistanceCodes.DISTANCE_CODE
, ShipId.SHIP_ID
, ShipCountries.SHIP_COUNTRY    
FROM NullFix 
JOIN Table_1 AS ShipId ON NullFix.ShipKey = ShipId.UNIQUE_KEY
JOIN Table_1 as DistanceCodes ON DistanceCodes.UNIQUE_KEY >= NullFix.ShipKey AND DistanceCodes.UNIQUE_KEY <= NullFix.NextShipKey AND DistanceCodes.DISTANCE_CODE IS NOT NULL
JOIN Table_1 as ShipCountries ON ShipCountries.UNIQUE_KEY >= NullFix.ShipKey AND ShipCountries.UNIQUE_KEY <= NullFix.NextShipKey AND ShipCountries.SHIP_COUNTRY IS NOT NULL

CodePudding user response：

Here is a possible solution using a stored procedure.

create procedure sp_seq
as

create table #shipping (
    [DISTANCE_CODE] [nvarchar](255) NULL,
    [SHIP_ID] [nvarchar](255) NULL,
    [SHIP_COUNTRY] [nvarchar](255) NULL,    
)

declare cur cursor for 

select [DISTANCE_CODE]
      ,[SHIP_ID]
      ,[SHIP_COUNTRY]
  from [dbo].[shipping] order by [Unique_key]

declare @DISTANCE_CODE varchar(255), @SHIP_ID varchar(255), @SHIP_COUNTRY varchar(255)

declare @distance varchar(255), @id varchar(255), @country varchar(255)

open cur
fetch next from cur into @DISTANCE_CODE,@SHIP_ID,@SHIP_COUNTRY

while @@FETCH_STATUS =0 
begin

    if (@SHIP_ID is not null and @SHIP_ID != @id) 
        select @distance=null,@id=null,@country=null

    if (@DISTANCE_CODE is not null) set @distance = @DISTANCE_CODE
    if (@SHIP_ID is not null) set @id = @SHIP_ID
    if (@SHIP_COUNTRY is not null) set @country = @SHIP_COUNTRY 
    
    if (@distance is not null and @id is not null and @country is not null)
        insert into #shipping(DISTANCE_CODE,SHIP_ID,SHIP_COUNTRY) 
        select @distance,@id,@country
            
    fetch next from cur into @DISTANCE_CODE,@SHIP_ID,@SHIP_COUNTRY
end
close cur
deallocate cur

select * from #shipping

go

Result:

exec sp_seq

/*
DISTANCE_CODE   SHIP_ID     SHIP_COUNTRY
1C254           1245_COD    Tunisia
E9663           1100_COD    Oman
J9668           1100_COD    Oman
R9664           1100_COD    Oman
S9669           1100_COD    Oman
T9662           1100_COD    Oman
79663           1100_COD    Oman
E1661           1100_COD    Oman
99Z254          1245_CR     Chile
55X33           1245_CR     Chile
*/

CodePudding user response：

Just thought I would share another solution.

WITH cte_set AS (
    -- Generate GroupID that increments each time SHIP_ID is populated
    SELECT s.DISTANCE_CODE, s.SHIP_ID, s.SHIP_COUNTRY
        , GroupID = COUNT(s.SHIP_ID) OVER (ORDER BY s.Unique_Key ROWS UNBOUNDED PRECEDING)
    FROM #ship s
), cte_ships AS (
    -- Return table of SHIP_ID SHIP_COUNTRY to GroupID mappings
    SELECT x.GroupID
        , SHIP_ID = MAX(x.SHIP_ID), SHIP_COUNTRY = MAX(x.SHIP_COUNTRY)
    FROM cte_set x
    GROUP BY x.GroupID
), cte_dc AS (
    -- Return table of DISTANCE_CODE to GroupID mappings
    SELECT x.DISTANCE_CODE, x.GroupID
    FROM cte_set x
    WHERE x.DISTANCE_CODE IS NOT NULL
)
SELECT d.DISTANCE_CODE, s.SHIP_ID, s.SHIP_COUNTRY
FROM cte_dc d
    JOIN cte_ships s ON s.GroupID = d.GroupID;

The concept here was that I first generated a unique "GroupID" based on whether the SHIP_ID was populated. Each time SHIP_ID was populated, the GroupID incremented.

Then from that dataset, I generated two new sets...one set contains the SHIP_ID, SHIP_COUNTRY, GroupID. The second set contains the DISTANCE_CODE, GroupID. And then I joined it all together.

I'd also like to note that it's fairly efficient as well, at least compared to other methods I tried. I ran this against a 2 million record table, and it took 19 seconds to output the results to a temp table.

Sample data

IF OBJECT_ID('tempdb..#ship','U') IS NOT NULL DROP TABLE #ship; --SELECT * FROM #ship
CREATE TABLE #ship (
    DISTANCE_CODE varchar(6) NULL,
    SHIP_ID varchar(8) NULL,
    SHIP_COUNTRY varchar(20) NULL,
    Unique_Key int NOT NULL PRIMARY KEY CLUSTERED,
)

INSERT INTO #ship (DISTANCE_CODE, SHIP_ID, SHIP_COUNTRY, Unique_Key)
VALUES (NULL,'1245_COD',NULL,54)
    ,  (NULL,NULL,'Tunisia',58)
    ,  ('1C254',NULL,NULL,60)
    ,  (NULL,'1100_COD',NULL,82)
    ,  (NULL,NULL,'Oman',85)
    ,  ('E9663',NULL,NULL,91)
    ,  ('J9668',NULL,NULL,93)
    ,  ('R9664',NULL,NULL,96)
    ,  ('S9669',NULL,NULL,98)
    ,  ('T9662',NULL,NULL,101)
    ,  ('79663',NULL,NULL,106)
    ,  ('E1661',NULL,NULL,108)
    ,  (NULL,'1245_CR',NULL,110)
    ,  (NULL,NULL,'Chile',115)
    ,  ('99Z254',NULL,NULL,116)
    ,  ('55X33',NULL,NULL,121);