The table data looks like the below :
INPUT :
version value code type year
PMS 0.00 01 HOURS 2006
000 312.00 01 HOURS 2006
000 0.00 04 HOURS 2006
PMS 0.00 01 NON STOCK 2006
000 835.00 01 NON STOCK 2006
000 835.00 04 NON STOCK 2006
000 0.00 04 HOURS 2007
I need to split the table based on code(01 and 04) and get the corresponding data then join both results. so my output should look like below
EXPECTED OUTPUT :
version value code type year version value code type year
PMS 0.00 01 HOURS 2006 000 0.00 04 HOURS 2006
000 312.00 01 HOURS 2006 000 835.00 04 NON STOCK 2006
PMS 0.00 01 NON STOCK 2006 000 0.00 04 HOURS 2007
000 835.00 01 NON STOCK 2006 null null null null null
I tried row_num method by following this post How join two tables using SQL without a common column
but my output was like this :
version value code type year version value code type year
PMS 0.00 01 HOURS 2006 000 0.00 04 HOURS 2006
000 312.00 01 HOURS 2006 000 835.00 04 NON STOCK 2006
I'm using Azure databricks(SQL-Server)
Thanks in advance !!
CodePudding user response:
I'm using SQL Server so I think this applies to the data-brick implementation.
Basically I have created 2 derived tables (I wrote 2 queries to give the row_number value per code) and then did a left join on those to give the result that matches your sample
declare @input table (version nvarchar(max), value nvarchar(max), code nvarchar(max), type nvarchar(max), year int);
insert @input (version, value, code, type, year)
values
('PMS', '0.00', '01', 'HOURS', 2006),
('000', '312.00', '01', 'HOURS', 2006),
('000', '0.00', '04', 'HOURS', 2006),
('PMS', '0.00', '01', 'NON STOCK', 2006),
('000', '835.00', '01', 'NON STOCK', 2006),
('000', '835.00', '04', 'NON STOCK', 2006),
('000', '0.00', '04', 'HOURS', 2007)
select *
from @input; /*original data*/
select *
from /*desired output*/
(
select ROW_NUMBER() over(order by code) as id, *
from @input
where code = '01'
) a
left join
(
select ROW_NUMBER() over(order by code) as id, *
from @input
where code = '04'
) b on a.id = b.id;