I have a table:
ID | col1 | col2 | col3 | col4 | col5 |
---|---|---|---|---|---|
1 | Y | G | CA | DA | EA |
2 | Y | G | CA | DA | EA |
3 | Y | G | CA | DA | EA |
1 | X | Q | RA | DA | EA |
2 | X | Q | CA | DA | EA |
For col1 and col2, I want to create 4 additional columns. Two for the distinct values for col1 (X,Y) and two for the distinct values for col2 (G,Q).
For example, I want to create a new column (col1_x) to display 'Y' if col1 have X. Basically for each ID, col3, col4 and col5 combination - I want to show if col1 have X , col1 have Y, col2 have G and col2 have Q. Each as an individual column. How do I do that? I think I'll need a pivot/group by as I would want to eliminate duplicated rows of data.
Desired output:
ID | col3 | col4 | col5 | col1_X | col1_Y | col2_G | col2_Q |
---|---|---|---|---|---|---|---|
1 | CA | DA | EA | Y | Y | Y | Y |
1 | RA | DA | EA | Y | Y | Y | Y |
2 | CA | DA | EA | Y | Y | Y | Y |
3 | CA | DA | EA | N | Y | Y | N |
CodePudding user response:
Test data
drop table if exists #test;
drop table if exists #result;
create table #test (
[ID] int,
[col1] varchar(1),
[col2] varchar(1),
[col3] varchar(2),
[col4] varchar(2),
[col5] varchar(2));
insert into #test ([ID], [col1], [col2], [col3], [col4], [col5])
values
(1, 'Y', 'G', 'CA', 'DA', 'EA'),
(2, 'Y', 'G', 'CA', 'DA', 'EA'),
(3, 'Y', 'Q', 'CA', 'DA', 'EA'),
(1, 'X', 'G', 'RA', 'DA', 'EA'),
(2, 'X', 'G', 'CA', 'DA', 'EA'),
(3, 'X', 'Q', 'CA', 'DA', 'EA');
1) Pivot solution
Here is a pivot approach - it is not purely dynamic as you have to provide "column names" in the select which are the values you are looking for (X/Y and G/Q) - it needs to be done twice, once for each column you are interested in (col1 and col2) so it ends up looking more complex than might be hoped for, but it is is really one simple pivot, done twice, then cleaned up per your desired output. It seems like a 'trick' to me, still - after all these years! More info on PIVOT here: Using PIVOT and UNPIVOT.
select
a.ID, a.col3, a.col4, a.col5,
case when cast(a.[X] as bit) = 1 then 'Y' else 'N' end as Col1_X,
case when cast(a.[Y] as bit) = 1 then 'Y' else 'N' end as col1_Y,
case when cast(b.[G] as bit) = 1 then 'Y' else 'N' end as col2_G,
case when cast(b.[Q] as bit) = 1 then 'Y' else 'N' end as col2_Q
from
(
select
ID, col3, col4, col5, [X], [Y]
from #test
pivot(
Count(col1) for [col1] in ([X], [Y])
) as tmp
) a
inner join
(
select
ID, col3, col4, col5, [G], [Q]
from #test
pivot(
Count(col1) for [col2] in ([G], [Q])
) as tmp
) b
on a.ID = b.ID
and a.col3 = b.col3
and a.col4 = b.col4
and a.col5 = b.col5
2) Naive solution
I created a "naive" solution that I think is not really so bad and has the benefit of being understandable to anyone with a little sql skill, whereas it is a little obscure how pivot is working in a case like this (where we are not "really" aggregating information such as sums, averages, or real counts).
-- create the output table
select
distinct ID, col3, col4, col5,
'N' as col1_X, 'N' as col1_Y, 'N' as col2_G, 'N' as col2_Q
into #result
from #Test;
-- update the results
update t1
set t1.col1_X = 'Y'
from #result t1
where exists (select * from #test t2
where t2.ID = t1.ID
and t2.col3 = t1.col3
and t2.col4 = t1.col4
and t2.col5 = t1.col5
and t2.col1 = 'X');
update t1
set t1.col1_Y = 'Y'
from #result t1
where exists (select * from #test t2
where t2.ID = t1.ID
and t2.col3 = t1.col3
and t2.col4 = t1.col4
and t2.col5 = t1.col5
and t2.col1 = 'Y');
update t1
set t1.col2_G = 'Y'
from #result t1
where exists (select * from #test t2
where t2.ID = t1.ID
and t2.col3 = t1.col3
and t2.col4 = t1.col4
and t2.col5 = t1.col5
and t2.col2 = 'G');
update t1
set t1.col2_Q = 'Y'
from #result t1
where exists (select * from #test t2
where t2.ID = t1.ID
and t2.col3 = t1.col3
and t2.col4 = t1.col4
and t2.col5 = t1.col5
and t2.col2 = 'Q');
select * from #result;
Result in both cases:
ID | col3 | col4 | col5 | col1_X | col1_Y | col2_G | col2_Q |
---|---|---|---|---|---|---|---|
1 | CA | DA | EA | N | Y | Y | N |
1 | RA | DA | EA | Y | N | Y | N |
2 | CA | DA | EA | Y | Y | Y | N |
3 | CA | DA | EA | Y | Y | N | Y |
CodePudding user response:
select col3, col4, col5,
max(col1_X),
max(col1_Y),
max(col2_G),
max(col2_Q)
from
(select col3, col4, col5,
case when col1 = 'X' then 'Y' else 'N' end as col1_X,
case when col1 = 'Y' then 'Y' else 'N' end as col1_Y,
case when col2 = 'G' then 'Y' else 'N' end as col2_G,
case when col2 = 'Q' then 'Y' else 'N' end as col1_Q
From table) query
Group by col3, col4, col5