I have the following table:
CREATE TABLE table_one( person varchar(55), date_value date, proj varchar(2), value int, time varchar(2 );
INSERT INTO table_one VALUES
('A1',2020-10-01'W',10,'T1')
('A1',2020-10-01'A',5,'T2')
('A1',2020-10-01'P',6,'T3')
('A1',2020-10-01'A',9,'T4')
('A1',2020-10-01'P',11,'T5')
('A1',2020-10-01'A',4,'T6')
('A1',2020-10-01'P',2,'T7')
('A1',2020-10-01'A',1,'T8')
('A1',2020-10-01'P',10,'T9')
('A1',2020-10-01'A',8,'T10')
I want an SQL query which creates a new column 'new_value'. The following are the conditions to fill that new column:
Case-1 When proj = A and next row proj = P , then take value of Proj=a in new_value column corresponding to proj=p.
For example, for row 2 the proj value is A and row 3 proj value is P. So the new column value correspoding to row-3 should be 5.
Case-2 When last row = A and the first row is W, then allocate value of last row in new_value column.
For example, row-10 has proj value A and row-1 has proj value W. So the new_value column correspoding to row-1 should be 8.
CASE-3 New_val should be NULL when proj = A.
Refer to the following picture for visual help
the above query needs to be aggregated at person,date_value column.
CodePudding user response:
SELECT person,date_value,proj,value,
lag(proj) ignore nulls over (PARTITION BY person order by row_num) last_proj,
lag(value) ignore nulls over (PARTITION BY person order by row_num) last_proj_value,
last_value(proj) ignore nulls over (PARTITION BY person order by row_num) person_last_proj,
first_value(proj) ignore nulls over (PARTITION BY person order by row_num) person_first_proj,
last_value(value) ignore nulls over (PARTITION BY person order by row_num) person_last_proj_value,
CASE
WHEN proj='P' AND last_proj = 'A' THEN last_proj_value --Case_1
WHEN person_last_proj = 'A' AND person_first_proj = 'W' AND row_num = 1 THEN person_last_proj_value --case_2
WHEN proj='A' THEN NULL --CASE_3
ELSE NULL
END AS New_value
FROM
(SELECT person,date_value,proj,value,row_number() over (partition by person order by date_value) as row_num
FROM table_one);
CodePudding user response:
I changed the sample data TIME column from varchar to int. The reason is that strings won't sort right. In string comparisons, '10' < '9'.
CREATE TABLE TABLE_ONE( person varchar(55), date_value date, proj varchar(2), value int, time int);
INSERT INTO TABLE_ONE VALUES
('A1','2020-10-01','W',10,1),
('A1','2020-10-01','A',5,2),
('A1','2020-10-01','P',6,3),
('A1','2020-10-01','A',9,4),
('A1','2020-10-01','P',11,5),
('A1','2020-10-01','A',4,6),
('A1','2020-10-01','P',2,7),
('A1','2020-10-01','A',1,8),
('A1','2020-10-01','P',10,9),
('A1','2020-10-01','A',8,10)
;
select *,
case
when PROJ = 'P' and lag(PROJ) over (partition by PERSON order by DATE_VALUE, TIME) = 'A' then
lag(VALUE) over (partition by PERSON order by DATE_VALUE, TIME)
when row_number() over (partition by PERSON order by DATE_VALUE, TIME) = 1 and
last_value(PROJ) over (partition by PERSON order by DATE_VALUE, TIME) = 'A' then
last_value(VALUE) over (partition by PERSON order by DATE_VALUE, TIME)
end as NEW_VALUE
from TABLE_ONE order by DATE_VALUE, TIME;
PERSON | DATE_VALUE | PROJ | VALUE | TIME | NEW_VALUE |
---|---|---|---|---|---|
A1 | 2020-10-01 00:00:00 | W | 10 | 1 | 8 |
A1 | 2020-10-01 00:00:00 | A | 5 | 2 | null |
A1 | 2020-10-01 00:00:00 | P | 6 | 3 | 5 |
A1 | 2020-10-01 00:00:00 | A | 9 | 4 | null |
A1 | 2020-10-01 00:00:00 | P | 11 | 5 | 9 |
A1 | 2020-10-01 00:00:00 | A | 4 | 6 | null |
A1 | 2020-10-01 00:00:00 | P | 2 | 7 | 4 |
A1 | 2020-10-01 00:00:00 | A | 1 | 8 | null |
A1 | 2020-10-01 00:00:00 | P | 10 | 9 | 1 |
A1 | 2020-10-01 00:00:00 | A | 8 | 10 | null |