I have data in my table which looks like the below :
INPUT :
version value code type
PMS 0.00 01 HOURS
000 312.00 01 HOURS
000 0.00 04 HOURS
PMS 0.00 01 NON STOCK
000 835.00 01 NON STOCK
000 835.00 04 NON STOCK
- step 1 : sort it by code, which is 01 and 04.
- step 2 : sort it by type, which is HOURS and NON STOCK.
- step 3 : Retrieve by version, if 0.00 then retrieve the record.
Explanation for the output :
For code = 01, we have 4 records, 2 records for the type = HOURS and 2 records for type = NON STOCK. but choose the record with Version not equal to PMS(due to duplicates in column type). hence we get two outputs with code 01.
000 312.00 01 HOURS
000 835.00 01 NON STOCK
For code = 04, we have 2 records, 1 record for the type = HOURS and 1 record for type = NON STOCK. since we don't have duplicates for column type here, need not compare version in this case. hence we get two outputs with code 01
000 0.00 04 HOURS
000 835.00 04 NON STOCK
(The database Engine is Azure databricks)
EXPECTED OUTPUT :
version value code type
000 312.00 01 HOURS
000 835.00 01 NON STOCK
000 0.00 04 HOURS
000 835.00 04 NON STOCK
CodePudding user response:
Assuming that if there is no duplicate rows you need to pull the record even if the version is PMS. Try this:
select
*
from(
select
a.*,
case when b.code is not null and version <>'PMS' then 1
when b.code is null then 1
else 0 end as filter_val
from
input a
left outer join
(Select
code,
type
from input
group by
code,
type having count(*) > 1) b
on a.code=b.code
and a.type=b.type) x
where filter_val=1;
CodePudding user response:
You can do:
select
version, value, code, type
from (
select *,
row_number() over(partition by code, type order by version) as rn
from t
) x
where rn = 1
order by code, type