I have a problem finding the maximum value by comparing consecutive values.
id | date | record
---|---|---
1 | 2022-01-01 | 100
2 | 2022-01-02 | 10
3 | 2022-01-03 | 5
4 | 2022-01-04 | 20
5 | 2022-01-06 | 22
6 | 2022-01-10 | 30
and so on, for about 1000 records.
Ordering by "date", each row's record should be compared with the next row's record, and the larger of the pair returned: first with second, second with third, and so on.
So the final result should be:
100, 10, 20, 22, 30, ...
Please help with an SQL query.
CodePudding user response:
I'm expressing the below in standard SQL, as it would work in any relational database. I'm not sure whether Spark's LEAD() function needs the OVER() clause or not; standard SQL does.
```sql
WITH
-- your input, names changed to avoid reserved words
indata(id,dt,rec) AS (
            SELECT 1,DATE '2022-01-01',100
  UNION ALL SELECT 2,DATE '2022-01-02',10
  UNION ALL SELECT 3,DATE '2022-01-03',5
  UNION ALL SELECT 4,DATE '2022-01-04',20
  UNION ALL SELECT 5,DATE '2022-01-06',22
  UNION ALL SELECT 6,DATE '2022-01-10',30
)
-- end of input, real query starts here ...
SELECT
  *
, LEAD(rec) OVER(ORDER BY dt) AS nextrec
, GREATEST(LEAD(rec) OVER(ORDER BY dt),rec) AS biggerofpair
FROM indata;
-- out  id |     dt     | rec | nextrec | biggerofpair
-- out ----+------------+-----+---------+--------------
-- out   1 | 2022-01-01 | 100 |      10 |          100
-- out   2 | 2022-01-02 |  10 |       5 |           10
-- out   3 | 2022-01-03 |   5 |      20 |           20
-- out   4 | 2022-01-04 |  20 |      22 |           22
-- out   5 | 2022-01-06 |  22 |      30 |           30
-- out   6 | 2022-01-10 |  30 |         |
```
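If the goal is to end up with just the list 100, 10, 20, 22, 30, a small variation (my sketch building on the query above, not part of the original answer) is to compute LEAD() once in a CTE and filter out the final row, whose LEAD() is NULL:

```sql
WITH
indata(id,dt,rec) AS (
            SELECT 1,DATE '2022-01-01',100
  UNION ALL SELECT 2,DATE '2022-01-02',10
  UNION ALL SELECT 3,DATE '2022-01-03',5
  UNION ALL SELECT 4,DATE '2022-01-04',20
  UNION ALL SELECT 5,DATE '2022-01-06',22
  UNION ALL SELECT 6,DATE '2022-01-10',30
)
, pairs AS (
  -- attach each row's successor record, ordered by date
  SELECT
    dt
  , rec
  , LEAD(rec) OVER(ORDER BY dt) AS nextrec
  FROM indata
)
SELECT GREATEST(rec, nextrec) AS biggerofpair
FROM pairs
WHERE nextrec IS NOT NULL  -- the last row has no successor, so it is dropped
ORDER BY dt;
-- out  biggerofpair
-- out --------------
-- out           100
-- out            10
-- out            20
-- out            22
-- out            30
```

Note that GREATEST() is not part of the SQL standard, but it is widely supported (PostgreSQL, MySQL, Oracle, and Spark SQL all provide it).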