Pick lowest row in a group-CodePudding

There is 1 table wherein all these records are maintained for 1 employee. Due to one more column in this table which is not shown in this example the person has multiple rows for even 1 job , i.e. 2 rows for job J1 , 3 for Job J2 and again 2 rows for job J1 in order and same way for location. The person switches from job J1 to J2 and then again J1 and same way changes happens for location .How to extract the required rows whenever a switch happens as I have to extract the least of each row when job or location is changed. Please help me.

StDt              EdDt            Job       Location   Required_Rows
01-Jan-21      31-Jan-21        J1         L1          Y
01-Feb-21      30-Jun-21        J1         L1          N
01-Jul-21      30-Jul-21        J2         L1          Y
01-Aug-21      15-Aug-21        J2         L2          Y 
16-Aug-21      31-Aug-21        J2         L2          N
01-Sep-21      01-Nov-21        J1         L2          Y
02-Nov-21      31-Dec-21        J1         L1          Y

CodePudding user response：

Here's one way, assuming the table is named job_history, it has an additional column employee_id, you need the query to give you the answer for all employees in one pass, and your Oracle version is at least 12.1:

select *
from   job_history
match_recognize(
  partition by employee_id
  order     by stdt
  all rows per match
  pattern   ( y {- n* -} )
  define    n as job = prev(job) and location = prev(location)
);

All the query does is to "mark" ("classify") each row as either Y or N according to your definition, then the N rows are excluded from the output (the meaning of {- ... -} in the pattern clause).

If you need this for just one employee, add a where clause and remove the partition by clause in match_recognize since it's no longer needed.

CodePudding user response：

Almost a duplicate of your previous question:

From Oracle 12, you can use MATCH_RECOGNIZE:

SELECT *
FROM   table_name
MATCH_RECOGNIZE(
  ORDER BY stDt
  ALL ROWS PER MATCH
  PATTERN (same {- same* -})
  DEFINE
    same AS FIRST(job) = job AND FIRST(location) = location
)

and for earlier versions, you can use the LAG analytic function:

SELECT StDt,
       EdDt,
       Job,
       location
FROM   (
  SELECT t.*,
         LAG(job) OVER (ORDER BY StDt) AS prev_job,
         LAG(location) OVER (ORDER BY StDt) AS prev_location
  FROM   table_name t
)
WHERE  prev_job IS NULL
OR     prev_job != job
OR     prev_location IS NULL
OR     prev_location != location;

Which, for the sample data:

CREATE TABLE table_name (StDt, EdDt, Job, Location) AS
SELECT DATE '2021-01-01', DATE '2021-01-31', 'J1', 'L1' FROM DUAL UNION ALL
SELECT DATE '2021-02-01', DATE '2021-06-30', 'J1', 'L1' FROM DUAL UNION ALL
SELECT DATE '2021-07-01', DATE '2021-07-31', 'J2', 'L1' FROM DUAL UNION ALL
SELECT DATE '2021-08-01', DATE '2021-08-15', 'J2', 'L2' FROM DUAL UNION ALL
SELECT DATE '2021-08-16', DATE '2021-08-31', 'J2', 'L2' FROM DUAL UNION ALL
SELECT DATE '2021-09-01', DATE '2021-11-01', 'J1', 'L2' FROM DUAL UNION ALL
SELECT DATE '2021-11-01', DATE '2021-12-31', 'J1', 'L1' FROM DUAL

Both output:

STDT EDDT JOB LOCATION

2021-01-01 00:00:00 2021-01-31 00:00:00 J1 L1

2021-07-01 00:00:00 2021-07-31 00:00:00 J2 L1

2021-08-01 00:00:00 2021-08-15 00:00:00 J2 L2

2021-09-01 00:00:00 2021-11-01 00:00:00 J1 L2

2021-11-01 00:00:00 2021-12-31 00:00:00 J1 L1

STDT	EDDT	JOB	LOCATION
2021-01-01 00:00:00	2021-01-31 00:00:00	J1	L1
2021-07-01 00:00:00	2021-07-31 00:00:00	J2	L1
2021-08-01 00:00:00	2021-08-15 00:00:00	J2	L2
2021-09-01 00:00:00	2021-11-01 00:00:00	J1	L2
2021-11-01 00:00:00	2021-12-31 00:00:00	J1	L1

db<>fiddle here