I have a table with the following columns
Car | User | Location | Time | Type
The type can be
'OUT' if the user's request to rent the car is accepted
'IN' when the user stops using the car and registers it back into the system
'DENIED' when the access to rent the car is denied
So the order of the records regarding to the type should be OUT-IN (time ascending)
There are inconsistent data where there might be several rows with the same car,user,location and type values following each other like below
Car | User | Location | Time | Type
-----------------------------------------------------------------
1 | Ben | Chicago | 2022-02-12 03:12:45 | OUT //should be deleted
-----------------------------------------------------------------
1 | Ben | Chicago | 2022-02-12 04:12:45 | OUT //should be deleted
-----------------------------------------------------------------
2 | Sam | New York | 2022-02-12 04:42:45 | OUT
-----------------------------------------------------------------
1 | Ben | Chicago | 2022-02-12 04:50:00 | OUT //should be kept
-----------------------------------------------------------------
1 | Sam | New York | 2022-02-12 07:32:12 | IN
-----------------------------------------------------------------
1 | Ben | Chicago | 2022-02-12 08:18:45 | IN //should be deleted
-----------------------------------------------------------------
3 | Mia | San Franc | 2022-02-12 09:12:43 | OUT
-----------------------------------------------------------------
1 | Ben | Chicago | 2022-02-12 09:27:23 | IN //should be kept
So only the last OUT and IN records should be kept and the other duplicates should be removed.
To make it easier for help
CREATE TABLE rent_logs (
car varchar(30),
user varchar(30),
location varchar(30),
time datetime
type varchar(10)
);
INSERT INTO rent_logs ( car, username, location, time, type ) VALUES
( 1, 'Ben', 'Chicago','2022-02-12 03:12:45', 'OUT' ), ( 1, 'Ben', 'Chicago',' 2022-02-12 04:12:45 ', 'OUT' ), ( 2, 'Sam', 'New york','2022-02-12 04:42:45', 'OUT' ), ( 1, 'Ben', 'Chicago','2022-02-12 04:50:00', 'OUT' ), ( 2, 'Sam', 'New york','2022-02-12 07:32:12 ', 'IN' ), ( 1, 'Ben', 'Chicago','22022-02-12 08:18:45', 'IN' ), ( 3, 'Mia', 'Chicago','2022-02-12 09:12:43', 'OUT' ), ( 1, 'Ben', 'Chicago','2022-02-12 09:27:23', 'IN' )
CodePudding user response:
You can use a pair of window functions on the Type
field:
These window functions will apply according to specific partitions involving the following fields in this order: User
, Car
, Location
.
Once you get these values, you can retrieve the rows when one of the following conditions on the field Type
is met:
Type = OUT
andTypeNext = IN
(lastOUT
)Type = IN
andTypePrev = IN
and notTypeNext = IN
(lastIN
)Type = DENIED
If you have more complex conditions on near rows, you can handle them through these three computed values.
WITH cte AS (
SELECT Car,
User,
Location,
Time,
LAG(Type) OVER(
PARTITION BY User, Car, Location
ORDER BY Time) AS TypePrev,
Type,
LEAD(Type) OVER(
PARTITION BY User, Car, Location
ORDER BY Time) AS TypeNext
FROM rent_logs
)
SELECT Car,
User,
Location,
Time,
Type
FROM cte
WHERE (Type = 'OUT' AND TypeNext = 'IN')
OR (Type = 'IN' AND TypePrev = 'IN' AND (TypeNext = 'OUT' OR TypeNext IS NULL))
OR (TypePrev IS NULL AND TypeNext IS NULL)
OR (Type = 'DENIED')
ORDER BY User,
Time
Check the SQL Fiddle here.
If you want to play a bit to understand how it works and the intermediate result in the fiddle link, I'd suggest to remove comments from the commented lines and to comment the conditions in the WHERE
clause.