Home > Back-end >  Using SQL - Select instances where the current row = 'X' and the previous Row = 'Y�
Using SQL - Select instances where the current row = 'X' and the previous Row = 'Y�

Time:08-19

Let's say we have a table 'Food_Request' that contains the following values:

Name Food Date
John Egg 7/1
Meg Egg 7/10
Sue Egg 7/5
John Salad 6/10
Meg Cheese 6/20
Sue Salad 6/25

I'm trying to create a query that returns the name that meet the following conditions:

Someone ordered a salad FIRST, then the immediate following order from the same person is an Egg.

I've been trying to use the lag function to select the previous row of the customer's name, but I don't think I'm getting the correct results. Here's what I have so far:

Select Name

from (select Name, Date, Food, case when Food = 'Egg' and lag(Food) over (order by name, date asc) = 'Salad' then 1 else 0 as Prev_Order from Food_Request) table1

where Prev_Order = 1

The desired result should be Sue and John.

Thank you!

CodePudding user response:

You can use a correlated query with exists

select name
from t
where food = 'salad'
and exists (
  select * from t t2 
  where t2.name = t.name 
    and t2.food = 'Egg' 
    and t2.date > t.date
);

CodePudding user response:

Here's your solution with LAG (which you mentioned, so I assumed you're using a RDBMS that has window function and the LAG() function in particular). I ran it in Snowflake and it returns Paul and Meg from my data sample (I took different names, sorry).

WITH food_requests AS (
  -- This is just for the data.
  -- you obviously don't need to do this because you already have the table.
  -- I need to create a sample to play with it.
  select * 
  from (values 
        ('2022-08-08 10:10:10', 'Meg', 'SALAD'), 
        ('2022-08-08 10:10:11', 'Meg', 'STEAK'), 
        ('2022-08-08 10:10:12', 'Meg', 'DESSERT'),
        ('2022-08-08 10:10:10', 'Paul', 'DESSERT'), 
        ('2022-08-08 10:10:11', 'Paul', 'SALAD'), 
        ('2022-08-08 10:10:12', 'Paul', 'STEAK'),
        ('2022-08-08 10:10:10', 'Ana', 'DESSERT'), 
        ('2022-08-08 10:10:11', 'Ana', 'DESSERT'), 
        ('2022-08-08 10:10:12', 'Ana', 'STEAK')
       ) as v1 (date,name,food)
), food_with_before_and_current AS (
  -- This is what you want to do.
  -- you create an extra column with the LAG window function.
  -- This column has the value of the previous food for the user.
  SELECT date, 
        name, 
        food, 
        lag(food) over (partition by name order by date) AS food_before
  FROM food_requests
)
-- You query food_with_before_and_current wich has the extra column
-- You filter against food and food_before. Voilà!
SELECT name
FROM food_with_before_and_current
WHERE food = 'STEAK' 
AND food_before = 'SALAD';

I took 'STEAK' AND 'SALAD' instead of eggs but it doesn't really matter. You see the concept.

  •  Tags:  
  • sql
  • Related