How do I filter SQL Query to get the required rows

Time:09-01

Hey, this is my first Stack Overflow question, so if I have not asked it in the best way, please let me know and I can add more info.

Context:

I have a SQL table with four columns:

ACC_ID, BLOCK_CATEGORY, START_DATE, VALUE

The column types are, respectively: (Number), (Number), (Date), (Number)

The table records every change made to an account as a new row, and ACC_ID is the unique identifier of the account. START_DATE is the date the change was made. For this question the VALUE column can be ignored.

I need a query that shows when each account changed to the BLOCK_CATEGORY it is currently on. The difficulty is that BLOCK_CATEGORY is a number from 1 to 8, and an account may have been on the same BLOCK_CATEGORY before, but we need to know when it changed to it this time.

Here is some sample data to illustrate (dates are in DD/MM/YYYY format):

ACC_ID  BLOCK_CATEGORY  START_DATE  VALUE
1001    7               14/08/2022  5
1001    2               16/08/2022  5
1001    7               17/08/2022  10
1001    7               19/08/2022  10
1002    4               14/08/2022  3
1002    3               15/08/2022  3
1002    3               17/08/2022  9
1003    1               14/08/2022  10
1003    1               17/08/2022  13
1004    3               14/08/2022  2
1005    7               14/08/2022  11
1005    2               16/08/2022  34
1005    3               19/08/2022  1
1005    7               21/08/2022  12

The desired end result is:

ACC_ID  BLOCK_CATEGORY  START_DATE
1001    7               17/08/2022
1002    3               15/08/2022
1003    1               14/08/2022
1004    3               14/08/2022
1005    7               21/08/2022

I hope the example above makes the requirement clear. Please ask if you have any questions.

The current query I am using gives me the below incorrect result:

ACC_ID  BLOCK_CATEGORY  START_DATE
1001    7               14/08/2022
1002    3               15/08/2022
1003    1               14/08/2022
1004    3               14/08/2022
1005    7               14/08/2022

Here is the query I am using. How can I write a query that gives the desired result, that is, the date each account changed to its current BLOCK_CATEGORY?

SELECT *
FROM (
    SELECT ACC_ID,
           BLOCK_CATEGORY,
           START_DATE,
           ROW_NUMBER() OVER (PARTITION BY acc_id order by start_date DESC) RowNum
    FROM (
        SELECT ACC_ID,
               BLOCK_CATEGORY,
               MIN(START_DATE) 'START_DATE'
        FROM [dbo].[ACCOUNTCHANGES]
        WHERE BLOCK_CATEGORY IS NOT NULL
        GROUP BY ACC_ID, BLOCK_CATEGORY
    ) A
) B
WHERE B.RowNum = 1

CodePudding user response:

Just looking at your required data, the following produces your desired results and should work on most RDBMSs (I'm assuming SQL Server, though). Does this work for you?

Note that I omitted VALUE as it's not present in your desired results.

with bc as (
  select distinct ACC_ID, 
    First_Value(BLOCK_CATEGORY) over(partition by ACC_ID order by START_DATE desc) bc, 
    Dense_Rank() over(partition by ACC_ID order by BLOCK_CATEGORY)
      + Dense_Rank() over(partition by ACC_ID order by BLOCK_CATEGORY desc) - 1 cnt /* count of distinct categories */
  from t
)
select t.ACC_ID, t.BLOCK_CATEGORY, Min(t.START_DATE)
from bc
join t on t.ACC_ID = bc.ACC_ID and t.BLOCK_CATEGORY = bc.bc
where bc.cnt = 1 
  or t.START_DATE >= (
    select Max(START_DATE) from t t2 
    where t2.ACC_ID = t.ACC_ID and t2.BLOCK_CATEGORY != t.BLOCK_CATEGORY
  ) 
group by t.ACC_ID, t.BLOCK_CATEGORY
order by ACC_ID;

Working Demo Fiddle
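
The cnt column above relies on a trick for counting distinct values per partition: for any row, the ascending dense rank plus the descending dense rank, minus one, equals the number of distinct values in that partition. It is useful because SQL Server does not accept COUNT(DISTINCT ...) as a window function. Below is a minimal sketch that checks the trick on the question's sample data. It assumes Python with a bundled SQLite of version 3.25 or newer (needed for window functions) rather than SQL Server, and the cut-down two-column table t is hypothetical.

```python
import sqlite3

# (ACC_ID, BLOCK_CATEGORY) pairs from the question's sample data.
ROWS = [
    (1001, 7), (1001, 2), (1001, 7), (1001, 7),
    (1002, 4), (1002, 3), (1002, 3),
    (1003, 1), (1003, 1),
    (1004, 3),
    (1005, 7), (1005, 2), (1005, 3), (1005, 7),
]

def load_sample():
    """Create an in-memory table t holding the sample pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (ACC_ID INTEGER, BLOCK_CATEGORY INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", ROWS)
    return conn

def distinct_counts_via_dense_rank(conn):
    """Distinct categories per account, using ascending + descending dense rank - 1."""
    sql = """
        SELECT DISTINCT ACC_ID,
               DENSE_RANK() OVER (PARTITION BY ACC_ID ORDER BY BLOCK_CATEGORY)
             + DENSE_RANK() OVER (PARTITION BY ACC_ID ORDER BY BLOCK_CATEGORY DESC)
             - 1 AS cnt
        FROM t
        ORDER BY ACC_ID
    """
    return conn.execute(sql).fetchall()

def distinct_counts_via_group_by(conn):
    """Reference result computed with a plain GROUP BY and COUNT(DISTINCT ...)."""
    sql = """
        SELECT ACC_ID, COUNT(DISTINCT BLOCK_CATEGORY) AS cnt
        FROM t
        GROUP BY ACC_ID
        ORDER BY ACC_ID
    """
    return conn.execute(sql).fetchall()

if __name__ == "__main__":
    conn = load_sample()
    for acc_id, cnt in distinct_counts_via_dense_rank(conn):
        print(acc_id, cnt)
```

Accounts with cnt = 1 have never changed category, which is why the main query treats them separately: for those accounts the subquery looking for rows with a different category returns NULL.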

CodePudding user response:

You may try the following:

With Create_Groups AS
(
 Select D.ACC_ID, D.BLOCK_CATEGORY, D.START_DATE, D.RN,
        SUM(D.g_edge) Over (Partition By ACC_ID Order By START_DATE) As GRP
 From
 (
  Select ACC_ID, BLOCK_CATEGORY, START_DATE,
         Case When LAG(BLOCK_CATEGORY, 1, BLOCK_CATEGORY) 
                   Over (Partition By ACC_ID Order By START_DATE) <> BLOCK_CATEGORY
              Then 1 Else 0 
         End As g_edge,
         ROW_NUMBER() Over (Partition By ACC_ID Order By START_DATE DESC) As RN
  From ACCOUNTCHANGES
 ) D
)
Select T.ACC_ID, T.BLOCK_CATEGORY, D.First_GRP_Date As START_DATE
From Create_Groups T
Join (Select ACC_ID, GRP, MIN(START_DATE) AS First_GRP_Date
      From Create_Groups
      Group By ACC_ID, GRP) D
On T.ACC_ID = D.ACC_ID And T.GRP = D.GRP
Where T.RN = 1
Order By T.ACC_ID

See a demo from db<>fiddle.

The idea is to define groups of consecutive rows with the same BLOCK_CATEGORY for each ACC_ID; this is done in the Create_Groups CTE. The query then finds the minimum date of each group and joins it to the latest row (RN = 1) of each ACC_ID, which gives the date the account entered its current BLOCK_CATEGORY.
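
To see the grouping step on the question's data, here is a minimal sketch that runs the same logic against an in-memory SQLite database from Python. The assumptions are mine, not part of the answer: SQLite 3.25 or newer (for window functions) stands in for SQL Server, dates are stored as ISO strings so they sort chronologically, and the helper names load_sample, group_rows and current_block_start are hypothetical.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings
# (an assumption of this sketch) so plain text ordering is chronological.
ROWS = [
    (1001, 7, "2022-08-14", 5), (1001, 2, "2022-08-16", 5),
    (1001, 7, "2022-08-17", 10), (1001, 7, "2022-08-19", 10),
    (1002, 4, "2022-08-14", 3), (1002, 3, "2022-08-15", 3),
    (1002, 3, "2022-08-17", 9),
    (1003, 1, "2022-08-14", 10), (1003, 1, "2022-08-17", 13),
    (1004, 3, "2022-08-14", 2),
    (1005, 7, "2022-08-14", 11), (1005, 2, "2022-08-16", 34),
    (1005, 3, "2022-08-19", 1), (1005, 7, "2022-08-21", 12),
]

# g_edge is 1 on every row whose category differs from the previous row of
# the same account; the running sum of g_edge numbers the consecutive runs.
CTE = """
WITH Create_Groups AS (
    SELECT D.ACC_ID, D.BLOCK_CATEGORY, D.START_DATE, D.RN,
           SUM(D.g_edge) OVER (PARTITION BY D.ACC_ID ORDER BY D.START_DATE) AS GRP
    FROM (
        SELECT ACC_ID, BLOCK_CATEGORY, START_DATE,
               CASE WHEN LAG(BLOCK_CATEGORY, 1, BLOCK_CATEGORY)
                         OVER (PARTITION BY ACC_ID ORDER BY START_DATE) <> BLOCK_CATEGORY
                    THEN 1 ELSE 0 END AS g_edge,
               ROW_NUMBER() OVER (PARTITION BY ACC_ID ORDER BY START_DATE DESC) AS RN
        FROM ACCOUNTCHANGES
    ) AS D
)
"""

def load_sample():
    """Create an in-memory ACCOUNTCHANGES table holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE ACCOUNTCHANGES '
        '(ACC_ID INTEGER, BLOCK_CATEGORY INTEGER, START_DATE TEXT, "VALUE" INTEGER)'
    )
    conn.executemany("INSERT INTO ACCOUNTCHANGES VALUES (?, ?, ?, ?)", ROWS)
    return conn

def group_rows(conn):
    """Intermediate view: every row together with its run number GRP."""
    sql = CTE + """
        SELECT ACC_ID, BLOCK_CATEGORY, START_DATE, GRP
        FROM Create_Groups
        ORDER BY ACC_ID, START_DATE
    """
    return conn.execute(sql).fetchall()

def current_block_start(conn):
    """Final result: first date of the run that contains each account's latest row."""
    sql = CTE + """
        SELECT T.ACC_ID, T.BLOCK_CATEGORY, G.First_GRP_Date AS START_DATE
        FROM Create_Groups AS T
        JOIN (SELECT ACC_ID, GRP, MIN(START_DATE) AS First_GRP_Date
              FROM Create_Groups
              GROUP BY ACC_ID, GRP) AS G
          ON T.ACC_ID = G.ACC_ID AND T.GRP = G.GRP
        WHERE T.RN = 1
        ORDER BY T.ACC_ID
    """
    return conn.execute(sql).fetchall()

if __name__ == "__main__":
    conn = load_sample()
    for row in group_rows(conn):
        print(row)
    for row in current_block_start(conn):
        print(row)
```

For account 1001 the four rows get GRP values 0, 1, 2, 2: the two separate stretches on category 7 land in different groups, which is what a plain GROUP BY ACC_ID, BLOCK_CATEGORY cannot distinguish.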

Tags: sql