The following query:
WITH CTE AS(
SELECT
PaymentDate,
GroupNumber
FROM DBO
GROUP BY PaymentDate, GroupNumber
)
SELECT
PaymentDate,
MONTH(PaymentDate) AS Payment_Month,
COUNT(DISTINCT GroupNumber) AS GP_CNT,
GroupNumber
INTO #Groupings
FROM CTE
GROUP BY PaymentDate, GroupNumber WITH ROLLUP
ORDER BY PaymentDate, GroupNumber
SELECT * FROM #Groupings
Gives the following output:
PaymentDate | Payment_month | GP_cnt | GroupNumber |
---|---|---|---|
2022-01-01 | 1 | 2 | 1111 |
2022-01-01 | 1 | 2 | 1112 |
2022-02-01 | 2 | 4 | 2221 |
2022-02-01 | 2 | 4 | 2222 |
2022-02-01 | 2 | 4 | 2223 |
2022-02-01 | 2 | 4 | 2224 |
2022-03-01 | 3 | 4 | 2221 |
2022-03-01 | 3 | 4 | 2222 |
2022-03-01 | 3 | 4 | 3333 |
2022-03-01 | 3 | 4 | 3334 |
I seek to find the list of Group numbers from the present month that are not present from the prior month, aka new groups that have been added to the present month. I am thinking about starting with the Lag function but I am not sure where to go. Would this solution be easier to solve in Python? Would it be better to generate a CSV of the output above and use pandas? Thank you!
Desired Result/Output:
| GroupNumber |
|-------------|
| 3333 |
| 3334 |
CodePudding user response:
You check your table on itself to look for groups not existing in previous months
SELECT *
FROM #Groupings AS G
WHERE NOT EXISTS (
SELECT 'X' X
FROM #Groupings AS GG
WHERE GG.GroupNumber = G.GroupNumber
AND GG.Payment_month < G.Payment_month
)