I have a table t with:
PLACE | LOCATION | TIMESTAMP | ID | AMOUNT | IN | OUT |
---|---|---|---|---|---|---|
1 | 10 | 2020-10-01 | 1 | 100 | 10 | 0 |
1 | 10 | 2020-10-02 | 1 | 110 | 5 | -50 |
1 | 10 | 2020-10-03 | 1 | 75 | 0 | -100 |
1 | 10 | 2020-10-04 | 1 | -25 | 30 | 0 |
1 | 10 | 2020-10-05 | 1 | 5 | 0 | 0 |
1 | 10 | 2020-10-06 | 1 | 5 | 38 | -300 |
1 | 10 | 2020-10-07 | 1 | -257 | 0 | 0 |
1 | 10 | 2020-10-01 | 2 | 1 | 10 | 0 |
1 | 10 | 2020-10-02 | 2 | 11 | 0 | -12 |
1 | 10 | 2020-10-03 | 2 | -1 | 0 | -100 |
1 | 10 | 2020-10-04 | 2 | -101 | 0 | 0 |
2 | 20 | 2020-11-15 | 1 | 18 | 20 | 0 |
2 | 20 | 2020-11-16 | 1 | 38 | 0 | 0 |
2 | 20 | 2020-11-15 | 3 | -9 | 20 | -31 |
2 | 20 | 2020-11-16 | 3 | -20 | 0 | 0 |
So due to SAP legacy stuff some logistic data is mangled which may lead to negative inventory.
To check how severe the error is I need to count for each PLACE, LOCATION, ID
- the number of rows that have a positive AMOUNT AND which do not have a negative AMOUNT before
- the number of rows that have a negative AMOUNT AND any positive AMOUNT that has a negative AMOUNT anywhere before
As you can see in my table there are (for PLACE=1, LOCATION=10, ID=1) 3 rows with a positive AMOUNT without any negative AMOUNT before. But then there is a negative AMOUNT and some positive AMOUNTS afterwards --> those 4 rows should not be counted for COUNT_CORRECT but should count for COUNT_WRONG.
So in this example table my query should return:
PLACE | LOCATION | TOTAL | COUNT_CORRECT | COUNT_WRONG | RATIO |
---|---|---|---|---|---|
1 | 10 | 11 | 5 | 6 | 0.55 |
2 | 20 | 4 | 2 | 2 | 0.5 |
My code so far:
CREATE OR REPLACE TABLE t (
PLACE INT NOT NULL
, LOCATION INT NOT NULL
, TIMESTAMP DATE NOT NULL
, ID INT NOT NULL
, AMOUNT INT NOT NULL
, IN INT NOT NULL
, OUT INT NOT NULL
, PRIMARY KEY(PLACE, LOCATION, ID, TIMESTAMP)
);
INSERT INTO t
(PLACE, LOCATION, TIMESTAMP, ID, AMOUNT, IN, OUT)
VALUES
(1, 10, '2020-10-01', 1, 100, 10, 0)
, (1, 10, '2020-10-02', 1, 110, 5, -50)
, (1, 10, '2020-10-03', 1, 75, 0, -100)
, (1, 10, '2020-10-04', 1, -25, 30, 0)
, (1, 10, '2020-10-05', 1, 5, 0, 0)
, (1, 10, '2020-10-06', 1, 5, 38, 300)
, (1, 10, '2020-10-07', 1, -257, 0, 0)
, (1, 10, '2020-10-04', 2, 1, 10, 0)
, (1, 10, '2020-10-05', 2, 11, 0, -12)
, (1, 10, '2020-10-06', 2, -1, 0, -100)
, (1, 10, '2020-10-07', 2, -101, 0, 0)
, (2, 20, 456, '2020-11-15', 1, 18, 12, 0)
, (2, 20, 456, '2020-11-16', 1, 30, 0, 0)
, (2, 20, 456, '2020-11-15', 3, -9, 20, -31)
, (2, 20, 456, '2020-11-16', 3, -20, 0, 0)
;
Then
SELECT PLACE
, LOCATION
, SUM(CASE WHEN AMOUNT >= 0 THEN 1 ELSE 0 END) AS 'COUNT_CORRECT'
, SUM(CASE WHEN AMOUNT < 0 THEN 1 ELSE 0 END) AS 'COUNT_WRONG'
, ROUND((SUM(CASE WHEN AMOUNT < 0 THEN 1 ELSE 0 END) / COUNT(AMOUNT)) * 100, 2) AS 'ratio'
FROM t
GROUP BY PLACE, LOCATION
ORDER BY PLACE, LOCATION
;
But I don't know how I can filter for "AND which do not have a negative AMOUNT before". Any help appreciated.
CodePudding user response:
You can use this to calculate the number of correct entries before a negative entry
SELECT PLACE,LOCATION,MAX(COUNT_CORRECT) as NUM_CORRECT_ENTRIES
FROM
(SELECT PLACE
, LOCATION
, ROW_NUMBER() OVER(PARTITION BY PLACE,LOCATION ORDER BY DATE) AS
'COUNT_CORRECT'
FROM t
WHERE AMOUNT>=0)
;
CodePudding user response:
I'm not sure if I understand your question correctly, but the following gives you the number of rows before the first negative amount per (place, location) partition.
The subselect computes the row numbers of all rows with a negative amount. Then we can select the minimum of this as the first row with a negative amount.
SELECT
place,
location,
COUNT(*) - NVL(MIN(pos) - 1, COUNT(*)) AS COUNT_WRONG,
COUNT(*) - local.COUNT_WRONG AS COUNT_CORRECT,
ROUND(local.COUNT_WRONG / COUNT(*),2) AS RATIO
FROM
( SELECT
amount,
place,
location,
CASE
WHEN amount < 0
THEN ROW_NUMBER() over (
PARTITION BY
place,
location
ORDER BY
"TIMESTAMP")
ELSE NULL
END pos -- Row numbers of rows with negative amount, else NULL
FROM
t)
GROUP BY
place,
location;