Home > Enterprise >  SQL date Filter | Model Performace
SQL date Filter | Model Performace

Time:04-24

We have to split the data into train and test sets and calculate the True positive rate metric in SQL. We have the data until ID 1000, Below is the sample table _score

ID _date prediction actual
1 2020-02-01 0.81 1
2 2020-05-04 0.22 0
3 2020-01-18 0.84 1
4 2020-07-11 0.92 1
5 2020-12-31 0.44 0
6 2020-06-02 0.71 1
7 2020-03-02 0.11 0
1000 2020-11-22 0.61 0

Question:

  • let _date before 2020-11-01 as our training period, and the date after that as the test period. Calculate the positive rate in these two date range

Formula: positive rate = number of positive / (number of positive number of negative)

Positive = 1 and negative = 0

Please help with calculating the TPR rate in a single query, I am confused about filtering these records in train and test set.

Expected Output:

ID label Positive Rate
1 Training Data X
2 Test Data Y

We need the Positive rate column for both train and test set data in a single column, which we have to split upon the date value as mentioned above. We will have only 2 rows, one for training set and other for the test set.

Train data : Date is before 2020-11-01 Test data : Date after 2020-11-01

  • Request to comment if any edits or clarifying questions is required.

CodePudding user response:

You Can do it like this fist you create a subselect, that sorts the data for the id and label and count the positives and negartives.

Then you use the sunbquery to get your wanted data

SELECT `ID`,`Label`, SUM(positive) * 1.0/ (SUM(positive) SUM(negative))
FROM
(SELECT 
    IF(`_date` <= '2020-11-01',1,2) AS 'ID',
    IF(`_date` <= '2020-11-01','Training Data','Test Data') AS 'Label',
    IF(`actual` = 1,1,0) as positive ,
    IF(`actual` = 0,1,0) as negative 
FROM predictions) pred
GROUP BY `ID`,`Label`
ID | Label         | SUM(positive) * 1.0/ (SUM(positive) SUM(negative))
-: | :------------ | -------------------------------------------------:
 1 | Training Data |                                            0.66667
 2 | Test Data     |                                            0.00000

db<>fiddle here

  • Related