Home > Software design >  Adding Classifier via SQL based on Date / Category
Adding Classifier via SQL based on Date / Category

Time:04-03

I have a 3 column table (CompanyName, CompanyCategory, StartDate), that's a company name followed by a company category and a start date for the category per the sample below. Based on the category's start date by company, I'm working on labeling the category as either NEW or EXISTING under column STATUS which doesn't currently exist.

FIRST would be the first category by date within a company, with all future categories by date for that given company being labeled as NOT FIRST. I have 3 columns, looking to generate the 4th (Status) but am not clear how to leverage methods such as PARTITION effectively to classify the correct categories by date within a company.

CompanyName | CompanyCategory | StartDate | Status (Need to Generate)
---------------------------------------------------------------------
CompanyA      CategoryA         2020-01-01  First
CompanyA      CategoryB         2020-02-01  Not First
CompanyB      CategoryA         2020-01-01  First
CompanyB      CategoryB         2020-03-01  Not First
CompanyC      CategoryA         2020-04-01  First
CompanyC      CategoryB         2020-06-01  Not First
CompanyC      CategoryC         2020-10-01  Not First
CompanyD      CategoryA         2021-10-01  First
CompanyD      CategoryB         2021-11-01  Not First
CompanyD      CategoryC         2021-02-01  Not First

Tried the code below, but didn't come out correctly

(case when row_number() over (partition by table.companycategory order by o.startdate 
) = 1 then 'FIRST' else 'NOT FIRST' end) as STATUS

CodePudding user response:

I'm pretty sure you've just partitioned by the wrong column. This produces the results in the question:

SELECT CompanyName, CompanyCategory, StartDate, 
  Status = CASE (ROW_NUMBER() OVER 
    (PARTITION BY CompanyName ORDER BY StartDate)) 
    WHEN 1 THEN '' ELSE 'NOT ' END   'FIRST'
FROM dbo.SomeTable;

If you are trying to classify categories by age, please correct the desired results listed in the question to reflect that, but that would just change to:

SELECT CompanyName, CompanyCategory, StartDate, 
  Status = CASE (ROW_NUMBER() OVER 
    (PARTITION BY CompanyCategory ORDER BY StartDate)) 
    WHEN 1 THEN '' ELSE 'NOT ' END   'FIRST'
FROM dbo.SomeTable;
  • Related