Home > OS >  SQL, during query create new column based on data conditions
SQL, during query create new column based on data conditions

Time:12-10

I've managed to cobble together a SQL query that works, using a combination of unions and joins of tables that gives me the interim results I need.

SELECT  n.study_id AS StudyId,
        n.practice_id AS PracticeId,
    n.FluVaxCode,
    n.Date,
    date(p.BaseStart / 1000, 'unixepoch') AS BaseStart, 
    date(p.BaseEnd / 1000, 'unixepoch') AS BaseEnd,
    date(p.OutcomeStart / 1000, 'unixepoch') AS OutcomeStart, 
    date(p.OutcomeEnd / 1000, 'unixepoch') AS OutcomeEnd,
    CASE WHEN Date BETWEEN BaseStart AND BaseEnd THEN 'Y' ELSE 'N' END AS BaseVax,
    CASE WHEN Date BETWEEN OutcomeStart AND OutcomeEnd THEN 'Y' ELSE 'N' END AS OutcomeVax
FROM toypractice p INNER JOIN
     (SELECT  t.study_id, t.practice_id,
     date(t.event_date / 1000, 'unixepoch') AS Date, 
     t.code_id AS FluVaxCode
     FROM toytherapy t
     WHERE t.code_id IN ('dher.', 'a6b1.', 'bk31.')
     UNION 
     SELECT  c.study_id, c.practice_id, 
     DATE(c.event_date / 1000, 'unixepoch') AS Date, 
     c.code_id AS FluVaxCode
     FROM toyclinical c
     WHERE c.code_id IN ('1383.', '229..', 'X77RW')
     ORDER BY FluVaxCode DESC
     ) n 
     ON p.practice_id = n.practice_id;

I end up with multiple columns including three as:

Date, BaseStart, BaseEnd

all of which are in YYYY-MM-DD format (I think). Is there a way, during the query I can tack some code on the end of the query such that it will create a new column in the outputs, such that if Date falls between BaseStart and BaseEnd, the value in the new column will be 'Y', otherwise value in new column will be 'N'? In this case I'm using sqlite to develop/test but it will eventually have to work in mssqlserver 2019. Thx. J

CodePudding user response:

Try adding the following to your current select list:

    , CASE WHEN n.Date BETWEEN date(p.BaseStart / 1000, 'unixepoch') AND date(p.BaseEnd / 1000, 'unixepoch') THEN 'a' ELSE 'b' END AS is_between

This can be done in different ways. But the above should work.

Here's another adjustment that should work. I've added the original event_date to the result of the n derived table, plus added a corresponding CASE expression to the outer SELECT list. The other CASE expressions probably need to be similarly adjusted.

SELECT n.study_id AS StudyId
     , n.practice_id AS PracticeId
     , n.FluVaxCode
     , n.Date
     , date(p.BaseStart / 1000, 'unixepoch') AS BaseStart
     , date(p.BaseEnd / 1000, 'unixepoch') AS BaseEnd
     , date(p.OutcomeStart / 1000, 'unixepoch') AS OutcomeStart
     , date(p.OutcomeEnd / 1000, 'unixepoch') AS OutcomeEnd
     , CASE WHEN Date BETWEEN BaseStart AND BaseEnd THEN 'Y' ELSE 'N' END AS BaseVax
     , CASE WHEN Date BETWEEN OutcomeStart AND OutcomeEnd THEN 'Y' ELSE 'N' END AS OutcomeVax
     , CASE WHEN n.event_date BETWEEN p.BaseStart AND p.BaseEnd THEN 'a' ELSE 'b' END AS is_between
  FROM toypractice p
  JOIN (
         SELECT t.study_id, t.practice_id
              , t.event_date
              , date(t.event_date / 1000, 'unixepoch') AS Date
              , t.code_id AS FluVaxCode
           FROM toytherapy t
          WHERE t.code_id IN ('dher.', 'a6b1.', 'bk31.')
          UNION 
         SELECT c.study_id, c.practice_id
              , c.event_date
              , DATE(c.event_date / 1000, 'unixepoch') AS Date
              , c.code_id AS FluVaxCode
           FROM toyclinical c
          WHERE c.code_id IN ('1383.', '229..', 'X77RW')
          ORDER BY FluVaxCode DESC
       ) n 
    ON p.practice_id = n.practice_id
;

The fiddle

Something like this:

Specific test case for sqlite, as a starting point.

  • cte - provides a couple of rows representing your current query result
  • cte2 - shows how we can convert the date (conv_date) to be compatible with the Base values
  • Final query expression - shows how to use the derived conv_date from cte2 to calculate BETWEEN properly.

Notice, I left the attempt to use date directly, which produces the wrong is_between result. is_between2 shows the correct result.

WITH cte (date, BaseStart, BaseEnd) AS (
         SELECT 1467241200000, '2016-06-20', '2016-06-30' UNION
         SELECT 1467241200000, '2017-06-20', '2017-06-30'
     )
SELECT date(date / 1000, 'unixepoch'), BaseStart, BaseEnd
     , CASE WHEN Date BETWEEN BaseStart AND BaseEnd THEN 'a' ELSE 'b' END AS is_between
     , CASE WHEN date(date / 1000, 'unixepoch') BETWEEN BaseStart AND BaseEnd THEN 'a' ELSE 'b' END AS is_between2
  FROM cte
;

and also:

WITH cte (date, BaseStart, BaseEnd) AS (
         SELECT 1467241200000, '2016-06-20', '2016-06-30' UNION
         SELECT 1467241200000, '2017-06-20', '2017-06-30'
     )
   , cte2 AS (
         SELECT *
              , date(date / 1000, 'unixepoch') AS conv_date
           FROM cte
     )
SELECT date(date / 1000, 'unixepoch'), BaseStart, BaseEnd
     , CASE WHEN Date      BETWEEN BaseStart AND BaseEnd THEN 'a' ELSE 'b' END AS is_between
     , CASE WHEN conv_date BETWEEN BaseStart AND BaseEnd THEN 'a' ELSE 'b' END AS is_between2
  FROM cte2
;

Result:

date(date / 1000, 'unixepoch') BaseStart BaseEnd is_between is_between2
2016-06-29 2016-06-20 2016-06-30 b a
2016-06-29 2017-06-20 2017-06-30 b b

General answer:

SELECT Date, BaseStart, BaseEnd
     , CASE WHEN Date BETWEEN BaseStart AND BaseEnd THEN 'a' ELSE 'b' END AS is_between
  FROM ( your SQL ) AS derived_table
;

or

WITH cte AS (
       your sql
     )
SELECT Date, BaseStart, BaseEnd
     , CASE WHEN Date BETWEEN BaseStart AND BaseEnd THEN 'a' ELSE 'b' END AS is_between
  FROM cte
;

You may need to add quoting if you chose identifiers which conflict with reserved or keywords in the SQL language supported by your database.

  • Related