Home > Back-end >  SQL Server join on subquery
SQL Server join on subquery

Time:09-16

I have two SQL tables : TableA and Table B

TableA

batch_identifier (varchar) batch_status (bit) dt_utc (datetime) batch_tasks
batch_id1 0 2022-09-15 10:00:00 1
batch_id1 1 2022-09-15 11:00:00 1
batch_id2 0 2022-09-15 10:30:00 4

TableB

batch_identifier (varchar) task_status(bit) task_description(varchar) task_identifier(varchar)
batch_id1 0 2022-09-15 10:00:00 task_id1
batch_id1 1 2022-09-15 10:50:00 task_id1
batch_id2 0 2022-09-15 10:30:00 task_id1
batch_id2 1 2022-09-15 10:35:00 task_id1
batch_id2 0 2022-09-15 10:36:00 task_id2
batch_id2 0 2022-09-15 10:37:00 task_id3

What I'm trying to do is the following:
Get a table of 'in progress' batches (i.e batch_status = 0) along with the tasks that have been performed. So for instance in the example above the only batch still running is batch_id2 and the associated completed task is task_id1.

So far I've tested this without success:

select
*
from
 (
    select
        batch_identifier,
        MAX(CAST(batch_status as int)) as _status
    from
        dbo.TableA
    group by
        batch_identifier
) data
where _status = 0 ??

Can you please help with that?

batch_identifier and task_identifier are SQL index.

CodePudding user response:

Maybe something like one of these?

SELECT batch_identifier, db_utc, Zero, One
FROM TableA
    LEFT JOIN (
        SELECT batch_identifier,
            SUM(CASE WHEN task_status = 0 THEN 1 ELSE 0 END) AS Zero,
            SUM(CASE WHEN task_status = 1 THEN 1 ELSE 0 END) AS One
        FROM TableB
        GROUP BY batch_identifier
    ) AS b
        ON a.batch_identifier = b.batch_identifier

SELECT *
FROM TableA a
WHERE NOT EXISTS (
    SELECT NULL FROM TableB b WHERE b.batch_identifier = a.batch_identifier AND b.task_status = 0
)

CodePudding user response:

If I'm understanding you correctly, I don't think you need a subquery.

Try:

SELECT batch_identifier, task_identifier, MAX(CAST(batch_status as int)) as _status
FROM dbo.TableA a
INNER JOIN dbo.TableB b ON a.batch_identifier = b.batch_identifier AND task_status = 1
WHERE batch_status = 0
GROUP BY batch_identifier, task_identifier

CodePudding user response:

Using common table expression to get what batch(es) are still in progress. Then I will get the details in TableB and group by batch and task identifier and check that that max value in task status is zero (meaning task is still in-progress).

WITH data as (
    select
        batch_identifier 
    from
        dbo.TableA
    group by
        batch_identifier
    having 
        MAX(CAST(batch_status as int)) = 0)  
select
      details.batch_identifier,
      MIN(CAST(details.task_status as int)) as task_status,
      MIN(details.task_description) as task_description,
      details.task_identifier
from 
        dbo.TableB as details
inner join 
        data on data.batch_identifier = details.batch_identifier  
group by
        details.batch_identifier, 
        task_identifier
having
        MAX(CAST(details.task_status as int)) = 0

RESULT:

batch_identifier    task_status task_description    task_identifier 
----------------    ----------- ----------------    ------------
batch_id2           0           2022-09-15 10:36:00 task_id2
batch_id2           0           2022-09-15 10:37:00 task_id3
  • Related