Home > Software design >  Avoid double counting - only count first occurrence in table
Avoid double counting - only count first occurrence in table

Time:02-13

I am trying to do a count by month of the total number of items (serialnumber) that appears in inventory.

This usually can be easily solved with distinct, however, I only want to count if it is the first occurrence that it appears (first insert).

This query gets me most of the way there.

select date_trunc (‘month’,date) as Date,productid, count(distinct serialnumber) from inventory
where date_trunc(‘month’,date)>= ‘2016-01-01’ and productID in ('1','2') and status = ‘INSERT’
group by date_trunc(‘month’,date), productid
order by date_trunc(‘month’,date) desc

But I realize I am double/triple/quadruple counting some serial numbers because an item can reappear in our inventory multiple times over the course of its lifecycle.

The query above covers these scenarios since the serial numbers appear once:

  • Shows up as new
  • Shows up as used

Below are the use cases where I realize I may be double/triple/quadruple counting:

  • Shows up as new then comes back around as used (no limit to how many times it can appear used)
  • Shows up used then comes back again as used (no limit to how many times it can appear used)

Here's an example I ran into.

(Note: I have added the condition column to better illustrate this). But the particular serial number has been in inventory three times (first as new, then as used twice)

Date ProductID Count Condition
7-1-21 1 1 u
11-1-18 1 1 u
2-1-17 1 1 n

In my current query results, each insert gets counted (once in Feb 2017, once in Nov 2018 and once in July 2021).

How can I amend my query to make sure I'm only counting the very first instance (insert) a particular serial number appears in the inventory table?

CodePudding user response:

In the subquery calculate first insert date only of each product/item using min aggregate function. Then count the items on that result:

select Date, productid, count(serialnumber)
from (
    select min(date_trunc(‘month’,date)) as Date, productid, serialnumber
    from inventory
    where date_trunc(‘month’,date) >= ‘2016-01-01’
    and productID in ('1','2')
    and status = ‘INSERT’
    group by productid, serialnumber
) x
group by Date, productid
order by Date desc;
  • Related