Home > Mobile >  Join tables and sum specific data based on a mapping
Join tables and sum specific data based on a mapping

Time:05-14

I'm trying to learn SSMS and I've got this thing I'm trying to do. I've got this table called CompanyGroup:

| SourceCompanyId | TargetCompanyId  |
| ---  | ------ |
| 4626 | 359468 |
| 4626 | 7999   |
| 56167| 11947  |

This table represents the companies that are part of the same group. For example if I am company 4626 then my company group will be formed of [4626, 359468, 7999] and if I was company 7999 my company group would still be [4626, 359468, 7999]. Therefor if I were company 56167 my company group would be [56167, 11947] (I hope this makes it clear, will provide more details if needed).
Then there is this other table called Sales

| CompanyId| Product| Sales|
|:-----|:-------| ------|
| 4626  | Monitors| 1000|
| 4626  | Locks   | 300|
| 359468| Chairs  | 500|
| 359468| Lights  | 300|
| 7999  | Chairs  | 500|
| 4626  | Locks   | 300|
| 56167 | Locks   | 1000|
| 11947 | Tables  | 300|

This table represents the sales of a company.
I'm working on this query to get the total amount of sales per company group when you have "selected" either company in that group, eg: for the group 4626 is in the total sales would be 2300 (again, if this is unclear I'm sorry I can get into more detail if needed).

Here is my query so far:

SELECT 
    s.CompanyId,
    SUM(s.Sales) AS 'Sales'
    FROM Sales AS s 
    JOIN CompanyGroup AS cg ON s.SourceCompanyId = cg.CompanyId
    WHERE s.CompanyId = 4626
    GROUP BY s.CompanyId

which has this result

|CompanyId | Sales  |
| -------- | ------ |
| 4626     | 1600   |

But I would like to get just the total sum of sales for the entire group company, eg:

| Sales  |
| ------ |
| 2900   |

I'm just not sure how to add make it sum the data from the other companies in the current group and add them together.

Hopefully you can get a clear idea of what I've done here. Thank you.

CodePudding user response:

Although you are new to SQL, your data structure as it is will be nothing but trouble. Can the query be done, yes, but harder. I would first like to suggest an alternative to what you have to identify "groups". Create a second table of groups, then have all companies associated with said group. You could even have some clear-text content of the group such as

CompanyGroups
CompanyGroupID  CompanyGroupName
1               Eastern Group
2               Northern Group
3               Technical Group
4               Furniture Group

Then the companies
SourceCompanyId  CompanyGroupID
4626             3
359468           3
7999             3
56167            4
11947            4

So, there is one record per company and the known group associated. If a company can POSSIBLY be associated with multiple groups, you could have additional records per company and alternate group as well.

Now, back to the regularly scheduled program and your query. You need to have one "common" group so all targets are associated, INCLUDING the underlying source company in the group, such as your 4626 was the source, and the other two of 359468, 7999 are in the same. It expands on the other answer, but forces the left-most ID into a primary position.

select distinct 
      SourceCompanyID as GrpParent,
      SourceCompanyID as IncludedCompany
   from
      CompanyGroup cg
UNION
select 
      cgParent.SourceCompanyID as GrpParent,
      cgTarget.TargetCompanyId as IncludedCompany
   from
      CompanyGroup cgParent
         JOIN CompanyGroup cgTarget
            on cgParent.SourceCompanyID = cgTarget.SourceCompanyID

Notice the first part of the query is getting the Source once even if they are associated with 5 other targets. We dont want to duplicate counts because of duplicate sources. It holds its own ID as both the parent AND the company to be included as part-of the group.

The second starts again with the same parent, but gets the TARGET as the included company. So, based on YOUR data

SourceCompanyId  TargetCompanyId
4626             359468
4626             7999
56167            11947

Would result as

GrpParent   IncludedCompany
-- first the distinct portion before union
4626        4626
56167       56167
-- now the union portion
4626        359468
4626        7999
56167       11947

And you can see the 5 total records and the 4626 "Group" shows all three company IDs including itself on the right-side, similarly for 56167 having two entries with each respective on the included companies right-side.

Now with this, you should be able to join the summation of data by the GROUP and not cause duplicated aggregations.

select
      CompGrps.GrpParent,
      sum( CompSales.Sales ) as GroupTotalSales
   from
      ( select distinct 
              SourceCompanyID as GrpParent,
              SourceCompanyID as IncludedCompany
           from
              CompanyGroup cg
        UNION
        select 
              cgParent.SourceCompanyID as GrpParent,
              cgTarget.TargetCompanyId as IncludedCompany
           from
              CompanyGroup cgParent
                 JOIN CompanyGroup cgTarget
                    on cgParent.SourceCompanyID = cgTarget.SourceCompanyID
       ) as CompGrps
         JOIN 
         ( SELECT 
                 s.CompanyId,
                 SUM(s.Sales) AS Sales
              FROM 
                 Sales s
             group by
                 s.CompanyId ) CompSales
            on CompGrps.IncludedCompany = CompSales.CompanyID
   group by
      CompGrps.GrpParent
   order by
      sum( CompSales.Sales ) desc

So notice the first query getting distinct group companies, and the secondary querying from its own per-company sales can be joined on the company ID of itself, but summed based on the common group parent, thus giving totals the the outer level per GROUP.

I also tacked on a simple order by to get largest sales sorted at the top. As you can see it's a bit messier with existing structure, but can be done.

The output should look something like

GrpParent  GroupTotalSales
4626       2900       (4626 had 1600, 359468 had 800, and 7999 had 500)
56167      1300       (56167 had 1000, 11947 had 300) 

CodePudding user response:

On the (bad) implicit assumption that any company not listed as a Target Company is therefore a Group Company...

SELECT
  COALESCE(c.SourceCompanyId, s.CompanyId)   AS GroupCompanyId,
  SUM(s.Sales)   AS Sales
FROM
  Sales          s
LEFT JOIN
  CompanyGroup   c
    ON c.TargetCompanyId = s.CompanyId
GROUP BY
  COALESCE(c.SourceCompanyId, s.CompanyId)

Demo: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=0fbc8e7d00625d3eae72e1763c9e2688

But, honestly, the design is bad.

  • Related