I'd like to run the following analysis in SQL Server, but since my SQL knowledge is rather limited, I can't get to where I'm trying to.
I have the following tables that show how Loans connect to Borrowers, and how Loans connect to Real Estate property collateral pledged by the Borrowers. Some of the Loans are connected to 2 Borrowers. I want to see how much money all Borrowers that connect to an RE Property owe, without double-counting for loans where both borrowers are obligors. For example, assume the following tables:
1. Loans
Loan_ID | Total_Claim |
---|---|
Loan_1 | 50,000 |
Loan_2 | 30,000 |
Loan_3 | 40,000 |
2. Borrowers_Loans
Borrower_ID | Loan_ID |
---|---|
Borrower_1 | Loan_1 |
Borrower_1 | Loan_2 |
Borrower_2 | Loan_2 |
Borrower_2 | Loan_3 |
3. RE_Properties
Property_ID | Loan_ID |
---|---|
Property_1 | Loan_1 |
Property_2 | Loan_2 |
Property_3 | Loan_3 |
Now, I would like to calculate how much in total Borrowers that are connected to an RE property owe, as below:
Property_ID | Total_Owed_By_Borrowers |
---|---|
Property_1 | 50,000 |
Property_2 | 120,000 |
Property_3 | 70,000 |
However, running the below double-counts Loan_2 (and shows the sum of Property_2 as 150,000), as it is connected to both Borrowers:
SELECT RE.Property_ID, SUM(L.Total_Claim) FROM RE_Properties RE
INNER JOIN Borrowers_Loans BL ON BL.Loan_ID = RE.Loan_ID
INNER JOIN Borrowers_Loans BLA ON BLA.Borrower_ID = BL.Borrower_ID
INNER JOIN Loans L ON L.Loan_ID = BLA.Loan_ID
GROUP BY RE.Property_ID
Any suggestions? Thanks!
CREATE TABLE RE_Properties (
Property_ID nvarchar(255),
Loan_ID nvarchar(255));
CREATE TABLE Borrowers_Loans (
Borrower_ID nvarchar(255),
Loan_ID nvarchar(255));
CREATE TABLE Loans (
Loan_ID nvarchar(255),
Total_Claim int);
INSERT INTO RE_Properties (Property_ID, Loan_ID) VALUES ('Property_1', 'Loan_1');
INSERT INTO RE_Properties (Property_ID, Loan_ID) VALUES ('Property_2', 'Loan_2');
INSERT INTO RE_Properties (Property_ID, Loan_ID) VALUES ('Property_3', 'Loan_3');
INSERT INTO Borrowers_Loans (Borrower_ID, Loan_ID) VALUES ('Borrower_1', 'Loan_1');
INSERT INTO Borrowers_Loans (Borrower_ID, Loan_ID) VALUES ('Borrower_1', 'Loan_2');
INSERT INTO Borrowers_Loans (Borrower_ID, Loan_ID) VALUES ('Borrower_2', 'Loan_2');
INSERT INTO Borrowers_Loans (Borrower_ID, Loan_ID) VALUES ('Borrower_2', 'Loan_3');
INSERT INTO Loans (Loan_ID, Total_Claim) VALUES ('Loan_1', 50000);
INSERT INTO Loans (Loan_ID, Total_Claim) VALUES ('Loan_2', 30000);
INSERT INTO Loans (Loan_ID, Total_Claim) VALUES ('Loan_3', 40000);
CodePudding user response:
Risk to each borrower is sum of all loans associated with them (assuming they adopt the whole loan if any associated borrower defaults)
select bl.Borrower_ID, sum(loans.total_claim) borrower_total_risk
from Borrowers_Loans as bl
inner join Loans on bl.loan_id = loans.loan_id
Risk to each property is the value of the loans to that property
select p.Property_ID, sum(loans.total_claim) property_loan_total
from RE_Properties as p
inner join Loans on p.Loan_ID = loans.Loan_ID
group by p.Property_ID
If I was to combine this I wouldn't just do a simple sum, e.g.:
SELECT
RE.Property_ID
, max(L.Total_Claim) AS Total_Claim
, count(distinct bl.Borrower_ID) num_borrowers
, min(b_risk.borrower_total_risk) as min_borrower_total_risk
, max(b_risk.borrower_total_risk) as max_borrower_total_risk
FROM RE_Properties RE
INNER JOIN Loans L ON L.Loan_ID = RE.Loan_ID
LEFT JOIN Borrowers_Loans BL ON BL.Loan_ID = RE.Loan_ID
LEFT JOIN (
select bl.Borrower_ID, sum(loans.total_claim) borrower_total_risk
from Borrowers_Loans as bl
left join Loans on bl.loan_id = loans.loan_id
group by bl.Borrower_ID
) as b_risk on bl.Borrower_ID = b_risk.Borrower_ID
GROUP BY RE.Property_ID;
Results:
| Property_ID | Total_Claim | num_borrowers | min_borrower_total_risk | max_borrower_total_risk |
|-------------|-------------|---------------|-------------------------|-------------------------|
| Property_1 | 50000 | 1 | 80000 | 80000 |
| Property_2 | 30000 | 2 | 70000 | 80000 |
| Property_3 | 40000 | 1 | 70000 | 70000 |
Note that the count of borrowers shows 2 when it needs to and that does NOT require multiple joins to the same table.
http://sqlfiddle.com/#!9/9280d4/26/1
CodePudding user response:
There seems to be no need to involve Borrowers. Join only Properties and Loans
RE_Properties Loans
Property_ID Loan_ID Loan_ID Total_Claim
Property_1 Loan_1 Loan_1 50000
Property_2 Loan_2 Loan_2 30000
Property_3 Loan_3 Loan_3 40000
Loan_4 10000
Loan_5 70000
CodePudding user response:
There may be an easier way, but using your query as a starting point you can ensure each loan is only counted once by grouping by Loan_ID
and taking MAX
for the amount (or FIRST
it doesn't matter since we know its the same value) as follows:
WITH Cte as (
SELECT RE.Property_ID
, L.Loan_ID, MAX(L.Total_Claim) Total_Claim
FROM RE_Properties RE
INNER JOIN Borrowers_Loans BL ON BL.Loan_ID = RE.Loan_ID
INNER JOIN Borrowers_Loans BLA ON BLA.Borrower_ID = BL.Borrower_ID
INNER JOIN Loans L ON L.Loan_ID = BLA.Loan_ID
GROUP BY RE.Property_ID, L.Loan_ID
)
SELECT Property_ID, SUM(Total_Claim)
FROM Cte
GROUP BY Property_ID;