Home > database >  BigQuery producing "left join" results rather than a Full outer join
BigQuery producing "left join" results rather than a Full outer join

Time:07-05

I have the below query:

SELECT
  OUTPUTS.calculation_id,
  SCENARIOS.scenario_id,
  OUTPUTS.account_id,
  ACCOUNTS.account_name,
  OUTPUTS.output_date,
  OUTPUTS.output_value

FROM
 `p.Trunk.OUTPUTS` as OUTPUTS

  full outer join `p.Trunk.ACCOUNTS` as ACCOUNTS
  on OUTPUTS.account_id = ACCOUNTS.account_id

  join `p.Trunk.SCENARIOS` as SCENARIOS
  on OUTPUTS.calculation_id = SCENARIOS.calculation_id

I want to use the resulting values to create a pivot table that has output_date grouped by year in the columns, account_name in the rows and the output_value as values.

All account_id in OUTPUTS have a corresponding account_id in ACCOUNTS from which I get the account_name

However, there are many account_id in ACCOUNTS without an entry in OUTPUTS but I still need to see these accounts as rows in the pivot table, with zero value for output_value.

When I try the query, it only provides me the account_name for which there is a corresponding entry in OUTPUTS (i.e. LEFT JOIN)

I was wondering what do I need to do to get a FULL RIGHT JOIN that shows me all the accounts, with those that have no entries in OUTPUTS showing a zero value for all the years?

Thanks

CodePudding user response:

You can use a LEFT JOIN and COALESCE

SELECT OUTPUTS.calculation_id
     , SCENARIOS.scenario_id
     , OUTPUTS.account_id
     , ACCOUNTS.account_name
     , OUTPUTS.output_date

     -- First non-null argument from the left
     , COALESCE(OUTPUTS.output_value, 0) AS output_value

FROM p.Trunk.OUTPUTS as OUTPUTS
FULL JOIN p.Trunk.ACCOUNTS as ACCOUNTS
   ON OUTPUTS.account_id = ACCOUNTS.account_id

LEFT JOIN p.Trunk.SCENARIOS as SCENARIOS
   ON OUTPUTS.calculation_id = SCENARIOS.calculation_id

CodePudding user response:

You're using an inner join on the result of an outer join which will remove the rows you were expecting. These would be rows from accounts table for which there is no match in outputs table; the null in outputs.calculation_id will not match the on clause of inner join. Change join to left join:

from `p.Trunk.OUTPUTS` as OUTPUTS
full outer join `p.Trunk.ACCOUNTS` as ACCOUNTS on OUTPUTS.account_id = ACCOUNTS.account_id
left join `p.Trunk.SCENARIOS` as SCENARIOS on OUTPUTS.calculation_id = SCENARIOS.calculation_id
  • Related