I managed to join two tables together on BigQuery, however, I don't think it matches correctly. clients
have descriptive client names with 75 rows, while stats
contain client stats with 37342 rows. I'm trying to join these two tables together using their ExternalCustomerId
with this query:
SELECT
clients.AccountDescriptiveName AS client_name,
stats.ExternalCustomerId AS client_id,
AverageCost,
AverageCpc,
AverageCpm,
AveragePosition
FROM `298114322003.google_ads1.p_Customer_2670156874` AS clients
JOIN `298114322003.google_ads1.p_AccountStats_2670156874` AS stats
ON clients.ExternalCustomerId = stats.ExternalCustomerId
However, the results came out to 113026 rows. I expect the results to be 37342 since that's the results of the stats
table. I've used RIGHT/LEFT JOIN
function and it still came out the same. Any suggestions on how I should tackle this problem? Thank you!
AK
CodePudding user response:
It looks like your stats
table has mutiple rows for the same ExternalCustomerId
(and it can be understandable for example if it is partitioned and you have different data during the days).
Try to explore a little bit more as it follows:
SELECT count(*) as total, count(distinct ExternalCustomerId) as uniques
FROM `298114322003.google_ads1.p_AccountStats_2670156874`
If you have duplicated ExternalCustomerId
then every client row will be multiplied by the number of matching rows in the stats
table.