How to speed up a spatial join in BigQuery?

Time:05-26

I have a BigQuery table of point records covering a whole country, and I need to assign a "censal zone" to each point. The zone polygons are stored in another table. I've been trying to do this with a query like this one:

SELECT id_point, code_censal_zone
    FROM `points_table`
    JOIN `zones_table`
    ON ST_CONTAINS(zone_polygon, point_geo)

The first table is quite large, so the query performs very inefficiently, as it compares every possible (point, censal zone) pair. However, both tables have a column identifying the municipality they belong to. So the question is: can I rewrite my query so that ST_CONTAINS(*) is evaluated only for (point, censal zone) pairs that belong to the same municipality, instead of comparing each point against every censal zone in the country? Can I do this without having to read points_table multiple times?

SELECT p.id_point, z.code_censal_zone
    FROM `points_table` AS p
    JOIN `zones_table` AS z
    ON p.municipality = z.municipality
    AND ST_CONTAINS(z.zone_geo, p.point_geo)

I'm quite new to BigQuery, so I don't really know whether a query like this would actually do what I'm expecting, as I couldn't find anything in the documentation.

Thanks!

CodePudding user response:

SELECT p.id_point, z.code_censal_zone
    FROM `points_table` AS p
    JOIN `zones_table` AS z
    ON p.municipality = z.municipality
    AND ST_CONTAINS(z.zone_geo, p.point_geo)
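
A minimal sketch for trying this kind of join on toy data before running it against the real tables. The inline rows, the zone codes ('Z1', 'Z2') and the municipality codes ('M1', 'M2') are made up for illustration. The column names follow the question. Whether the extra equality condition actually makes the query faster depends on how BigQuery plans the join, so it is worth comparing the execution details of both variants on your own data.

```sql
-- Hypothetical toy data: two square zones in different municipalities
-- and one point inside each of them.
WITH zones_table AS (
  SELECT 'Z1' AS code_censal_zone, 'M1' AS municipality,
         ST_GEOGFROMTEXT('POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))') AS zone_geo
  UNION ALL
  SELECT 'Z2', 'M2',
         ST_GEOGFROMTEXT('POLYGON((2 0, 3 0, 3 1, 2 1, 2 0))')
),
points_table AS (
  -- ST_GEOGPOINT takes (longitude, latitude).
  SELECT 1 AS id_point, 'M1' AS municipality,
         ST_GEOGPOINT(0.5, 0.5) AS point_geo
  UNION ALL
  SELECT 2, 'M2', ST_GEOGPOINT(2.5, 0.5)
)
SELECT p.id_point, z.code_censal_zone
FROM points_table AS p
JOIN zones_table AS z
  -- Only pairs from the same municipality can match ...
  ON p.municipality = z.municipality
  -- ... and among those, the polygon must contain the point.
  AND ST_CONTAINS(z.zone_geo, p.point_geo);
```

Because both conditions sit in the same ON clause, a point is only ever paired with zones of its own municipality, and each table appears once in the query.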