Home > Net >  How to replace values in a polygons shapefile through spatial join in python
How to replace values in a polygons shapefile through spatial join in python

Time:10-01

I have polygons created from an intersection with other polygons existing in a single shapefile:

enter image description here

And so the dataframe for this shapefile consists of 4 rows, with separate geometries, each row corresponding to a different intersection zone polygon, and so a row for each color. In this dataframe, there is a “Value” column, where the values of these rows are shown in each section of the intersected polygon. And now, in another shapefile, I have a single point (marked as a red dot), containing a value. When overlayed on top of the first shapefile, we now see this, with the point being spatially “contained” within the lavender zone:

enter image description here

What I am trying to do, using python and geopandas, is to "spatially join" the red point to the lavender intersection zone, so that the current 23 value of the lavender intersection zone is replaced by the 19 value from the point, while the values from the other intersection zones are left unchanged.

I have been trying to use the spatial join function from geopandas to accomplish this, specifically the .sjoin() function, but am having some confusion here, because when I try to spatially join the point to the polygons, all that remains is the single lavender intersection zone, while the other intersection zones (rows) are missing from the output dataframe. I suppose this is because spatially joining only leaves those features that are actually "spatially joined", though I want to keep all of the intersected zone polygons, just with the intended value replaced.

So far I have tried this:

point_in_poly = gpd.sjoin(intersected_polygons, point, how='inner', op='contains')

which just produces a geodataframe with a single row. How can I augment my spatial join code so that I am given a geodataframe of all of my intersected polygons, but with just the lavender intersection zone's "Value" value replaced with 19, and so four polygon rows?

CodePudding user response:

  • you have not provided sample data so have generated 5 polygons when dissolved would generate a circle similar to your image
  • simply sjoin(how="left") is fundamental to get rows for each of the polygons
  • then simply fillna() values from points from values from polygons
# sjoin and replace
gdf_merged = (
    gpd.sjoin(gdf_polys, gdf_point, how="left")
    .assign(value=lambda d: d["value_right"].fillna(d["value_left"]))
    .pipe(
        lambda d: d.drop(
            columns=[
                c for c in d.columns if c.endswith("_left") or c.endswith("_right")
            ]
        )
    )
)



visualize

enter image description here

simulated data

gdf_polys

   value                                           geometry
0     23  POLYGON ((-0.10569 51.50166, -0.10708 51.50183...
1     44  POLYGON ((-0.13165 51.50222, -0.13026 51.50205...
2     65  POLYGON ((-0.11911 51.49384, -0.11885 51.49471...
3     77  POLYGON ((-0.11823 51.51004, -0.11849 51.50918...
4     88  POLYGON ((-0.11809 51.51004, -0.11793 51.50916...

gdf_point

   value                   geometry
0    200  POINT (-0.11119 51.50625)

result

                                            geometry  value
0  POLYGON ((-0.10569 51.50166, -0.10708 51.50183... 200.00
1  POLYGON ((-0.13165 51.50222, -0.13026 51.50205...  44.00
2  POLYGON ((-0.11911 51.49384, -0.11885 51.49471...  65.00
3  POLYGON ((-0.11823 51.51004, -0.11849 51.50918...  77.00
4  POLYGON ((-0.11809 51.51004, -0.11793 51.50916...  88.00

full code

import geopandas as gpd
import pandas as pd
from shapely.geometry import Point, MultiPolygon
import matplotlib.pyplot as plt

# construct some geometry that is similar to image
gdf = gpd.read_file(gpd.datasets.get_path("naturalearth_cities")).loc[
    lambda d: d["name"].eq("London")
]

utm = gdf.estimate_utm_crs()
S = 1000

p = gdf.to_crs(utm)["geometry"].values[0]
geoms = [
    Point(p.x   x, p.y   y).buffer(S).intersection(p.buffer(S))
    for x, y in [[S, S], [-S, -S], [S, -S], [-S, S]]
]
geoms  = [p.buffer(S).difference(MultiPolygon(geoms))]

gdf_polys = gpd.GeoDataFrame(
    pd.DataFrame({"value": [23, 44, 65, 77, 88]}),
    geometry=geoms,
    crs=utm,
).to_crs(gdf.crs)

gdf_point = gpd.GeoDataFrame(
    pd.DataFrame({"value": [200]}),
    geometry=[Point(p.x   S // 2, p.y   S // 2)],
    crs=utm,
).to_crs(gdf.crs)


# sjoin and replace
gdf_merged = (
    gpd.sjoin(gdf_polys, gdf_point, how="left")
    .assign(value=lambda d: d["value_right"].fillna(d["value_left"]))
    .pipe(
        lambda d: d.drop(
            columns=[
                c for c in d.columns if c.endswith("_left") or c.endswith("_right")
            ]
        )
    )
)

# visualize
fig, (ax1, ax2) = plt.subplots(1, 2)
ax1 = gdf_polys.plot("value", ax=ax1)
gdf_polys.apply(
    lambda x: ax1.annotate(
        text=x["value"], xy=x.geometry.centroid.coords[0], ha="center", color="white"
    ),
    axis=1,
)
gdf_point.plot(ax=ax1, color="red")
ax2 = gdf_merged.plot("value", ax=ax2)
ax2.set_yticks([])
gdf_merged.apply(
    lambda x: ax2.annotate(
        text=x["value"], xy=x.geometry.centroid.coords[0], ha="center", color="white"
    ),
    axis=1,
)


  • Related