Home > Back-end >  Joining sf objects go wrong despite having the same crs
Joining sf objects go wrong despite having the same crs

Time:10-06

I'm trying to figure out why it is that I can't get two sf objects to join together.

I have a dataframe with coordinates:

# A tibble: 80,840 x 2
   ycoord xcoord
    <dbl>  <dbl>
 1   51.9   4.44
 2   52.1   4.31
 3   52.1   4.31
 4   52.1   4.31
 5   52.3   5.17
 6   51.9   4.45
 7   52.0   4.17
 8   52.0   5.64
 9   51.8   5.81
10   52.1   4.66

And a sf object of neighbourhood boundries with neighbourhood codes (say zipcodes).

Simple feature collection with 14175 features and 1 field
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 10425.16 ymin: 306846.2 xmax: 278026.1 ymax: 621876.3
Projected CRS: Amersfoort / RD New
First 10 features:
      bu_code                       geometry
1  BU00140000 MULTIPOLYGON (((233836.2 58...
2  BU00140001 MULTIPOLYGON (((233934 5819...
3  BU00140002 MULTIPOLYGON (((233998.8 58...
4  BU00140003 MULTIPOLYGON (((233187.2 58...
5  BU00140004 MULTIPOLYGON (((233379.8 58...
6  BU00140005 MULTIPOLYGON (((234098.5 58...
7  BU00140007 MULTIPOLYGON (((234261.6 58...
8  BU00140008 MULTIPOLYGON (((233337.5 58...
9  BU00140100 MULTIPOLYGON (((234798 5816...
10 BU00140101 MULTIPOLYGON (((234489 5816...

In order to get the bu_code variable I tried the following to create a sf object with the same crs as the neighbourhood codes sf object.

locaties_sf <- 
  locaties_df %>% 
  st_as_sf(coords = c(x = "xcoord", y = "ycoord"),
           crs = st_crs(neigbourhood_boundries_sf)) 

Now it seems to have the same crs data:

Simple feature collection with 80840 features and 0 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 3.359644 ymin: 50.76168 xmax: 7.21876 ymax: 53.48695
Projected CRS: Amersfoort / RD New
# A tibble: 80,840 x 1
              geometry
 *         <POINT [m]>
 1 (4.442658 51.91203)
 2 (4.314241 52.07851)
 3 (4.313637 52.05612)
 4 (4.313137 52.05622)
 5 (5.174235 52.26729)
 6  (4.44545 51.87896)
 7 (4.167927 52.00326)
 8  (5.643643 52.0424)
 9 (5.814333 51.83075)
10 (4.657972 52.06908)
# … with 80,830 more rows

Than I try to join the two together:

st_join(locaties_sf,
        neigbourhood_boundries_sf,
        join = st_within)

But I get only NA's.

If I plot the two they still seem to have a different coordinate system.

This is what neigbourhood boundriers_sf looks like:

enter image description here

And this is what locaties_sf looks like:

enter image description here

All help much appriciated.

P.S.: sorry that my data is not reproducible, but the multipolygons are too big for dput()

CodePudding user response:

You have a coordinate mismatch - the points seem to be latitude & longitude (measured in degrees on a sphere) while the polygons are in projected CRS (measured in meters on a plane).

What you did is that you "declared" your points to be meters on a plane = obviously very close to the origin, and thus nowhere close to the polygons (note the expected coordinate range of Amersfoort / RD New).

If my guess is correct you need to first declare the points in an unprojected CRS (WGS84 is a good default) and then (and only then) transform them to the projected CRS of your polygons.

I suggest changing your st_as_sf call like this:

locaties_sf <- locaties_df %>% 
  st_as_sf(coords = c(x = "xcoord", y = "ycoord"),
           crs = 4326) %>% # WGS84 - a good default for unprojected coords
  st_tranform(28992)
  • Related