I'm trying to figure out why it is that I can't get two sf objects to join together.
I have a dataframe with coordinates:
# A tibble: 80,840 x 2
ycoord xcoord
<dbl> <dbl>
1 51.9 4.44
2 52.1 4.31
3 52.1 4.31
4 52.1 4.31
5 52.3 5.17
6 51.9 4.45
7 52.0 4.17
8 52.0 5.64
9 51.8 5.81
10 52.1 4.66
And a sf object of neighbourhood boundries with neighbourhood codes (say zipcodes).
Simple feature collection with 14175 features and 1 field
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 10425.16 ymin: 306846.2 xmax: 278026.1 ymax: 621876.3
Projected CRS: Amersfoort / RD New
First 10 features:
bu_code geometry
1 BU00140000 MULTIPOLYGON (((233836.2 58...
2 BU00140001 MULTIPOLYGON (((233934 5819...
3 BU00140002 MULTIPOLYGON (((233998.8 58...
4 BU00140003 MULTIPOLYGON (((233187.2 58...
5 BU00140004 MULTIPOLYGON (((233379.8 58...
6 BU00140005 MULTIPOLYGON (((234098.5 58...
7 BU00140007 MULTIPOLYGON (((234261.6 58...
8 BU00140008 MULTIPOLYGON (((233337.5 58...
9 BU00140100 MULTIPOLYGON (((234798 5816...
10 BU00140101 MULTIPOLYGON (((234489 5816...
In order to get the bu_code
variable I tried the following to create a sf object with the same crs as the neighbourhood codes sf object.
locaties_sf <-
locaties_df %>%
st_as_sf(coords = c(x = "xcoord", y = "ycoord"),
crs = st_crs(neigbourhood_boundries_sf))
Now it seems to have the same crs data:
Simple feature collection with 80840 features and 0 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 3.359644 ymin: 50.76168 xmax: 7.21876 ymax: 53.48695
Projected CRS: Amersfoort / RD New
# A tibble: 80,840 x 1
geometry
* <POINT [m]>
1 (4.442658 51.91203)
2 (4.314241 52.07851)
3 (4.313637 52.05612)
4 (4.313137 52.05622)
5 (5.174235 52.26729)
6 (4.44545 51.87896)
7 (4.167927 52.00326)
8 (5.643643 52.0424)
9 (5.814333 51.83075)
10 (4.657972 52.06908)
# … with 80,830 more rows
Than I try to join the two together:
st_join(locaties_sf,
neigbourhood_boundries_sf,
join = st_within)
But I get only NA's.
If I plot the two they still seem to have a different coordinate system.
This is what neigbourhood boundriers_sf looks like:
And this is what locaties_sf looks like:
All help much appriciated.
P.S.: sorry that my data is not reproducible, but the multipolygons are too big for dput()
CodePudding user response:
You have a coordinate mismatch - the points seem to be latitude & longitude (measured in degrees on a sphere) while the polygons are in projected CRS (measured in meters on a plane).
What you did is that you "declared" your points to be meters on a plane = obviously very close to the origin, and thus nowhere close to the polygons (note the expected coordinate range of Amersfoort / RD New).
If my guess is correct you need to first declare the points in an unprojected CRS (WGS84 is a good default) and then (and only then) transform them to the projected CRS of your polygons.
I suggest changing your st_as_sf call like this:
locaties_sf <- locaties_df %>%
st_as_sf(coords = c(x = "xcoord", y = "ycoord"),
crs = 4326) %>% # WGS84 - a good default for unprojected coords
st_tranform(28992)