Home > Mobile >  Troubleshooting API coordinate request loop
Troubleshooting API coordinate request loop

Time:01-17

I want to find the coordinates for a list of addresses.

I am using a data set that can be found here: "https://www.data.gv.at/katalog/dataset/kaufpreissammlung-liegenschaften-wien"

I've inputed this using the read_csv function as "data". I'm using the tidyverse and jsonlite libraries. The only relevant columns are "Straße" which is the street name and "ON" which is the street number. The city for all of these is Vienna, Austria.

I'm using OpenStreetMap and have formatted my address data like the format requires:

data$formatted_address <- paste(ifelse(is.na(data$ON), "", data$ON), " ", tolower(data$Straße), ", vienna", sep = "")

This formats the adresses in this column as 1 milanweg, vienna and 12 granergasse, vienna. When I manually input this into the API format, it all works out and I get the coordinates: https://nominatim.openstreetmap.org/search?q=1 milanweg, vienna&format=json&polygon=1&addressdetails=1

Since I now want to do this for my entire row, I am using jsonlite to create requests in R.

data$coordinates <- data.frame(lat = NA, lon = NA)
for (i in 1:nrow(data)) {
  result <- try(readLines(paste0("https://nominatim.openstreetmap.org/search?q=", 
                                 URLencode(data$formatted_address[i]), "&format=json&polygon=1&addressdetails=1")), 
                silent = TRUE)
  if (!inherits(result, "try-error")) {
    if (length(result) > 0) {
      result <- fromJSON(result)
      if (length(result) > 0 && is.list(result[[1]])) {
        data$coordinates[i, ] <- c(result[[1]]$lat, result[[1]]$lon)
      }
    }
  }
}

This should theoretically create the exact same API request, however, the lat and lon columns are always empty.

How can I fix this script to create a list of coordinates for each address in the data set?

CodePudding user response:

Data setup

library(tidyverse) 
library(httr2) 

df <- df %>% 
  mutate(
    formatted_address = str_c(
      if_else(is.na(on), "", on), " ", str_to_lower(strasse), " vienna"
    ) %>% str_remove_all(" ")
  )

# A tibble: 57,912 × 7
   kg_code katastralgemeinde      ez   plz strasse                   on      formatted_address                
     <dbl> <chr>               <dbl> <dbl> <chr>                     <chr>   <chr>                            
 1    1617 Strebersdorf         1417  1210 Mühlweg                   13      13 mühlweg vienna                
 2    1607 Groß Jedlersdorf II   193  1210 Bahnsteggasse             4       4 bahnsteggasse vienna           
 3    1209 Ober St.Veit         3570  1130 Jennerplatz               34/20   34/20 jennerplatz vienna         
 4    1207 Lainz                 405  1130 Sebastian-Brunner-Gasse   6       6 sebastian-brunner-gasse vienna 
 5    1101 Favoriten            3831  1100 Laxenburger Straße        2C -2 D 2C-2D laxenburgerstraße vienna   
 6    1101 Favoriten            3827  1100 Laxenburger Straße        2 C     2C laxenburgerstraße vienna      
 7    1101 Favoriten            3836  1100 hinter Laxenburger Straße 2 C     2C hinterlaxenburgerstraße vienna
 8    1201 Auhof                 932  1130 Keplingergasse            10      10 keplingergasse vienna         
 9    1213 Speising              135  1130 Speisinger Straße         29      29 speisingerstraße vienna       
10    1107 Simmering            2357  1100 BATTIGGASSE               44      44 battiggasse vienna            
# … with 57,902 more rows
# ℹ Use `print(n = ...)` to see more rows

API call and getting coordinates. I gathered the display name matched by the API, and the lat & lon data.

get_coords <- function(address) {
  cat("Getting coordinates", address, "\n")
  str_c(
    "https://nominatim.openstreetmap.org/search?q=",
    address,
    "&format=json&polygon=1&addressdetails=1"
  ) %>%
    request() %>%
    req_perform() %>%
    resp_body_json(simplifyVector = TRUE) %>%
    as_tibble() %>%
    select(api_name = display_name,
           lat, lon) %>% 
    slice(1)
}

df %>% 
  slice_sample(n = 10) %>% 
  mutate(coordinates = map(
    formatted_address, possibly(get_coords, tibble(
      api_name = NA_character_, 
      lat = NA_character_, 
      lon = NA_character_
    ))
  )) %>% 
  unnest(coordinates) 

# A tibble: 10 × 10
   kg_code katastralgemeinde      ez   plz strasse               on    formatted_…¹ api_n…² lat   lon  
     <dbl> <chr>               <dbl> <dbl> <chr>                 <chr> <chr>        <chr>   <chr> <chr>
 1    1651 Aspern               3374  1220 ERLENWEG              8     8 erlenweg … 8, Erl… 48.2… 16.4…
 2    1613 Leopoldau            6617  1210 Oswald-Redlich-Straße 31    31 oswald-r… 31, Os… 48.2… 16.4…
 3    1006 Landstraße           2425  1030 HAGENMÜLLERGASSE      45018 45018 hagen… Hagenm… 48.1… 16.4…
 4    1101 Favoriten             541  1100 HERNDLGASSE           7     7 herndlgas… 7, Her… 48.1… 16.3…
 5    1607 Groß Jedlersdorf II   221  1210 Prager Straße         70    70 pragerst… Prager… 48.2… 16.3…
 6    1006 Landstraße           1184  1030 PAULUSGASSE           2     2 paulusgas… 2, Pau… 48.1… 16.3…
 7    1654 Eßling               2712  1220 KAUDERSSTRASSE        61    61 kauderss… 61, Ka… 48.2… 16.5…
 8    1401 Dornbach             2476  1170 Alszeile              NA     alszeile v… Alszei… 48.2… 16.2…
 9    1654 Eßling                745  1220 Kirschenallee         19    19 kirschen… 19, Ki… 48.2… 16.5…
10    1204 Hadersdorf           3139  1140 MITTLERE STRASSE      NA     mittlerest… Mittle… 48.2… 16.1…
# … with abbreviated variable names ¹​formatted_address, ²​api_name
  • Related