I'm trying to scrape a web map in order to download all locations within a Street Lighting layer. I use RSelenium to get the data:
library(tidyverse)
library(rvest)
library(RSelenium)
# Open a browser
rD <- rsDriver(browser="firefox", port=4545L, verbose=F)
remDr <- rD[["client"]]
# Navigate to site
remDr$navigate("https://gis2.westberks.gov.uk/webapps/OnlineMap/")
At this point, via the browser, I switch on the Street Lighting layer (under Highways) and then select a single Street Light on the map. If I then run:
h <- read_html(remDr$getPageSource()[[1]]) %>% html_nodes(".attrTable") %>% html_table()
I get the data for that single Street Light. However, I want to get the data for all street lights displayed on the map, and I don't know how to do this. Is it possible to programmatically select all lights on the map before running remDr$getPageSource()?
I've looked at this post, but it doesn't quite solve the problem: Issue scraping website with reactive blocks
CodePudding user response:
The map displays the layer as a single image, so you can't get the positions from the page. It would need some computer vision to detect the circles in the image.
When you click the map, it sends the coordinates to the server, and the server sends back JSON data, which the page displays as a popup window.
It sends something like this (with some values for xmin, xmax, ymin, ymax); you can click the link to see the JSON data.
Maybe if you use it with xmin, xmax, ymin, ymax for a bigger area, you will get all the values.
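A rough sketch in Python of what "a bigger area" could look like as the geometry parameter. The helper name envelope_param and its pad argument are my own, only for illustration, and the padding value is arbitrary. The server may also cap how many records it returns per request, so a very large box might still come back incomplete.

```python
import json


def envelope_param(xmin, ymin, xmax, ymax, pad=0.0, wkid=27700):
    """Build the JSON string for the 'geometry' query parameter.

    `pad` (hypothetical helper argument) grows the box by that many map
    units on every side; wkid 27700 is the value the map itself sends.
    """
    envelope = {
        "xmin": xmin - pad,
        "ymin": ymin - pad,
        "xmax": xmax + pad,
        "ymax": ymax + pad,
        "spatialReference": {"wkid": wkid, "latestWkid": wkid},
    }
    # compact separators keep the string close to what the browser sends
    return json.dumps(envelope, separators=(",", ":"))


# example: widen the box by 5000 map units in every direction (arbitrary)
geometry = envelope_param(442520, 178788, 443314, 179582, pad=5000)
print(geometry)
```

The resulting string would go in place of the original geometry value in the request.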
EDIT:
I don't have experience in R, but I can show an example in Python.
It doesn't need Selenium (in Python or in R).
import requests
# full url with parameters
#url = 'https://gis2.westberks.gov.uk/arcgis/rest/services/maps/Wbc_Highways/MapServer/11/query?f=json&returnGeometry=true&spatialRel=esriSpatialRelIntersects&geometry={"xmin":442520.89976831735,"ymin":178788.66371417744,"xmax":443314.65135582053,"ymax":179582.41530168062,"spatialReference":{"wkid":27700,"latestWkid":27700}}&geometryType=esriGeometryEnvelope&inSR=27700&outFields=OBJECTID,Item_Type,Item_Identity_Code,Location_Description,Assigned_Street,Locality,Town,Type,Bracket_Type,Lantern_Type,Lamp_Type,Ballast_Type,Control_Type,Sign_Lantern_Type,Sign_Bracket_Type,Sign_Post_Type,Bollard_Base_Type,Bollard_Shell_Type,Column_Manufacturer,Material_Type,Lamp_Wattage,Lantern_Manufacturer,Number_of_Lamps,Switching_Regime_Code,Switching_Regime,Lamp_Type2,Easting,Northing&outSR=27700'
# only parameters
params = {
    'f': ['json'],
    'geometry': [
        '{"xmin":442520.89976831735,"ymin":178788.66371417744,"xmax":443314.65135582053,"ymax":179582.41530168062,"spatialReference":{"wkid":27700,"latestWkid":27700}}'
    ],
    'geometryType': ['esriGeometryEnvelope'],
    'inSR': ['27700'],
    'outFields': ['OBJECTID,Item_Type,Item_Identity_Code,Location_Description,Assigned_Street,Locality,Town,Type,Bracket_Type,Lantern_Type,Lamp_Type,Ballast_Type,Control_Type,Sign_Lantern_Type,Sign_Bracket_Type,Sign_Post_Type,Bollard_Base_Type,Bollard_Shell_Type,Column_Manufacturer,Material_Type,Lamp_Wattage,Lantern_Manufacturer,Number_of_Lamps,Switching_Regime_Code,Switching_Regime,Lamp_Type2,Easting,Northing'],
    'outSR': ['27700'],
    'returnGeometry': ['true'],
    'spatialRel': ['esriSpatialRelIntersects']
}
# url without parameters
url = 'https://gis2.westberks.gov.uk/arcgis/rest/services/maps/Wbc_Highways/MapServer/11/query'
response = requests.get(url, params=params)
#print(response.url)
#print(response.status_code)
data = response.json()
# print a few fields of every feature returned for the envelope
for item in data['features']:
    print('Locality:', item['attributes']['Locality'].strip())
    print('Town :', item['attributes']['Town'].strip())
    print('Street :', item['attributes']['Assigned_Street'].strip())
    print('Geometry:', item['geometry'])
    print('---')
Result:
Locality: BRIGHTWALTON
Town : NEWBURY
Street : SAXONS ACRE
Geometry: {'x': 442763, 'y': 179193}
---
Locality: BRIGHTWALTON
Town : NEWBURY
Street : ASH CLOSE
Geometry: {'x': 442782, 'y': 179248}
---
Locality: BRIGHTWALTON
Town : NEWBURY
Street : SAXONS ACRE
Geometry: {'x': 442770, 'y': 179214}
---
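Since the goal is to download all locations, here is a minimal sketch of how the features could be flattened into rows and written as CSV. The function names features_to_rows and rows_to_csv are my own, and the OBJECTID values in the sample are placeholders that I made up. It assumes OBJECTID is among the requested outFields, so that the same light returned by two overlapping queries is kept only once.

```python
import csv
import io


def features_to_rows(data):
    """Turn the 'features' list of a query response into flat dicts."""
    rows = {}
    for item in data.get("features", []):
        # strip the trailing spaces that the server pads text fields with
        row = {
            key: value.strip() if isinstance(value, str) else value
            for key, value in item["attributes"].items()
        }
        geometry = item.get("geometry") or {}
        row["x"] = geometry.get("x")
        row["y"] = geometry.get("y")
        # key by OBJECTID so overlapping queries do not create duplicates;
        # fall back to a positional key if the field was not requested
        key = row.get("OBJECTID", ("row", len(rows)))
        rows[key] = row
    return list(rows.values())


def rows_to_csv(rows):
    """Serialise the rows as CSV text (write it to a file if needed)."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()),
                            extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# hand-made sample in the same shape as the response (placeholder OBJECTIDs)
sample = {"features": [
    {"attributes": {"OBJECTID": 1, "Town": "NEWBURY   ",
                    "Assigned_Street": "SAXONS ACRE   "},
     "geometry": {"x": 442763, "y": 179193}},
    {"attributes": {"OBJECTID": 2, "Town": "NEWBURY   ",
                    "Assigned_Street": "ASH CLOSE   "},
     "geometry": {"x": 442782, "y": 179248}},
]}
print(rows_to_csv(features_to_rows(sample)))
```

With real data you would pass response.json() instead of the sample, and merge the rows from several envelopes before writing the file.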
EDIT:
Version in R
> install.packages("jsonlite")
> library(jsonlite)
> URL = 'https://gis2.westberks.gov.uk/arcgis/rest/services/maps/Wbc_Highways/MapServer/11/query?f=json&returnGeometry=true&spatialRel=esriSpatialRelIntersects&geometry={"xmin":442520.89976831735,"ymin":178788.66371417744,"xmax":443314.65135582053,"ymax":179582.41530168062,"spatialReference":{"wkid":27700,"latestWkid":27700}}&geometryType=esriGeometryEnvelope&inSR=27700&outFields=OBJECTID,Item_Type,Item_Identity_Code,Location_Description,Assigned_Street,Locality,Town,Type,Bracket_Type,Lantern_Type,Lamp_Type,Ballast_Type,Control_Type,Sign_Lantern_Type,Sign_Bracket_Type,Sign_Post_Type,Bollard_Base_Type,Bollard_Shell_Type,Column_Manufacturer,Material_Type,Lamp_Wattage,Lantern_Manufacturer,Number_of_Lamps,Switching_Regime_Code,Switching_Regime,Lamp_Type2,Easting,Northing&outSR=27700'
> data <- fromJSON(URL)
> library(magrittr) # to use `%>%`
> data <- URL %>% fromJSON
> data$features$attributes$Locality
[1] "BRIGHTWALTON "
[2] "BRIGHTWALTON "
[3] "BRIGHTWALTON "
> data$features$attributes$Town
[1] "NEWBURY "
[2] "NEWBURY "
[3] "NEWBURY "
> data$features$attributes$Assigned_Street
[1] "SAXONS ACRE "
[2] "ASH CLOSE "
[3] "SAXONS ACRE "
> data$features$geometry
       x      y
1 442763 179193
2 442782 179248
3 442770 179214
> library(stringr)
> data$features$attributes$Locality %>% str_trim
[1] "BRIGHTWALTON" "BRIGHTWALTON" "BRIGHTWALTON"
EDIT:
Something similar to the Python version
> library(magrittr) # to use `%>%`
> library(httr)
> library(jsonlite)
> URL = 'https://gis2.westberks.gov.uk/arcgis/rest/services/maps/Wbc_Highways/MapServer/11/query'
> query = list(
    f = list('json'),
    geometry = list(
        '{"xmin":442520.89976831735,"ymin":178788.66371417744,"xmax":443314.65135582053,"ymax":179582.41530168062,"spatialReference":{"wkid":27700,"latestWkid":27700}}'
    ),
    geometryType = list('esriGeometryEnvelope'),
    inSR = list('27700'),
    outFields = list('OBJECTID,Item_Type,Item_Identity_Code,Location_Description,Assigned_Street,Locality,Town,Type,Bracket_Type,Lantern_Type,Lamp_Type,Ballast_Type,Control_Type,Sign_Lantern_Type,Sign_Bracket_Type,Sign_Post_Type,Bollard_Base_Type,Bollard_Shell_Type,Column_Manufacturer,Material_Type,Lamp_Wattage,Lantern_Manufacturer,Number_of_Lamps,Switching_Regime_Code,Switching_Regime,Lamp_Type2,Easting,Northing'),
    outSR = list('27700'),
    returnGeometry = list('true'),
    spatialRel = list('esriSpatialRelIntersects')
  )
> response <- GET(URL, query=query)
> data <- response %>% content %>% fromJSON
> data <- GET(URL, query=query) %>% content %>% fromJSON
> items <- data$features