Web scraping JSON files in R

Time:10-13

I am trying to scrape public data from the UNHCR website.

To do this I started with the code below, but I don't know how to proceed.

library(jsonlite)
url <- 'https://data2.unhcr.org/en/situations/mediterranean#'


# Removes last character, i.e. #
url <- substr(url, 1, nchar(url)-1)

# Encodes URL to avoid errors
url <- URLencode(url)

# Extracts JSON from URL
json_extract <- fromJSON(url)

# Converts relevant list into a data.frame
df <- data.frame(json_extract[['items']])

Can anybody help me with code to download this data into a table?

CodePudding user response:

Right-click on the page and choose "Inspect", then open the "Network" tab. Click one of the JSON buttons on the page and the request it triggers will appear in the list. The URL of that request is the one you need to use.

library(jsonlite)

url <- 'https://data2.unhcr.org/population/get/timeseries'
query <- '?widget_id=267293&sv_id=11&population_group=4797,4798&frequency=month&fromDate=2015-01-01'

# Extracts JSON from URL
json_extract <- fromJSON(paste0(url, query))

# Extracts data.frame from list (there might be other info in the list you want too)
df <- json_extract$data$timeseries

# Others
# https://data2.unhcr.org/population/?widget_id=267298&sv_id=11&population_group=4797,4798&year=latest # Total Arrivals
# https://data2.unhcr.org/population/?widget_id=267299&sv_id=11&population_group=4797&year=latest # Sea Arrivals
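If you want to change the parameters (widget, population group, start date) without editing a long string by hand, one option is to build the query from a named list. This is only a sketch: `build_query` and `fetch_timeseries` are hypothetical helper names, not part of jsonlite, and it assumes the endpoint still accepts the same parameters as in the query above.

```r
# Hypothetical helper: turns a named list into a "?key=value&key=value" string.
build_query <- function(params) {
  paste0("?", paste(names(params), unlist(params), sep = "=", collapse = "&"))
}

# Hypothetical helper: downloads the JSON and returns the timeseries
# data.frame, or NULL (with a message) if the request or parsing fails.
fetch_timeseries <- function(url) {
  tryCatch(
    jsonlite::fromJSON(url)$data$timeseries,
    error = function(e) {
      message("Request failed: ", conditionMessage(e))
      NULL
    }
  )
}

base_url <- "https://data2.unhcr.org/population/get/timeseries"

# Same parameter values as in the query string above
params <- list(
  widget_id        = 267293,
  sv_id            = 11,
  population_group = "4797,4798",
  frequency        = "month",
  fromDate         = "2015-01-01"
)

full_url <- paste0(base_url, build_query(params))
```

Then `df <- fetch_timeseries(full_url)` should give the same data.frame as before, provided the site has not changed its API.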

Gives:

tail(df)
   month year unix_timestamp individuals
76     5 2021     1622160000        9401
77     6 2021     1624838400        9245
78     7 2021     1627430400       12565
79     8 2021     1630108800       15749
80     9 2021     1632787200       16052
81    10 2021     1635379200        1902
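The `unix_timestamp` column is in seconds since 1970-01-01, so you can turn it into a proper date column if you need one for plotting or filtering. A minimal sketch, using two rows copied from the output above in place of the downloaded data and assuming the timestamps are meant as UTC:

```r
# Sample rows copied from the output above (stand-in for the downloaded df)
df <- data.frame(
  month          = c(9, 10),
  year           = c(2021, 2021),
  unix_timestamp = c(1632787200, 1635379200),
  individuals    = c(16052, 1902)
)

# Convert seconds since the epoch to a date-time, then keep only the date part
df$date <- as.Date(
  as.POSIXct(df$unix_timestamp, origin = "1970-01-01", tz = "UTC"),
  tz = "UTC"
)
```

With the real data you would skip the `data.frame(...)` part and run only the last statement on the `df` extracted from the JSON.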