I am getting a timeout with the GET function from the httr package in R with these settings:
GET("https://isir.justice.cz/isir/common/index.do", add_headers(.headers = c('"authority"="isir.justice.cz",
"scheme"="https",
"path"="/isir/common/index.do",
"cache-control"="max-age=0",
"sec-ch-ua-mobile"="?0",
"sec-ch-ua-platform"= "Windows",
"upgrade-insecure-requests"="1",
"accept"="text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
"sec-fetch-site"="none",
"sec-fetch-mode"="navigate",
"sec-fetch-user"="?1",
"sec-fetch-dest"="document",
"accept-encoding"="gzip, deflate, br",
"accept-language"="cs-CZ,cs;q=0.9"'
)))
But the seemingly identical query via PowerShell returns a web page.
Invoke-WebRequest -UseBasicParsing -Uri "https://isir.justice.cz/isir/common/index.do" `
-WebSession $session `
-Headers @{
"method"="GET"
"authority"="isir.justice.cz"
"scheme"="https"
"path"="/isir/common/index.do"
"cache-control"="max-age=0"
"sec-ch-ua-mobile"="?0"
"sec-ch-ua-platform"="`"Windows`""
"upgrade-insecure-requests"="1"
"accept"="text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9"
"sec-fetch-site"="none"
"sec-fetch-mode"="navigate"
"sec-fetch-user"="?1"
"sec-fetch-dest"="document"
"accept-encoding"="gzip, deflate, br"
"accept-language"="cs-CZ,cs;q=0.9"
}
Do I have a problem with my R code, or is it simply a matter of differences between R and PowerShell?
CodePudding user response:
Your code didn't run for me as it had an extra ' somewhere. Correcting this, it ran fine. If you keep getting timeout messages, you can increase the maximum request time using timeout():
library(httr)

# Allow up to 10 seconds before the request is abandoned as timed out
x <- GET(
  "https://isir.justice.cz/isir/common/index.do",
  timeout(10),
  add_headers(.headers = c(
    "authority" = "isir.justice.cz",
    "scheme" = "https",
    "path" = "/isir/common/index.do",
    "cache-control" = "max-age=0",
    "sec-ch-ua-mobile" = "?0",
    "sec-ch-ua-platform" = "Windows",
    "upgrade-insecure-requests" = "1",
    "accept" = "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
    "sec-fetch-site" = "none",
    "sec-fetch-mode" = "navigate",
    "sec-fetch-user" = "?1",
    "sec-fetch-dest" = "document",
    "accept-encoding" = "gzip, deflate, br",
    "accept-language" = "cs-CZ,cs;q=0.9"
  ))
)
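To illustrate what the stray quote does, here is a small base-R sketch (not part of the original code; the names `quoted` and `named` and the shortened header list are made up for illustration). Wrapping the whole list in single quotes turns it into one long string, so c() returns a single unnamed element and there are no header names for add_headers() to pick up:

```r
# Hypothetical, shortened versions of the header vector, for illustration only.

# With stray single quotes around everything, c() sees ONE string.
quoted <- c('"authority"="isir.justice.cz",
"scheme"="https"')

# Without them, c() builds a named character vector: one element per header.
named <- c("authority" = "isir.justice.cz",
           "scheme" = "https")

length(quoted)  # 1
names(quoted)   # NULL
length(named)   # 2
names(named)    # "authority" "scheme"
```

Checking names() on the vector before passing it to add_headers() is a quick way to spot this kind of quoting slip.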
As a side note: there is a successor package by the same people called httr2. I'm also still using httr, but it's probably a good idea to learn the new package. Here is how that would look:
library(httr2)

# Build the request first; nothing is sent until req_perform() is called
req <- request("https://isir.justice.cz/isir/common/index.do") %>%
  req_headers(
    "authority" = "isir.justice.cz",
    "scheme" = "https",
    "path" = "/isir/common/index.do",
    "cache-control" = "max-age=0",
    "sec-ch-ua-mobile" = "?0",
    "sec-ch-ua-platform" = "Windows",
    "upgrade-insecure-requests" = "1",
    "accept" = "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
    "sec-fetch-site" = "none",
    "sec-fetch-mode" = "navigate",
    "sec-fetch-user" = "?1",
    "sec-fetch-dest" = "document",
    "accept-encoding" = "gzip, deflate, br",
    "accept-language" = "cs-CZ,cs;q=0.9"
  ) %>%
  req_timeout(seconds = 10)
# check your request in a dry run
req %>%
req_dry_run()
#> GET /isir/common/index.do HTTP/1.1
#> Host: isir.justice.cz
#> User-Agent: httr2/0.1.1 r-curl/4.3.2 libcurl/7.80.0
#> authority: isir.justice.cz
#> scheme: https
#> path: /isir/common/index.do
#> cache-control: max-age=0
#> sec-ch-ua-mobile: ?0
#> sec-ch-ua-platform: Windows
#> upgrade-insecure-requests: 1
#> accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
#> sec-fetch-site: none
#> sec-fetch-mode: navigate
#> sec-fetch-user: ?1
#> sec-fetch-dest: document
#> accept-encoding: gzip, deflate, br
#> accept-language: cs-CZ,cs;q=0.9
resp <- req_perform(req)
resp
#> <httr2_response>
#> GET https://isir.justice.cz/isir/common/index.do
#> Status: 200 OK
#> Content-Type: text/html
#> Body: In memory (116916 bytes)
Created on 2022-01-03 by the reprex package (v2.0.1)
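If the timeout still fires from time to time, one option is to catch the error rather than let the script stop. This is only a sketch under my own assumptions: the helper name `fetch_or_null` is made up, and the header list is trimmed for brevity. It relies on req_perform() signalling an R error when the request times out or returns an HTTP error status, which tryCatch() can intercept:

```r
library(httr2)

# Hypothetical helper: perform a request, returning NULL instead of
# stopping when it times out or fails for another reason.
fetch_or_null <- function(req) {
  tryCatch(
    req_perform(req),
    error = function(e) {
      message("Request failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Trimmed-down request (assumption: one header is enough for the illustration)
req <- request("https://isir.justice.cz/isir/common/index.do") %>%
  req_headers("accept-language" = "cs-CZ,cs;q=0.9") %>%
  req_timeout(seconds = 10)

resp <- fetch_or_null(req)

if (!is.null(resp)) {
  print(resp_status(resp))            # HTTP status code of the response
  print(length(resp_body_raw(resp)))  # body size in bytes
}
```

Returning NULL keeps the calling code simple: it only has to check is.null(resp) before touching the response, and the printed message still tells you why the request failed.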