httr GET function time-out

Time:01-04

I am getting a time-out with the GET function from the httr package in R with these settings:

GET("https://isir.justice.cz/isir/common/index.do", add_headers(.headers = c('"authority"="isir.justice.cz",
                                                                         "scheme"="https",
                                                                         "path"="/isir/common/index.do",
                                                                         "cache-control"="max-age=0",
                                                                         "sec-ch-ua-mobile"="?0",
                                                                         "sec-ch-ua-platform"= "Windows",
                                                                         "upgrade-insecure-requests"="1",
                                                                         "accept"="text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
                                                                         "sec-fetch-site"="none",
                                                                         "sec-fetch-mode"="navigate",
                                                                         "sec-fetch-user"="?1",
                                                                         "sec-fetch-dest"="document",
                                                                         "accept-encoding"="gzip, deflate, br",
                                                                         "accept-language"="cs-CZ,cs;q=0.9"'
                                                                         )))

But the seemingly identical query via PowerShell returns a webpage.

Invoke-WebRequest -UseBasicParsing -Uri "https://isir.justice.cz/isir/common/index.do" `
-WebSession $session `
-Headers @{
"method"="GET"
  "authority"="isir.justice.cz"
  "scheme"="https"
  "path"="/isir/common/index.do"
  "cache-control"="max-age=0"
  "sec-ch-ua-mobile"="?0"
  "sec-ch-ua-platform"="`"Windows`""
  "upgrade-insecure-requests"="1"
  "accept"="text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9"
  "sec-fetch-site"="none"
  "sec-fetch-mode"="navigate"
  "sec-fetch-user"="?1"
  "sec-fetch-dest"="document"
  "accept-encoding"="gzip, deflate, br"
  "accept-language"="cs-CZ,cs;q=0.9"
}

Do I have a problem with my R code, or is it simply a matter of a difference between using R and PowerShell?

CodePudding user response:

Your code didn't run for me as it had an extra ' somewhere. After correcting this, it ran fine. If you keep getting time-outs, you can increase the maximum request time using timeout():

library(httr)
x <- GET("https://isir.justice.cz/isir/common/index.do", timeout(10), add_headers(
  .headers = c("authority" = "isir.justice.cz",
               "scheme" = "https",
               "path" = "/isir/common/index.do",
               "cache-control" = "max-age=0",
               "sec-ch-ua-mobile" = "?0",
               "sec-ch-ua-platform" =  "Windows",
               "upgrade-insecure-requests" = "1",
               "accept" = "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
               "sec-fetch-site" = "none",
               "sec-fetch-mode" = "navigate",
               "sec-fetch-user" = "?1",
               "sec-fetch-dest" = "document",
               "accept-encoding" = "gzip, deflate, br",
               "accept-language" = "cs-CZ,cs;q=0.9")
))
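
To illustrate the quoting problem mentioned above: in the question as posted, the whole header list sits inside one pair of single quotes, so R reads it as a single unnamed string rather than a named vector of headers. Here is a minimal base-R sketch with just two of the headers (the variable names broken and fixed are only for illustration). It shows the difference in structure only; it does not establish that this alone caused the time-out.

```r
# One pair of single quotes around everything: a single unnamed string.
broken <- c('"authority"="isir.justice.cz",
             "scheme"="https"')

# Separate name = value pairs: a named character vector with two elements.
fixed <- c("authority" = "isir.justice.cz",
           "scheme" = "https")

length(broken)  # 1
names(broken)   # NULL
length(fixed)   # 2
names(fixed)    # "authority" "scheme"
```

Checking length() and names() on the vector before passing it to add_headers() is a quick way to spot this kind of mistake.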

As a side note: there is a successor package by the same people called httr2. I'm still using httr myself, but it's probably a good idea to learn the new package. Here is what that would look like:

library(httr2)

req <- request("https://isir.justice.cz/isir/common/index.do") %>% 
  req_headers("authority" = "isir.justice.cz",
              "scheme" = "https",
              "path" = "/isir/common/index.do",
              "cache-control" = "max-age=0",
              "sec-ch-ua-mobile" = "?0",
              "sec-ch-ua-platform" =  "Windows",
              "upgrade-insecure-requests" = "1",
              "accept" = "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
              "sec-fetch-site" = "none",
              "sec-fetch-mode" = "navigate",
              "sec-fetch-user" = "?1",
              "sec-fetch-dest" = "document",
              "accept-encoding" = "gzip, deflate, br",
              "accept-language" = "cs-CZ,cs;q=0.9") %>% 
  req_timeout(seconds = 10)

# check your request in a dry run
req %>% 
  req_dry_run()
#> GET /isir/common/index.do HTTP/1.1
#> Host: isir.justice.cz
#> User-Agent: httr2/0.1.1 r-curl/4.3.2 libcurl/7.80.0
#> authority: isir.justice.cz
#> scheme: https
#> path: /isir/common/index.do
#> cache-control: max-age=0
#> sec-ch-ua-mobile: ?0
#> sec-ch-ua-platform: Windows
#> upgrade-insecure-requests: 1
#> accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
#> sec-fetch-site: none
#> sec-fetch-mode: navigate
#> sec-fetch-user: ?1
#> sec-fetch-dest: document
#> accept-encoding: gzip, deflate, br
#> accept-language: cs-CZ,cs;q=0.9

resp <- req_perform(req)
resp
#> <httr2_response>
#> GET https://isir.justice.cz/isir/common/index.do
#> Status: 200 OK
#> Content-Type: text/html
#> Body: In memory (116916 bytes)

Created on 2022-01-03 by the reprex package (v2.0.1)
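
If the server is only slow some of the time, one option is to retry with progressively longer timeouts instead of picking a single large value. The following is a rough base-R sketch, not part of httr or httr2: try_timeouts is a hypothetical helper, and fetch is assumed to be any function that takes a timeout in seconds and raises an error when the request fails.

```r
# Hypothetical helper (not from httr/httr2): call fetch(seconds) with each
# timeout in turn and return the first result that does not raise an error.
try_timeouts <- function(fetch, timeouts = c(10, 30, 60)) {
  last_error <- NULL
  for (seconds in timeouts) {
    # Turn an error into a value so the loop can continue.
    result <- tryCatch(fetch(seconds), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
    message("Attempt with timeout ", seconds, "s failed: ",
            conditionMessage(result))
  }
  # Every attempt failed: re-signal the last error.
  stop(last_error)
}

# Intended use with httr (not run here, it needs network access):
# resp <- try_timeouts(function(s) {
#   httr::GET("https://isir.justice.cz/isir/common/index.do", httr::timeout(s))
# })
```

Passing the request as a function keeps the helper independent of the HTTP package, so the same wrapper could be used around an httr2 request built with req_timeout().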
