I am trying to scrape weather data (RAWS data to be specific) from this WRCC webpage:
I have provided my full R code below, along with a screenshot of the error message.
# Set up RSelenium -------------------------------------------------------
## load packages ----
library(RSelenium)
library(tidyverse)
library(netstat)
library(here)
library(dplyr)
library(readr)
## Open a chrome browser session with RSelenium ----
rs_driver_object <- rsDriver(
  browser = 'chrome',
  chromever = '108.0.5359.71',
  port = free_port(),
  # eCaps is not defined in this snippet; define it first or drop this argument
  extraCapabilities = eCaps
)
remDr <- rs_driver_object$client
#Navigate to RAWS website
remDr$navigate("https://wrcc.dri.edu/cgi-bin/rawMAIN.pl?caucgr")
#Switch to Left Frame named "List"
ListFrame <- remDr$findElement(using = "name", value = "List")
remDr$switchToFrame(ListFrame)
#Select Daily Summary Time Series Link
Link1 <- remDr$findElement(using = "link text", value = "Daily Summary Time Series")
Link1$clickElement()
#Switch to Right Frame named "Graph"
GraphFrame <- remDr$findElement(using = "name", value = "Graph")
remDr$switchToFrame(GraphFrame)
CodePudding user response:
You can actually gather the data from the site without RSelenium. Look into httr2. Here I gathered info from the daily summary:
library(tidyverse)
library(rvest)
library(httr2)
"https://wrcc.dri.edu/cgi-bin/wea_daysum2.pl" %>%
  request() %>%
  req_body_form(
    stn = "UCGR",
    mon = 12,
    day = 20,
    yea = 22,
    unit = "E",
    typ = "reg"
  ) %>%
  req_perform() %>%
  resp_body_html() %>%
  html_table()
# A tibble: 45 × 24
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 Hour Total "" "" "" "" "" Air "" Soil "" Rela… "" "" "" "" ""
2 of Day Solar "" "Win… "Win… "Win… "" Temp… "" Temp… "" Humi… "" "Dew" "Wet" "" "Bar…
3 Ending… Rad. "" "Ave… "V. … "Max… "" Mean "" Mean "" Mean "" "Poi… "Bul… "" "Pre…
4 L.S.T. ° ly. "" "mph" "Deg" "mph" "" Deg.… "" Deg.… "" Perc… "" "Deg… "Deg… "" "in.…
5 1 am 0.0 "" "3.6" "288" "4.7" "" 36.1 "" 42.2 "" 29 "" "7" "27" "" "25.…
6 2 am 0.0 "" "3.7" "284" "4.9" "" 35.3 "" 41.8 "" 32 "" "8" "26" "" "25.…
7 3 am 0.0 "" "3.6" "287" "5.1" "" 36.2 "" 41.4 "" 32 "" "9" "27" "" "25.…
8 4 am 0.0 "" "2.3" "288" "4.3" "" 37.7 "" 41.0 "" 30 "" "9" "28" "" "25.…
9 5 am 0.0 "" "1.9" "284" "3.8" "" 37.1 "" 40.7 "" 31 "" "9" "28" "" "25.…
10 6 am 0.0 "" "3.2" "286" "5.8" "" 37.1 "" 40.4 "" 30 "" "8" "28" "" "25.…
# … with 35 more rows, and 7 more variables: X18 <chr>, X19 <chr>, X20 <chr>, X21 <chr>, X22 <chr>,
# X23 <chr>, X24 <chr>
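The printed table still carries its column labels spread over the first rows, plus several empty spacer columns. A minimal base R sketch for tidying it could look like the following. The helper name `tidy_wrcc_table` is hypothetical, and the default of four header rows is an assumption read off the output above, so check both against the real page before relying on it.

```r
# Hypothetical helper (not part of rvest or httr2): collapse the leading
# header rows into column names and drop columns that are empty everywhere.
# n_header = 4 is an assumption based on the printed output above.
tidy_wrcc_table <- function(tbl, n_header = 4) {
  tbl <- as.data.frame(tbl, stringsAsFactors = FALSE)

  # TRUE for cells that hold something other than NA or whitespace
  has_text <- function(col) !is.na(col) & nzchar(trimws(col))

  # Drop spacer columns that contain no text in any row
  keep <- vapply(tbl, function(col) any(has_text(col)), logical(1))
  tbl <- tbl[, keep, drop = FALSE]

  # Paste the non-empty header cells of each column into one label
  header <- vapply(
    tbl[seq_len(n_header), , drop = FALSE],
    function(col) paste(trimws(col[has_text(col)]), collapse = " "),
    character(1)
  )

  # Everything below the header rows is data
  body <- tbl[-seq_len(n_header), , drop = FALSE]
  names(body) <- make.unique(unname(header))
  rownames(body) <- NULL
  body
}
```

Since `html_table()` on a whole document returns a list of tables, this would be applied to a single element, for example `tidy_wrcc_table(tables[[1]])` if the result of the pipeline above were stored in `tables`. The values stay as character strings, so they would still need converting with something like `as.numeric()` or `readr::type_convert()`.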
CodePudding user response:
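I was able to solve the problem thanks to a colleague. After the first switchToFrame() call the driver was still inside the "List" frame, so it could not find the "Graph" frame from there. I just needed to add
remDr$switchToFrame(NA)
prior to this code block:
GraphFrame <- remDr$findElement(using = "name", value = "Graph")
remDr$switchToFrame(GraphFrame)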
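Put together, the corrected sequence from the question might look like the sketch below. The wrapper function `open_daily_summary` is a hypothetical name added for illustration; the calls inside it are the ones already used in this thread, and `remDr` is assumed to be the RSelenium client created in the question.

```r
# Hypothetical wrapper around the steps from the question, with the fix applied.
# `remDr` is assumed to be the client returned by rsDriver()$client.
open_daily_summary <- function(remDr) {
  # Enter the left frame and click the link
  list_frame <- remDr$findElement(using = "name", value = "List")
  remDr$switchToFrame(list_frame)
  link <- remDr$findElement(using = "link text",
                            value = "Daily Summary Time Series")
  link$clickElement()

  # The fix: leave the "List" frame before looking for "Graph"
  remDr$switchToFrame(NA)

  # Now the right frame can be located from the top-level page
  graph_frame <- remDr$findElement(using = "name", value = "Graph")
  remDr$switchToFrame(graph_frame)

  invisible(remDr)
}
```

The same pattern should apply whenever you move between the two frames: switch back out first, then into the other one.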