I am trying to extract the table from the website https://www.bseindia.com/corporates/Sharehold_Searchnew.aspx?expandable=3 using RSelenium:
library(RSelenium)
shell('docker pull selenium/standalone-chrome')
Sys.sleep(2)
shell('docker run -d -p 4445:4444 selenium/standalone-chrome')
Sys.sleep(2)
remDr <- remoteDriver(remoteServerAddr = "localhost" , port = 4445L, browserName = "chrome")
Sys.sleep(3)
remDr$open()
remDr$setTimeout(type = "script", milliseconds = 30000000000)
remDr$navigate("https://www.bseindia.com/corporates/Sharehold_Searchnew.aspx?expandable=3")
webelemtemp<-remDr$findElement(using = "xpath", value = "//*[@id='ContentPlaceHolder1_gvData']/tbody/tr[27]/td/table/tbody")
count<-1
This gives the following error:
Error: Summary: NoSuchElement
Detail: An element could not be located on the page using the given search parameters.
class: org.openqa.selenium.NoSuchElementException
Further Details: run errorDetails method
CodePudding user response:
To extract the table, we can take the page source from the Selenium session and parse it with rvest:
library(dplyr)
library(rvest)
remDr$navigate("https://www.bseindia.com/corporates/Sharehold_Searchnew.aspx?expandable=3")
remDr$getPageSource()[[1]] %>%
read_html() %>% html_nodes('.mGrid') %>%
html_table()
[[1]]
# A tibble: 4 x 5
  `Security Code` `Security Name`         Industry                     `For Quarter Ending` XBRL
            <int> <chr>                   <chr>                        <chr>                <lgl>
1          509055 VISAKA INDUSTRIES LTD.  Cement & Cement Products     09 Feb 2022          NA
2          532687 REPRO INDIA LTD.        Comm.Printing/Stationery     December 2021        NA
3          538734 Ceinsys Tech Ltd        IT Consulting & Software     09 Feb 2022          NA
4          532695 CELEBRITY FASHIONS LTD. Other Apparels & Accessories 27 Jan 2022          NA
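If you would rather keep using findElement, one possible cause of the NoSuchElement error is that the XPath in the question (row tr[27] with a nested table) does not match anything on the page at the moment of the lookup. The sketch below is a hypothetical helper, wait_for_elements, which is not part of RSelenium. It polls findElements until something matches or a timeout passes, and it assumes findElements returns an empty list when nothing matches. The selector in the usage comment is also an assumption, based on the mGrid class used above.

```r
# Hypothetical helper (not part of RSelenium): poll findElements() until at
# least one element matches, or give up after `timeout` seconds.
wait_for_elements <- function(remDr, using, value, timeout = 10, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    # Assumption: findElements() returns an empty list when nothing matches.
    found <- remDr$findElements(using = using, value = value)
    if (length(found) > 0) return(found)
    if (Sys.time() >= deadline) return(list())
    Sys.sleep(interval)
  }
}

# Usage with a live session (selector is an assumption):
# rows <- wait_for_elements(remDr, "css selector", "table.mGrid tr")
# if (length(rows) == 0) message("No matching rows; check the selector.")
```

An empty result then tells you the selector itself is wrong, rather than the page being slow to load.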
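But a faster approach would be to skip Selenium and read the page directly with rvest:
url <- "https://www.bseindia.com/corporates/Sharehold_Searchnew.aspx?expandable=3"
url %>%
read_html() %>%
html_nodes('.mGrid') %>%
html_table()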
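To see what the selector step does without hitting the site, here is a minimal offline sketch. The HTML string is a hypothetical stand-in for the real page: it only assumes the table carries the mGrid class, and it reuses two rows from the output above.

```r
library(rvest)

# Hypothetical stand-in for the BSE page: a single table with class "mGrid".
html <- '<html><body>
<table class="mGrid">
<tr><th>Security Code</th><th>Security Name</th></tr>
<tr><td>509055</td><td>VISAKA INDUSTRIES LTD.</td></tr>
<tr><td>532687</td><td>REPRO INDIA LTD.</td></tr>
</table>
</body></html>'

# html_table() returns a list with one data frame per matched table.
tables <- read_html(html) %>%
  html_nodes(".mGrid") %>%
  html_table()

shareholding <- tables[[1]]
print(shareholding)
```

Because html_table() always returns a list, pick the table you want with [[1]] before working with it as a data frame.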