I want to scrape table data from this website using R.
My code:
library(XML)
url <- "https://www.westmetall.com/en/markdaten.php?action=show_table&field=LME_Cu_cash"
doc <- htmlParse(url)
tableNodes = getNodeSet(doc,"//table")
tb = readHTMLTable(tableNodes[[1]])
But I get an error when I run it.
CodePudding user response:
You can do this with the {rvest} package:
library(rvest)
url <- "https://www.westmetall.com/en/markdaten.php?action=show_table&field=LME_Cu_cash"
tables <- read_html(url) |>
  html_table()
The tables list contains all the tables found on the page. You can inspect it with str():
str(tables)
#> List of 5
#> $ : tibble [7 × 4] (S3: tbl_df/tbl/data.frame)
#> ..$ Official LME-Prices in US Dollar: chr [1:7] "in US Dollar per ton" "Copper" "Tin" "Lead" ...
#> ..$ 07. October 2022 : chr [1:7] "Settlement Kasse" "7,575.50" "20,000.00" "2,078.00" ...
#> ..$ : chr [1:7] "3 months" "7,554.00" "19,950.00" "2,050.00" ...
#> ..$ : chr [1:7] "Chart\nTable\nAverage" "" "" "" ...
#> $ : tibble [7 × 4] (S3: tbl_df/tbl/data.frame)
#> ..$ LME stocks : chr [1:7] "in tons" "Copper" "Tin" "Lead" ...
#> ..$ 07. October 2022: chr [1:7] "" "143,775" "4,690" "31,875" ...
#> ..$ Changes : chr [1:7] "" "3,575" "15" "0" ...
#> ..$ : chr [1:7] "Chart\nTable\nAverage" "" "" "" ...
#> $ : tibble [3 × 4] (S3: tbl_df/tbl/data.frame)
#> ..$ Exchange Rates : chr [1:3] "EUR/USD LME-FX-rate (MTLE)" "ECB-Fixing (14:15 Uhr)" "EUR/USD-Basis DEL-Notiz"
#> ..$ 07. October 2022: num [1:3] 0.979 0.98 0.979
#> ..$ 06. October 2022: num [1:3] 0.987 0.986 0.987
#> ..$ : logi [1:3] NA NA NA
#> $ : tibble [15 × 4] (S3: tbl_df/tbl/data.frame)
#> ..$ German Metal Prices: chr [1:15] "in Euro per 100 kg" "lower Copper WM-Notiz" "higher Copper WM-Notiz" "lower DEL-Notiz (until February 11, 2022)" ...
#> ..$ 07. October 2022 : chr [1:15] "" "786.54" "789.89" "-" ...
#> ..$ 06. October 2022 : chr [1:15] "" "797.52" "800.84" "-" ...
#> ..$ : chr [1:15] "Chart\nTable\nAverage" "" "" "" ...
#> $ : tibble [5 × 4] (S3: tbl_df/tbl/data.frame)
#> ..$ Precious metals : chr [1:5] "Gold London Fixing in USD/oz." "Gold in Euro/kg" "Gold, processed in Euro/kg" "Fine Silver in Euro/kg" ...
#> ..$ 07. October 2022: chr [1:5] "1,711.50" "55,190.00" "62,080.00" "666.90 / 733.80" ...
#> ..$ 06. October 2022: chr [1:5] "1,716.00" "54,840.00" "61,670.00" "658.90 / 725.10" ...
#> ..$ : logi [1:5] NA NA NA NA NA
Then you just have to pick the table you want and format it as you wish:
tables[[2]]
#> # A tibble: 7 × 4
#>   `LME stocks` `07. October 2022` Changes  ``
#>   <chr>        <chr>              <chr>    <chr>
#> 1 in tons      ""                 ""       "Chart\nTable\nAverage"
#> 2 Copper       "143,775"          "3,575"  ""
#> 3 Tin          "4,690"            "15"     ""
#> 4 Lead         "31,875"           "0"      ""
#> 5 Zinc         "53,475"           "150"    ""
#> 6 Aluminium    "327,625"          "-1,225" ""
#> 7 Nickel       "52,362"           "942"    ""
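As one possible way to do the formatting step, here is a minimal base R sketch. The helper clean_stocks() and the column names metal, stock_tons and change_tons are my own hypothetical choices, not anything defined by the page or by rvest. To keep it runnable without network access it uses a stand-in data frame with the values copied from the output above; in practice you would pass tables[[2]] instead.

```r
# Stand-in for tables[[2]]; values copied from the printed tibble above.
stocks_raw <- data.frame(
  c("in tons", "Copper", "Tin", "Lead", "Zinc", "Aluminium", "Nickel"),
  c("", "143,775", "4,690", "31,875", "53,475", "327,625", "52,362"),
  c("", "3,575", "15", "0", "150", "-1,225", "942"),
  c("Chart\nTable\nAverage", "", "", "", "", "", ""),
  stringsAsFactors = FALSE
)
names(stocks_raw) <- c("LME stocks", "07. October 2022", "Changes", "")

# Hypothetical helper: assumes the layout shown above
# (units in the first row, link labels in the fourth column).
clean_stocks <- function(tbl) {
  tbl <- as.data.frame(tbl, stringsAsFactors = FALSE)
  tbl <- tbl[-1, 1:3]  # drop the "in tons" row and the links column
  names(tbl) <- c("metal", "stock_tons", "change_tons")

  # Remove thousands separators before converting to numeric.
  to_number <- function(x) as.numeric(gsub(",", "", x, fixed = TRUE))
  tbl$stock_tons <- to_number(tbl$stock_tons)
  tbl$change_tons <- to_number(tbl$change_tons)

  rownames(tbl) <- NULL
  tbl
}

stocks <- clean_stocks(stocks_raw)
```

Selecting columns by position rather than by name avoids depending on the date in the second header, which changes every day.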
Created on 2022-10-09 with reprex v2.0.2
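About the original {XML} attempt: I can't see the error message, so this is an assumption, but a common cause is that htmlParse() cannot fetch https:// URLs itself. If that is the problem, one workaround is to download the page with another package and pass the text to htmlParse() with asText = TRUE. The sketch below uses {httr} for the download; fetch_html() and parse_first_table() are hypothetical helper names, and the rest mirrors the code from the question.

```r
library(XML)
library(httr)

# Hypothetical helper: download a page over HTTPS and return its HTML as text.
fetch_html <- function(url) {
  response <- GET(url)
  stop_for_status(response)  # fail early on HTTP errors
  content(response, as = "text", encoding = "UTF-8")
}

# Hypothetical helper: parse HTML text and read its first <table>.
parse_first_table <- function(html_text) {
  doc <- htmlParse(html_text, asText = TRUE)
  tableNodes <- getNodeSet(doc, "//table")
  readHTMLTable(tableNodes[[1]])
}

# Usage (needs network access):
# url <- "https://www.westmetall.com/en/markdaten.php?action=show_table&field=LME_Cu_cash"
# tb <- parse_first_table(fetch_html(url))
```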