My question is very similar to this one. I want to download all Excel files (.xlsx) from this webpage, but the difference is (I think) that I do not have the same URL pattern as in the example. I have tried several variations without success. Any idea how to download these files? Also, if you could show how to read them directly into a data frame (without first saving them to my PC), that would be appreciated.
CodePudding user response:
A simple way to download the Excel files, one step at a time.
First, get the links.
library(rvest)

url <- "https://www.fondbolagen.se/fakta_index/statistik/"

read_html(url) |>
  html_elements("p") |>
  html_elements("a") |>
  html_attr("href") |>
  ## keep only the hrefs pointing at Excel files (.xls or .xlsx)
  (\(x) grep("\\.xls", x, value = TRUE))() |>
  ## the hrefs are relative, so prepend the site's base URL
  (\(x) sprintf("http://www.fondbolagen.se%s", x))() -> excel_links
Now, use the code from this Rich Scriven post to download the files. I have omitted the file creation instruction.
## create a folder for the downloads
dir.create("myexcel")
## save the current directory path for later
wd <- getwd()
## change working directory for the download
setwd("myexcel")
## download them all; mode = "wb" keeps the binary Excel
## files from being corrupted on Windows
lapply(excel_links, \(x) download.file(x, basename(x), mode = "wb"))
## reset working directory to original
setwd(wd)
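As for the second part of the question, reading the files straight into data frames: `readxl::read_excel()` cannot read from a URL directly, so a temporary file is still needed behind the scenes, but nothing permanent is written to disk. A sketch, assuming the readxl package is installed and `excel_links` holds the URLs scraped above (`read_excel_url` is a helper name I made up):

```r
library(readxl)  # install.packages("readxl") if needed

## download one Excel URL to a temp file, read it, discard the temp file
read_excel_url <- function(u) {
  tmp <- tempfile(fileext = paste0(".", tools::file_ext(u)))
  on.exit(unlink(tmp))                          # clean up the temp copy
  download.file(u, tmp, mode = "wb", quiet = TRUE)
  read_excel(tmp)
}

## usage: a named list of tibbles, one per workbook
## dfs <- lapply(excel_links, read_excel_url)
## names(dfs) <- basename(excel_links)
```

If the workbooks all share the same columns, the list can then be combined into a single data frame with `do.call(rbind, dfs)` or `dplyr::bind_rows(dfs)`.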