Download zip file to R when download link ends in '/download'

Time:09-28

My issue is similar to this post, but the suggested solution does not appear to be applicable.

I have a lot of zipped data stored on an online server (B2Drop) that provides a download link ending in "/download" instead of ".zip". I have been unable to get the method described here to work.

I have created a test download page https://b2drop.eudat.eu/s/K9sPPjWz3jxtXEq, where the download link https://b2drop.eudat.eu/s/K9sPPjWz3jxtXEq/download can be obtained by right-clicking the download button. Here is my script:

temp <- tempfile()
download.file("https://b2drop.eudat.eu/s/K9sPPjWz3jxtXEq/download",temp, mode="wb")
data <- read.table(unz(temp, "Test_file1.csv"))
unlink(temp)

When I run it, I get the error:

> download.file("https://b2drop.eudat.eu/s/K9sPPjWz3jxtXEq/download",temp, mode="wb")
trying URL 'https://b2drop.eudat.eu/s/K9sPPjWz3jxtXEq/download'
Content type 'application/zip' length 558 bytes
downloaded 558 bytes

> data <- read.table(unz(temp, "Test_file1.csv"))
Error in open.connection(file, "rt") : cannot open the connection
In addition: Warning message:
In open.connection(file, "rt") :
  cannot locate file 'Test_file1.csv' in zip file 'C:\Users\User_name\AppData\Local\Temp\RtmpMZ6gXi\file3e881b1f230e'

which, as I understand it, typically indicates a problem with the working directory where R is looking for the file. In this case that should be the temporary directory.

Has anyone had this problem and found a solution? Thanks in advance

Answer:

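The path you give for the file inside the archive is wrong: the file sits in a Test/ folder, so its internal path is "Test/Test_file1.csv". You can pass list=TRUE to unzip() to list the files in the archive, analogous to the command-line unzip utility's -l option.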

unzip(temp, list=TRUE)
#                  Name Length                Date
# 1 Test/Test_file1.csv    256 2021-09-27 10:13:00
# 2 Test/Test_file2.csv    286 2021-09-27 10:14:00

Also, since the file is comma-delimited, use read.csv rather than read.table.

data <- read.csv(unz(temp, "Test/Test_file1.csv"))
head(data, 3)
#   ID Variable1 Variable2 Variable Variable3
# 1  1         f     54654       25        t1
# 2  2         t       421       64        t2
# 3  3         x      4521       85        t3
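
If the folder structure inside the archive is not known in advance, the listing can be used to look up the internal path by file name. This is only a minimal sketch, assuming the file name occurs exactly once in the archive; find_in_listing and read_csv_from_zip are hypothetical helper names, not part of base R.

```r
# Hypothetical helper: given the internal paths from a zip listing, return the
# one whose file name (ignoring folders) equals `file_name`.
# Assumption: the name occurs exactly once; otherwise an error is raised.
find_in_listing <- function(internal_paths, file_name) {
  hits <- internal_paths[basename(internal_paths) == file_name]
  if (length(hits) != 1) {
    stop("expected exactly one match for '", file_name,
         "', found ", length(hits))
  }
  hits
}

# Hypothetical helper: read a CSV from a zip archive without hard-coding
# the folder it lives in. Extra arguments are passed on to read.csv().
read_csv_from_zip <- function(zip_path, file_name, ...) {
  listing <- unzip(zip_path, list = TRUE)  # data frame with a Name column
  read.csv(unz(zip_path, find_in_listing(listing$Name, file_name)), ...)
}

# Usage with the archive from the question (needs network access):
# temp <- tempfile()
# download.file("https://b2drop.eudat.eu/s/K9sPPjWz3jxtXEq/download",
#               temp, mode = "wb")
# data <- read_csv_from_zip(temp, "Test_file1.csv")
# unlink(temp)
```

Matching on the file name alone keeps the calling code unchanged if the archive's top-level folder is renamed, but it stops with an error when two folders contain a file with the same name; in that case pass the full internal path to unz() directly.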