My issue is similar to this post, but the solution suggestion does not appear applicable.
I have a lot of zipped data stored an online server (B2Drop), that provides a download link with the extension "/download" instead of ".zip". I have been unable to get the method described here, to work.
I have created a test download page https://b2drop.eudat.eu/s/K9sPPjWz3jxtXEq, where the download link https://b2drop.eudat.eu/s/K9sPPjWz3jxtXEq/download can be obtained by right clicking the download button. Here is my script:
temp <- tempfile()
download.file("https://b2drop.eudat.eu/s/K9sPPjWz3jxtXEq/download",temp, mode="wb")
data <- read.table(unz(temp, "Test_file1.csv"))
unlink(temp)
When I run it, I get the error:
> download.file("https://b2drop.eudat.eu/s/K9sPPjWz3jxtXEq/download",temp, mode="wb")
trying URL 'https://b2drop.eudat.eu/s/K9sPPjWz3jxtXEq/download'
Content type 'application/zip' length 558 bytes
downloaded 558 bytes
> data <- read.table(unz(temp, "Test_file1.csv"))
Error in open.connection(file, "rt") : cannot open the connection
In addition: Warning message:
In open.connection(file, "rt") :
cannot locate file 'Test_file1.csv' in zip file 'C:\Users\User_name\AppData\Local\Temp\RtmpMZ6gXi\file3e881b1f230e'
which typically indicates a problem with the working directory where R is looking for the file. In this case that should be the temp wd.
Has anyone had this problem and found a solution? Thanks in advance
CodePudding user response:
Your internal path is wrong. You can use list=TRUE
to list the files in the archive, analogous to the command-line utility's -l
argument.
unzip(temp, list=TRUE)
# Name Length Date
# 1 Test/Test_file1.csv 256 2021-09-27 10:13:00
# 2 Test/Test_file2.csv 286 2021-09-27 10:14:00
Better than read.table
, though, use read.csv
since it's comma-delimited.
data <- read.csv(unz(temp, "Test/Test_file1.csv"))
head(data, 3)
# ID Variable1 Variable2 Variable Variable3
# 1 1 f 54654 25 t1
# 2 2 t 421 64 t2
# 3 3 x 4521 85 t3