I am looking to read a .txt file from a URL. I run the following:
readLines(paste0("https://www.sec.gov/Archives/", All_file_today[Var]))
Given that All_file_today[Var] contains the following URL path: 'edgar/data/99189/0001567619-22-004329.txt'
But it returns the error:
Error in file(con, "r") :
cannot open the connection to 'https://www.sec.gov/Archives/edgar/data/99189/0001567619-22-004329.txt'
When I copy this link and paste it into a web browser, it shows the content I am looking for just fine. Does anyone know what I am doing wrong?
Following the feedback from Nad below, I ran the following:
> user <- paste('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7), AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.83 Safari/537.36')
> res <- GET(url, add_headers(`User-Agent` = user, Connection = 'keep-alive'))
> res
Response [https://www.sec.gov/Archives/edgar/data/1000097/0000919574-15-002406.txt]
Date: 2022-03-29 01:32
Status: 200
Content-Type: text/plain
Size: 5.44 kB
<SEC-DOCUMENT>0000919574-15-002406.txt : 20150225
<SEC-HEADER>0000919574-15-002406.hdr.sgml : 20150225
<ACCEPTANCE-DATETIME>20150225160223
ACCESSION NUMBER: 0000919574-15-002406
CONFORMED SUBMISSION TYPE: 13F-HR/A
PUBLIC DOCUMENT COUNT: 2
CONFORMED PERIOD OF REPORT: 20141231
FILED AS OF DATE: 20150225
DATE AS OF CHANGE: 20150225
EFFECTIVENESS DATE: 20150225
...
> readLines(content(res))
No encoding supplied: defaulting to UTF-8.
Error in file(con, "r") : cannot open the connection
From the above, I understand that I am able to reach the file, but readLines() does not go through. What could be the reason?
CodePudding user response:
We can read the file using the httr package:
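library(httr)

url <- 'https://www.sec.gov/Archives/edgar/data/99189/0001567619-22-004329.txt'
user <- paste('Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:98.0)',
              'Gecko/20100101 Firefox/98.0')
res <- GET(url, add_headers(`User-Agent` = user, Connection = 'keep-alive'))
# content() returns the body as one string, not a path, so split it into lines
strsplit(content(res, as = 'text', encoding = 'UTF-8'), '\r?\n')[[1]]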
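
On the `readLines(content(res))` error shown in the question: for a `text/plain` response, `content()` hands back the body as a single character string, and `readLines()` treats a character string as a file path or URL to open, which would explain the "cannot open the connection" message. Below is a minimal offline sketch of two ways around that, assuming the body has already been fetched as text. `text_to_lines` is a hypothetical helper name (not part of httr), and `body_text` is a literal stand-in for `content(res, as = "text", encoding = "UTF-8")`.

```r
# Hypothetical helper (not part of httr): split one text blob into lines,
# giving the same shape of result readLines() returns for a local file.
text_to_lines <- function(txt) {
  # Split on LF or CRLF so Windows-style line endings are handled too.
  strsplit(txt, "\r?\n")[[1]]
}

# Stand-in for content(res, as = "text", encoding = "UTF-8").
# The real call needs network access, so a literal string is used here.
body_text <- paste(
  "<SEC-DOCUMENT>0000919574-15-002406.txt : 20150225",
  "ACCESSION NUMBER: 0000919574-15-002406",
  "CONFORMED SUBMISSION TYPE: 13F-HR/A",
  sep = "\n"
)

# Option 1: split the string directly.
doc_lines <- text_to_lines(body_text)
print(doc_lines)

# Option 2: wrap the string in a text connection, which is the kind of
# object readLines() can actually read from.
con <- textConnection(body_text)
same_lines <- readLines(con)
close(con)
print(identical(doc_lines, same_lines))
```

With a real response, replacing `body_text` with `content(res, as = "text", encoding = "UTF-8")` should give one element per line of the filing. Passing `encoding` explicitly also avoids the "No encoding supplied" message.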