Download tsv file from link which generates the file upon request, using bash

Time:11-16

I need to download .txt files that are generated on request from links like this one: https://www.ebi.ac.uk/ena/portal/api/filereport?accession=SRP002480&result=read_run&fields=fastq_ftp&format=tsv&download=true&limit=0 and I need to do it from the bash shell. The link works perfectly fine in Firefox, but in the shell I tried wget and curl to no avail. I read lots of similar questions on Stack Overflow and other pages and tried everything I could find, but couldn't find a solution. For example:

curl https://www.ebi.ac.uk/ena/portal/api/filereport?accession=SRP002480&result=read_run&fields=fastq_ftp&format=tsv&download=true&limit=0      

This is the output, and no file is downloaded:

[1] 1094                                                                                                                                       
[2] 1095                                                                                                                                       
[3] 1096                                                                                                                                       
[4] 1097                                                                                                                                       
[5] 1098                                                                                                                                       
[2]   Done                    result=read_run                                                                                                  
[3]   Done                    fields=fastq_ftp                                                                                                 
[4]-  Done                    format=tsv                                                                                                       
(base) user@DESKTOP-LV4SKHQ:/mnt/c/Users/conog/Desktop/prova$ curl: (6) Could not resolve host: www.ebi.ac.uk                              
                                                                                                                                               
[1]-  Exit 6                  curl https://www.ebi.ac.uk/ena/portal/api/filereport?accession=SRP002480                                         
[5]   Done                    download=true 

Another example, after I read a couple of posts here:

curl -O -L https://www.ebi.ac.uk/ena/portal/api/filereport?accession=SRP002480&result=read_run&fields=fastq_ftp&format=tsv&download=true&limit=0
[1] 1056                                                                                                                                       
[2] 1057                                                                                                                                       
[3] 1058                                                                                                                                       
[4] 1059                                                                                                                                       
[5] 1060                                                                                                                                       
[2]   Done                    result=read_run                                                                                                  
[3]   Done                    fields=fastq_ftp                                                                                                 
[4]   Done                    format=tsv                                                                                                       
[5]   Done                    download=true                                                                                                    
(base) gsoletta@DESKTOP-LV4SKHQ:/mnt/c/Users/conog/Desktop/prova$   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current                                                                                                                                             
                                 Dload  Upload   Total   Spent    Left  Speed                                                                  
100    49  100    49    0     0     68      0 --:--:-- --:--:-- --:--:--    67                                                                 
                                                                                                                                               
[1]   Done

This last one downloads a 49-byte file with no extension, called filereportaccession=SRP002480, with the content: "Required String parameter 'result' is not present".

I'll also add I'm a novice at bash. What could I do?

Thank you!

CodePudding user response:

It works for me:

$ curl -s 'https://www.ebi.ac.uk/ena/portal/api/filereport?accession=SRP002480&result=read_run&fields=fastq_ftp&format=tsv&download=true&limit=0'
run_accession   fastq_ftp
SRR1620013  ftp.sra.ebi.ac.uk/vol1/fastq/SRR162/003/SRR1620013/SRR1620013_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/SRR162/003/SRR1620013/SRR1620013_2.fastq.gz
SRR1620014  ftp.sra.ebi.ac.uk/vol1/fastq/SRR162/004/SRR1620014/SRR1620014_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/SRR162/004/SRR1620014/SRR1620014_2.fastq.gz
...
$ wget -O filereport.tsv 'https://www.ebi.ac.uk/ena/portal/api/filereport?accession=SRP002480&result=read_run&fields=fastq_ftp&format=tsv&download=true&limit=0'
--2021-11-15 17:51:48--  https://www.ebi.ac.uk/ena/portal/api/filereport?accession=SRP002480&result=read_run&fields=fastq_ftp&format=tsv&download=true&limit=0
Resolving www.ebi.ac.uk (www.ebi.ac.uk)... 193.62.193.80
Connecting to www.ebi.ac.uk (www.ebi.ac.uk)|193.62.193.80|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: ‘filereport.tsv’
...
2021-11-15 17:51:51 (831 KB/s) - ‘filereport.tsv’ saved [675136]

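Your problem is that you didn't put quotes around the URL. In bash, an unquoted & ends the current command and runs it in the background. So curl only receives the URL up to the first & (which is why the server complains that 'result' is not present), and each remaining parameter, such as result=read_run, is run by bash as a separate command.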
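
To see the effect without touching the network, here is a minimal sketch. It only simulates what curl receives in the unquoted case by cutting the URL at the first &. The names url, seen_by_curl and download_report are made up for this illustration and are not from the original commands.

```shell
#!/usr/bin/env bash
# The URL from the question, single-quoted so bash treats & and ? literally.
url='https://www.ebi.ac.uk/ena/portal/api/filereport?accession=SRP002480&result=read_run&fields=fastq_ftp&format=tsv&download=true&limit=0'

# Simulation only: an unquoted & ends the command, so curl would receive
# just the text before the first &. %%&* strips the longest suffix that
# starts with &.
seen_by_curl=${url%%&*}

printf 'unquoted: %s\n' "$seen_by_curl"
printf 'quoted:   %s\n' "$url"

# Hypothetical helper (an assumption, not part of the answer above):
# save the report to the file named by the first argument.
# -sS hides the progress meter but still shows errors,
# -L follows redirects, -o sets the output file name.
download_report() {
    curl -sS -L -o "$1" "$url"
}

# Uncomment to actually download (requires network access):
# download_report filereport.tsv
```

The "unquoted" line should match the command shown for job [1] in the question's output, which is the truncated URL that curl actually ran with. Keeping the URL in a single-quoted variable and always expanding it as "$url" avoids the problem for both curl and wget.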
