Input:
read_fasta_html <- AMPs %>% html_nodes("pre") %>% html_text()
read_fasta_html
Output:
character(0)
CodePudding user response:
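html_nodes("pre") most likely returns character(0) because the sequence is not in the page's initial HTML; the page loads it afterwards from a separate endpoint.
One way to get the sequence is to request the endpoint from which the webpage gets its text directly: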
library(rvest)  # provides read_html(), html_text2() and the %>% pipe

'https://www.ncbi.nlm.nih.gov/sviewer/viewer.fcgi?id=1626603948&db=protein&report=fasta&extrafeat=null&conwithfeat=on&hide-cdd=on&retmode=html&withmarkup=on&tool=portal&log$=seqview&maxdownloadsize=1000000' %>%
  read_html() %>%
  html_text2()
[1] ">TII12583.1 GhoT/OrtT family toxin [Enterococcus faecium] MYLVRNAISFFITYFLSHDTMALVL"
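If you need the header and the sequence separately, the string above can be split with base R. This is only a sketch: it assumes a single record whose sequence is the last whitespace-delimited token, as in the output shown, and the variable names fasta, header and sequence are made up for illustration.

```r
# Text copied from the output above (assumed: one record, sequence not wrapped).
fasta <- ">TII12583.1 GhoT/OrtT family toxin [Enterococcus faecium] MYLVRNAISFFITYFLSHDTMALVL"

# Capture everything after ">" up to the last whitespace run as the header,
# and the final non-whitespace token as the sequence.
parts <- regmatches(
  fasta,
  regexec("^>(.*\\S)\\s+(\\S+)\\s*$", fasta, perl = TRUE)
)[[1]]

header   <- parts[2]
sequence <- parts[3]
```

A longer protein whose sequence is wrapped over several lines would need the pieces after the header pasted together instead.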
You can also look into the rentrez package, an R interface to the NCBI Entrez utilities, and the biomartr package.
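For example, a minimal rentrez sketch might look like the following. It assumes the rentrez package is installed and that the id value from the URL above (1626603948) is the record you want; it needs network access to NCBI.

```r
library(rentrez)

# Fetch the protein record as FASTA text through NCBI's EFetch service.
# The id is taken from the `id` query parameter of the URL used above.
fasta <- entrez_fetch(db = "protein", id = "1626603948", rettype = "fasta")

# The result is a single character string: header line, then the sequence.
cat(fasta)
```

This avoids scraping HTML altogether, so there is no markup to strip from the result.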