How can I access the html nodes of 'hidden' class with R language as I'm unable to ex...

Input:

read_fasta_html <- AMPs %>% html_nodes("pre") %>% html_text()
read_fasta_html

Output:

character(0)

CodePudding user response:

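The page appears to fill in the sequence after the initial HTML is served, which would explain why html_nodes("pre") returns character(0). One way to get the sequence is to request the endpoint from which the webpage gets its text: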

library(rvest)

'https://www.ncbi.nlm.nih.gov/sviewer/viewer.fcgi?id=1626603948&db=protein&report=fasta&extrafeat=null&conwithfeat=on&hide-cdd=on&retmode=html&withmarkup=on&tool=portal&log$=seqview&maxdownloadsize=1000000' %>%
  read_html() %>% html_text2()
[1] ">TII12583.1 GhoT/OrtT family toxin [Enterococcus faecium] MYLVRNAISFFITYFLSHDTMALVL"
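To reuse the same request for other records, the URL can be wrapped in a small function. This is only a sketch: `build_fasta_url` and `fetch_fasta_text` are hypothetical helper names, and it assumes that changing only the `id` parameter is enough for the endpoint to return another record, which the answer above does not confirm.

```r
# Hypothetical helper: rebuilds the viewer URL from the answer for a given
# NCBI protein id. Only `id` varies; every other parameter is copied verbatim.
build_fasta_url <- function(id) {
  paste0(
    "https://www.ncbi.nlm.nih.gov/sviewer/viewer.fcgi?id=", id,
    "&db=protein&report=fasta&extrafeat=null&conwithfeat=on&hide-cdd=on",
    "&retmode=html&withmarkup=on&tool=portal&log$=seqview",
    "&maxdownloadsize=1000000"
  )
}

# Hypothetical helper: downloads the page and returns its visible text.
# Assumes the rvest package (version 1.0.0 or later, for html_text2) is installed.
fetch_fasta_text <- function(id) {
  page <- rvest::read_html(build_fasta_url(id))
  rvest::html_text2(page)
}

# Usage (needs network access):
# fetch_fasta_text("1626603948")
```

The functions call rvest through `rvest::` so that defining them does not require attaching the package first.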

You can also look into the rentrez and biomartr packages, which provide R interfaces for retrieving data from NCBI.
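As a rough sketch of the rentrez route, the same record could be requested through NCBI's E-utilities instead of the viewer page. `fetch_fasta_rentrez` is a hypothetical wrapper name, and it is an assumption here that the id taken from the viewer URL is accepted as-is by the "protein" database.

```r
# Hypothetical wrapper around rentrez::entrez_fetch().
# Assumes the rentrez package is installed; returns the record as FASTA text.
fetch_fasta_rentrez <- function(id) {
  rentrez::entrez_fetch(db = "protein", id = id, rettype = "fasta")
}

# Usage (needs network access):
# cat(fetch_fasta_rentrez("1626603948"))
```

This avoids depending on the page's internal viewer URL, which may change without notice.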