How to add the start of a url to a colly link list

Time:11-24

I'm fairly new to Go and am trying to scrape several web pages using colly. Two of the pages return incomplete (relative) links. The code and its output are below:

func PaloNet() {

    c := colly.NewCollector(
        colly.AllowedDomains("security.paloaltonetworks.com"),
    )

    c.OnHTML(".list", func(e *colly.HTMLElement) {
        PaloNetlinks := e.ChildAttrs("a", "href")
        fmt.Println("\n\n PaloAlto Security: \n\n", PaloNetlinks)
    })

    c.Visit("https://security.paloaltonetworks.com/")

}

Output:

[/CVE-2022-0031 /CVE-2022-42889 /PAN-SA-2022-0006 /CVE-2022-0030 /CVE-2022-0029 /PAN-SA-2022-0005 /CVE-2022-28199 /PAN-SA-2022-0004 /CVE-2022-0028 /PAN-SA-2022-0003 /CVE-2022-0024 /CVE-2022-0026 /CVE-2022-0025 /CVE-2022-0027 /PAN-SA-2022-0001 /PAN-SA-2022-0002 /CVE-2022-0023 /CVE-2022-0778 /CVE-2022-22963 /CVE-2022-0022 /CVE-2021-44142 /CVE-2022-0016 /CVE-2022-0017 /CVE-2022-0020 /CVE-2022-0011 /csv?]

As you can see, the links are missing the 'https://security.paloaltonetworks.com' prefix. What would be the best way to add it to the start of each link?

CodePudding user response:

You can prepend the base URL to each link as you collect it:

func PaloNet() {
    visitUrl := "https://security.paloaltonetworks.com"
    urls := []string{}

    c := colly.NewCollector(
        colly.AllowedDomains("security.paloaltonetworks.com"),
    )

    c.OnHTML(".list", func(e *colly.HTMLElement) {
        PaloNetlinks := e.ChildAttrs("a", "href")

        // Each href starts with "/", so prefixing the base URL gives a full link.
        for i := 0; i < len(PaloNetlinks); i++ {
            urls = append(urls, visitUrl+PaloNetlinks[i])
        }

        fmt.Println("\n\n PaloAlto Security: \n\n", urls)
    })

    c.Visit(visitUrl)
}
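
Plain string concatenation works here because every href begins with "/". If a page mixes relative and absolute links, resolving each href against the base URL may be more robust. The following is a minimal sketch using only the standard library's net/url package. The helper name absoluteLinks is hypothetical and not part of colly, and the hrefs are hard-coded samples taken from the output above. Inside a colly callback, e.Request.AbsoluteURL(href) should serve the same purpose.

```go
package main

import (
	"fmt"
	"net/url"
)

// absoluteLinks (hypothetical helper) resolves each href against base
// and returns the resulting absolute URLs. Hrefs that cannot be parsed
// are skipped.
func absoluteLinks(base string, hrefs []string) ([]string, error) {
	baseURL, err := url.Parse(base)
	if err != nil {
		return nil, err
	}

	out := make([]string, 0, len(hrefs))
	for _, href := range hrefs {
		ref, err := url.Parse(href)
		if err != nil {
			continue // skip malformed hrefs
		}
		// ResolveReference leaves absolute URLs unchanged and
		// resolves relative ones against baseURL.
		out = append(out, baseURL.ResolveReference(ref).String())
	}
	return out, nil
}

func main() {
	// Sample hrefs copied from the scraper output in the question.
	hrefs := []string{"/CVE-2022-0031", "/PAN-SA-2022-0006"}

	links, err := absoluteLinks("https://security.paloaltonetworks.com/", hrefs)
	if err != nil {
		fmt.Println("bad base URL:", err)
		return
	}
	for _, link := range links {
		fmt.Println(link)
	}
}
```

In the scraper, the slice returned by e.ChildAttrs("a", "href") could be passed as hrefs, so the base URL does not need to match the exact form of each link.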