How can I get the name of a seller on a specific website using golang?

Time:08-17

I'm making a web scraper in Go. Given a specific web page, I'm trying to get the name of the seller, which is shown in the top right corner (in this example on the OLX site, the seller's name is Ionut). When I run the code below, it should write the name to the index.csv file, but the file is empty. I think the problem is in the HTML parser, though it looks fine to me.

package main

import (
    "encoding/csv"
    "fmt"
    "log"
    "os"
    "path/filepath"

    "github.com/gocolly/colly"
)

func main() {
    //setting up the file where we store collected data
    fName := filepath.Join("D:\\", "go projects", "cwst go", "CWST-GO", "target folder", "index.csv")
    file, err := os.Create(fName)
    if err != nil {
        log.Fatalf("Could not create file, error :%q", err)
    }
    defer file.Close()
    //writer that writes the collected data into our file
    writer := csv.NewWriter(file)
    //after the file is written, what it is in the buffer goes in writer and then passed to file
    defer writer.Flush()

    //collector
    c := colly.NewCollector(
        colly.AllowedDomains("https://www.olx.ro/"),
    )

    //HTML parser
    c.OnHTML(".css-1fp4ipz", func(e *colly.HTMLElement) { //div class that contains wanted info

        writer.Write([]string{
            e.ChildText("h4"), //specific tag of the info
        })
    })

    fmt.Printf("Scraping page :  ")
    c.Visit("https://www.olx.ro/d/oferta/bmw-xdrixe-seria-7-2020-71000-tva-IDgp7iN.html")

    log.Printf("\n\nScraping Complete\n\n")
    log.Println(c)

}

CodePudding user response:

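You don't need to add https:// or the trailing / in the allowed domains. AllowedDomains is compared against the hostname of each URL, so "https://www.olx.ro/" never matches and the request is rejected before any page is fetched. That is why the OnHTML callback never runs and the file stays empty. Checking the error returned by c.Visit would also have surfaced this.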

c := colly.NewCollector(
    colly.AllowedDomains("www.olx.ro"),
)
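
To see why the original value never matches, here is a minimal standard-library sketch of a hostname check. It is a simplified illustration, not colly's actual code, and hostAllowed is a hypothetical helper introduced only for this example. It assumes the allowed list is compared against the hostname of the parsed URL.

```go
package main

import (
	"fmt"
	"net/url"
)

// hostAllowed reports whether the hostname of rawURL appears in allowed.
// Hypothetical helper: a simplified illustration of a hostname-based
// allow-list, not colly's implementation.
func hostAllowed(rawURL string, allowed []string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	// Hostname() drops the scheme, port, and path: "www.olx.ro".
	host := u.Hostname()
	for _, d := range allowed {
		if d == host {
			return true
		}
	}
	return false
}

func main() {
	page := "https://www.olx.ro/d/oferta/bmw-xdrixe-seria-7-2020-71000-tva-IDgp7iN.html"

	// The value from the question includes the scheme and a trailing slash.
	fmt.Println(hostAllowed(page, []string{"https://www.olx.ro/"})) // false

	// The bare hostname matches.
	fmt.Println(hostAllowed(page, []string{"www.olx.ro"})) // true
}
```

In the scraper itself, logging the error returned by c.Visit (for example with log.Println) should make a rejected domain visible instead of failing silently.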