Cannot do webscraping with Selenium because recognised as bot

Time:10-27

I am trying to scrape "https://www.futbol24.com/" and I am being recognised as a bot. I have tried everything I could think of, including removing the signatures from the JavaScript inside chromedriver.exe, changing the user agent and proxy, and experimenting with various chrome_options.

The website loads fine when I simply use Chrome, but it always fails when I use chromedriver. I suspect something in the request headers tells the website whether I am accessing it through a script. However, changing those headers seems impossible (or at least quite difficult).

I am not an expert in networking, so there may be a solution I have not found yet. Can somebody help me with this?

CodePudding user response:

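You can use undetected-chromedriver, a Python package that wraps Selenium's Chrome driver and patches it to make automation harder to detect.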
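
A minimal sketch of this suggestion, assuming Python and the third-party package installed with `pip install undetected-chromedriver`. The helper name `fetch_page_title` is made up for illustration, and there is no guarantee that this gets past the bot check on futbol24.com:

```python
TARGET_URL = "https://www.futbol24.com/"


def fetch_page_title(url: str = TARGET_URL) -> str:
    """Open `url` in a patched Chrome and return the page title.

    `fetch_page_title` is a hypothetical helper, not part of any library.
    """
    # Imported here so this module still loads when the package is missing.
    import undetected_chromedriver as uc

    # uc.Chrome is a drop-in replacement for selenium.webdriver.Chrome.
    driver = uc.Chrome()
    try:
        driver.get(url)
        return driver.title
    finally:
        # Always close the browser, even if navigation fails.
        driver.quit()
```

Calling `fetch_page_title()` should open a visible Chrome window, load the page, and close the browser again. The rest of the Selenium API (locating elements, waits, and so on) is used the same way as with a regular driver.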

CodePudding user response:

Try this; it seems to work without problems:

    using System.Collections.Generic;
    using OpenQA.Selenium.Chrome;

    var options = new ChromeOptions();
    options.AcceptInsecureCertificates = true;
    // Use a specific Chrome binary and a dedicated user profile directory.
    options.BinaryLocation = @"Chrome\Chrome\Chrome.exe";
    options.AddArguments("user-data-dir=/path/to/your/custom/profilotest");
    options.AddArguments(new List<string>() { "no-sandbox", "disable-gpu" });
    //options.AddArgument("headless");

    // Download-related settings. The two "browser.*" names look like Firefox
    // preferences, so Chrome probably ignores them.
    options.AddUserProfilePreference("browser.download.useDownloadDir", true);
    options.AddArguments("--browser.helperApps.neverAsk.saveToDisk=pdf");
    options.AddUserProfilePreference("plugins.always_open_pdf_externally", true);

    // Hide the chromedriver console window.
    var service = ChromeDriverService.CreateDefaultService();
    service.HideCommandPromptWindow = true;

    var driver = new ChromeDriver(service, options);
    driver.Navigate().GoToUrl("https://www.futbol24.com");