Selenium C# driver.PageSource - '...' is too long, or a component of the specified path is too long.

Time:11-05

I'm trying to pass driver.PageSource from Selenium (C#) to HTML Agility Pack, but the line htmlDoc.Load(driver.PageSource); throws this error: '...' is too long, or a component of the specified path is too long.

P.S. Doing the same thing in Python with Selenium and Beautiful Soup doesn't produce this error.

How can I resolve this problem?

Full Code:

using System;
using System.Threading;
using HtmlAgilityPack;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

namespace SeleniumSharp
{
    public static class WebScraping
    {
        public static void GetPageData()
        {
            // initial setup
            IWebDriver driver = new ChromeDriver();
            driver.Navigate().GoToUrl("<url>");

            // dropdown
            var dropdown1 = driver.FindElement(By.Id("cpMain_ucc1_ctl00_liResidentialFront"));
            dropdown1.Click();
            
            // enter search query
            var search = driver.FindElement(By.Id("cpMain_ucc1_ctl00_txtResidentialSearchBox"));
            search.Click();
            search.SendKeys("london");
            Thread.Sleep(3000);

            // submit search
            var submit = driver.FindElement(By.XPath("//div[@id='cpMain_ucc1_ctl00_pnlContentResidential']//a[@class='search-button']"));
            submit.Click();

            // Html Agility Pack
            HtmlDocument htmlDoc = new HtmlDocument();
            htmlDoc.Load(driver.PageSource);

            var address = htmlDoc.DocumentNode
                .SelectNodes("//div[@class='grid-address']")
                .ToList();

            foreach(var item in address)
            {
                Console.WriteLine(item.InnerText);
            }

        }

        
    }
}

This line of code throws the error:

htmlDoc.Load(driver.PageSource);

Error:

'<html source>'is too long, or a component of the specified path is too long.
at System.IO.PathHelper.GetFullPathName(ReadOnlySpan`1 path, ValueStringBuilder& builder)
   at System.IO.PathHelper.Normalize(String path)
   at System.IO.Path.GetFullPath(String path)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)
   at System.IO.StreamReader.ValidateArgsAndOpenPath(String path, Encoding encoding, Int32 bufferSize)  
   at System.IO.StreamReader..ctor(String path, Encoding encoding)
   at HtmlAgilityPack.HtmlDocument.Load(String path)

Answer:

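This happens because you are calling Load instead of LoadHtml. Load expects a path to a file that contains HTML, not the HTML source itself (driver.PageSource). The stack trace shows this: HtmlDocument.Load(String path) hands the string to StreamReader, which tries to open the entire page source as a file path.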

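// From file: Load expects a path to an HTML file on disk
var docFromFile = new HtmlDocument();
docFromFile.Load(filePath);

// From string: LoadHtml parses HTML markup already held in memory
var docFromString = new HtmlDocument();
docFromString.LoadHtml(html);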

So in your code, use:

htmlDoc.LoadHtml(driver.PageSource);
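
As a rough sketch of how the parsing step from the question might look once LoadHtml is used, the helper below takes the page source as a string and returns the address texts. The class name AddressExtractor and the method ExtractAddresses are hypothetical names introduced for illustration; the XPath is the one from the question. The null check is there because SelectNodes returns null rather than an empty collection when nothing matches.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper, not part of the original code.
public static class AddressExtractor
{
    // Parses an HTML string (for example driver.PageSource) and returns
    // the trimmed text of every div whose class is exactly "grid-address".
    public static List<string> ExtractAddresses(string html)
    {
        var htmlDoc = new HtmlDocument();

        // LoadHtml parses markup from a string; Load would treat it as a file path.
        htmlDoc.LoadHtml(html);

        // SelectNodes returns null when no node matches the XPath.
        var nodes = htmlDoc.DocumentNode.SelectNodes("//div[@class='grid-address']");
        if (nodes == null)
        {
            return new List<string>();
        }

        return nodes.Select(node => node.InnerText.Trim()).ToList();
    }
}
```

In GetPageData this could be called as AddressExtractor.ExtractAddresses(driver.PageSource), with each returned string written out via Console.WriteLine.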