Fatal error using PHP Simple HTML DOM Parser

Time:01-05

Hope you are well. Apologies for what I am sure is a dumb question...

I started working with a PHP/HTML scraper, following the tutorial here: https://www.zenrows.com/blog/web-scraping-php#introduction

I worked out how to use Composer and installed the dependencies. However, when running the example and playing around with it, I get the error below:

Fatal error: Uncaught TypeError: Argument 1 passed to voku\helper\HtmlDomParser::loadHtml() must be of the type string, bool given

I've tried pretty much everything I can think of, this is the offending code:

    
    require_once __DIR__ . '/../vendor/autoload.php';
    require_once  __DIR__ . '/constants.php';
    require_once  __DIR__ . '/utils/scraping.php';
    use voku\helper\HtmlDomParser;
    
    // initializing the cURL request
    $curl = curl_init();
    // setting the URL to reach with a GET HTTP request
    curl_setopt($curl, CURLOPT_URL, "https://scrapeme.live/shop/");
    // to make the cURL request follow eventual redirects
    // and reach the final page of interest
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
    // to get the data returned by the cURL request as a string
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    // setting the User-Agent header
    curl_setopt($curl, CURLOPT_USERAGENT, USER_AGENT);
    // executing the cURL request and
    // get the HTML of the page as a string
    $html = curl_exec($curl);
    // releasing the cURL resources
    curl_close($curl);
    
    echo gettype ($html);
    echo "here";
    echo $html;
    
    // initializing HtmlDomParser
    $htmlDomParser = HtmlDomParser::str_get_html($html);

I've checked the type of $html with gettype(), and PHP echoes that it's a string.

Again sorry for being dense, happy to do the reading but can anyone point me in the right direction?

Thanks!

CodePudding user response:

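curl_exec() can return a boolean on error. From the docs: "it will return the result on success, false on failure." I also ran the code: it failed once and went OK another time, so the failure seems to be intermittent. That would explain the TypeError: when the request fails, $html is false, so str_get_html() receives a bool instead of a string. Check the return value before parsing, and retry the request if there was an error.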
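A minimal sketch of that check-and-retry idea, assuming a transient network failure is the cause. The function names `fetchOnce` and `fetchHtmlWithRetry`, the attempt count, the delay, and the 10-second timeout are all hypothetical choices for illustration and are not part of the tutorial or the library. The User-Agent option from the question is left out here because `USER_AGENT` is defined in the asker's own constants.php.

```php
<?php

/**
 * Perform one GET request with cURL.
 * Returns the response body, or null if curl_exec() reported a failure.
 * (Hypothetical helper, not part of any library.)
 */
function fetchOnce(string $url): ?string
{
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    // assumed timeout so a hanging request eventually fails
    curl_setopt($curl, CURLOPT_TIMEOUT, 10);

    $body = curl_exec($curl);
    if ($body === false) {
        // read the error message before the handle is closed
        error_log('cURL error: ' . curl_error($curl));
    }
    curl_close($curl);

    return is_string($body) ? $body : null;
}

/**
 * Try the request up to $maxAttempts times and return the first
 * successful body. Throws if every attempt fails, so the caller
 * never passes a non-string to the HTML parser.
 * $fetch can be swapped out (for example in tests); it defaults to fetchOnce().
 */
function fetchHtmlWithRetry(
    string $url,
    int $maxAttempts = 3,
    int $delaySeconds = 1,
    ?callable $fetch = null
): string {
    $fetch = $fetch ?? 'fetchOnce';

    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $body = $fetch($url);
        if (is_string($body)) {
            return $body;
        }
        if ($attempt < $maxAttempts) {
            // short pause before the next attempt
            sleep($delaySeconds);
        }
    }

    throw new RuntimeException("Request to $url failed after $maxAttempts attempts");
}
```

In the question's script, `$html = fetchHtmlWithRetry("https://scrapeme.live/shop/");` could stand in for the manual cURL calls, after which `HtmlDomParser::str_get_html($html)` only ever receives a string. If every attempt fails, the exception and the logged cURL error should point at the underlying cause.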
