Hope you are well. Apologies for what I am sure is a dumb question...
I started working on a PHP/HTML scraper and was following the tutorial here: https://www.zenrows.com/blog/web-scraping-php#introduction
I worked out how to use Composer and installed the dependencies, but when I run the example and play around with it, I get the error below:
Fatal error: Uncaught TypeError: Argument 1 passed to voku\helper\HtmlDomParser::loadHtml() must be of the type string, bool given
I've tried pretty much everything I can think of. This is the offending code:
require_once __DIR__ . '/../vendor/autoload.php';
require_once __DIR__ . '/constants.php';
require_once __DIR__ . '/utils/scraping.php';
use voku\helper\HtmlDomParser;
// initializing the cURL request
$curl = curl_init();
// setting the URL to reach with a GET HTTP request
curl_setopt($curl, CURLOPT_URL, "https://scrapeme.live/shop/");
// to make the cURL request follow eventual redirects
// and reach the final page of interest
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
// to get the data returned by the cURL request as a string
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
// setting the User-Agent header
curl_setopt($curl, CURLOPT_USERAGENT, USER_AGENT);
// executing the cURL request and
// get the HTML of the page as a string
$html = curl_exec($curl);
// releasing the cURL resources
curl_close($curl);
echo gettype($html);
echo "here";
echo $html;
// initializing HtmlDomParser
$htmlDomParser = HtmlDomParser::str_get_html($html);
I've checked the type of $html, and PHP echoes that it's a string.
Again sorry for being dense, happy to do the reading but can anyone point me in the right direction?
Thanks!
Answer:
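curl_exec() can return a boolean on error. From the docs: "it will return the result on success, false on failure." That matches the message you are seeing: when the request fails, $html is false rather than a string, and the parser rejects it. I also ran the code; it failed once and went OK another time, so the failure looks intermittent. You could check the return value and retry the request when it is false.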
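A minimal sketch of that check-and-retry idea, under some assumptions: fetchHtml() is a hypothetical helper name, and the three attempts and one-second pause are arbitrary choices that come from neither the tutorial nor the library. It reads curl_error() before closing the handle so the failure reason is still available, and it leaves out the User-Agent option from the question for brevity.

```php
<?php
require_once __DIR__ . '/../vendor/autoload.php';

use voku\helper\HtmlDomParser;

// Hypothetical helper: fetch a URL and retry when curl_exec() returns false.
// The attempt count and the delay are illustrative assumptions.
function fetchHtml(string $url, int $maxAttempts = 3): string
{
    $lastError = '';

    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $curl = curl_init();
        curl_setopt($curl, CURLOPT_URL, $url);
        curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);

        $html = curl_exec($curl);
        // read the error message before the handle is released
        $lastError = curl_error($curl);
        curl_close($curl);

        // strict comparison, so an empty response body is not mistaken for a failure
        if ($html !== false) {
            return $html;
        }

        // short pause before the next attempt
        if ($attempt < $maxAttempts) {
            sleep(1);
        }
    }

    throw new RuntimeException(
        "cURL request failed after {$maxAttempts} attempts: {$lastError}"
    );
}

$html = fetchHtml("https://scrapeme.live/shop/");
// $html is guaranteed to be a string at this point
$htmlDomParser = HtmlDomParser::str_get_html($html);
```

If every attempt fails, the exception message carries the last cURL error, which should help tell a DNS, TLS, or timeout problem apart from a parsing problem.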