1 - With my code (French Wikipedia Page):
$doc = new DOMDocument();
$doc->load("https://fr.wikipedia.org/wiki/Facebook");
echo $doc->saveHTML();
I get the following errors:
Warning: DOMDocument::load(): Specification mandates value for attribute checked in https://fr.wikipedia.org/wiki/Facebook, line: 43 in C:\laragon\www\test.php on line 190
Warning: DOMDocument::load(): Opening and ending tag mismatch: img line 37 and span in https://fr.wikipedia.org/wiki/Facebook, line: 68 in C:\laragon\www\test.php on line 190
Warning: DOMDocument::load(): Opening and ending tag mismatch: img line 37 and a in https://fr.wikipedia.org/wiki/Facebook, line: 69 in C:\laragon\www\test.php on line 190
Warning: DOMDocument::load(): Opening and ending tag mismatch: span line 37 and div in https://fr.wikipedia.org/wiki/Facebook, line: 71 in C:\laragon\www\test.php on line 190
Warning: DOMDocument::load(): Opening and ending tag mismatch: img line 37 and header in https://fr.wikipedia.org/wiki/Facebook, line: 156 in C:\laragon\www\test.php on line 190
Warning: DOMDocument::load(): Opening and ending tag mismatch: a line 37 and div in https://fr.wikipedia.org/wiki/Facebook, line: 2971 in C:\laragon\www\test.php on line 190
Warning: DOMDocument::load(): Opening and ending tag mismatch: header line 37 and body in https://fr.wikipedia.org/wiki/Facebook, line: 2977 in C:\laragon\www\test.php on line 190
Warning: DOMDocument::load(): Opening and ending tag mismatch: input line 37 and html in https://fr.wikipedia.org/wiki/Facebook, line: 2978 in C:\laragon\www\test.php on line 190
Warning: DOMDocument::load(): EndTag: '</' not found in https://fr.wikipedia.org/wiki/Facebook, line: 2978 in C:\laragon\www\test.php on line 190
2 - In the same way, with the following code (Chinese Wikipedia page):
$doc->load(rawurldecode("https://zh.wikipedia.org/wiki/亞馬遜公司"));
echo $doc->saveHTML();
Despite using rawurldecode to deal with the non-ASCII characters in the URL, I get a different set of errors:
Warning: DOMDocument::load(): Entity 'reg' not defined in https://zh.wikipedia.org/wiki/亞馬遜公司, line: 1382 in C:\laragon\www\test.php on line 181
Warning: DOMDocument::load(): Entity 'trade' not defined in https://zh.wikipedia.org/wiki/亞馬遜公司, line: 1382 in C:\laragon\www\test.php on line 181
Why does the load method of DOMDocument show me different errors when I use it with pages in French (https://fr.wikipedia.org/wiki/Facebook) and in an Asian language (Chinese: https://zh.wikipedia.org/wiki/亞馬遜公司), while it works very well when the URL points to an English page?
Thank you in advance for your help.
CodePudding user response:
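DOMDocument::load() expects well-formed XML, which an HTML page usually is not. Fetch the page content first (for example with file_get_contents) and parse it with loadHTML() instead: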
$doc = new DOMDocument();
$doc->loadHTML(file_get_contents("https://fr.wikipedia.org/wiki/Facebook"));
echo $doc->saveHTML();
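As a minimal sketch of the same idea, the snippet below wraps loadHTML() in a small helper that keeps libxml's parser warnings from being printed. The helper name loadHtmlQuietly and the sample markup are hypothetical and only serve the illustration; in real use the string would come from file_get_contents on the page URL.

```php
<?php
// Hypothetical helper (not part of the original answer): parse an HTML
// string with the lenient HTML parser and keep libxml warnings silent.
function loadHtmlQuietly(string $html): DOMDocument
{
    // Buffer libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard the buffered errors and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $doc;
}

// Sample markup (an assumption for the demo): an HTML entity, an unclosed
// <img> and a <wbr> tag, the kinds of constructs the warnings in the
// question complain about.
$sample = '<p>Amazon&reg; <img src="a.png"><wbr>text</p>';
$doc = loadHtmlQuietly($sample);

echo $doc->saveHTML();
```

To use it on a live page, pass the fetched content, e.g. loadHtmlQuietly(file_get_contents("https://fr.wikipedia.org/wiki/Facebook")), assuming allow_url_fopen is enabled.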
CodePudding user response:
And also when I try to use:
$doc->loadHTMLFile("https://zh.wikipedia.org/wiki/亞馬遜公司");
It seems to work, since the result is displayed, but at the very top of the output I get warnings like this:
Warning: DOMDocument::loadHTMLFile(): Tag wbr invalid in https://zh.wikipedia.org/wiki/
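
Warnings of the "Tag ... invalid" kind generally come from libxml's HTML parser not recognising newer HTML5 elements; the document is still parsed. A hedged sketch of how such messages could be collected and inspected instead of printed is shown below. The sample markup is an assumption for the demo, and whether any message is reported at all depends on the installed libxml version.

```php
<?php
// Collect libxml messages in memory rather than printing PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
// Sample markup (assumed for the demo) using HTML5 tags such as
// <header> and <wbr>.
$doc->loadHTML('<html><body><header>Title</header><p>long<wbr>word</p></body></html>');

// Turn each LibXMLError object into a short readable line.
$messages = [];
foreach (libxml_get_errors() as $error) {
    $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
}

// Empty the error buffer and restore the previous error handling mode.
libxml_clear_errors();
libxml_use_internal_errors($previous);

print_r($messages);

// The tag is still present in the parsed tree despite any warning.
echo $doc->getElementsByTagName('wbr')->length, PHP_EOL;
```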