I think this is an encoding problem, and I have tried everything I can think of. This is my simple code:
$curl = curl_init();
curl_setopt_array($curl, [
    CURLOPT_URL => "https://theatrevazrajdane.bg/творчески-състав/актьори/2",
    CURLOPT_RETURNTRANSFER => true,
    // CURLOPT_ENCODING => "",
    CURLOPT_MAXREDIRS => 10,
    CURLOPT_TIMEOUT => 30,
    CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
    CURLOPT_CUSTOMREQUEST => "GET",
    CURLOPT_POSTFIELDS => "",
    CURLOPT_HTTPHEADER => [
        "Accept: application/json",
        "Content-Type: text/html; charset=UTF-8"
    ],
]);
$response = curl_exec($curl);
$err = curl_error($curl);
curl_close($curl);
if ($err) {
    echo "cURL Error #:" . $err;
} else {
    echo $response;
}
So far I have tried:
- Sending it with and without the CURLOPT_ENCODING parameter
- Sending it with and without "Content-Type: text/html; charset=UTF-8"
Sending the same request from Insomnia with the following parameters works just fine:
curl --request GET \
--url https://theatrevazrajdane.bg/творчески-състав/актьори/2 \
--header 'Accept: application/json' \
--header 'Content-Type: text/html; charset=UTF-8'
From Insomnia, for example, I get the page title as <title>Актьори | Театър Възраждане</title>, but sending the same request from PHP using cURL I get <title>������� | ������ ����������</title>
CodePudding user response:
Take a look at the Content-Type header that the website you mentioned returns:
Content-Type: text/html; charset=windows-1251;
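So the returned string is encoded as windows-1251, not UTF-8. The Content-Type header you send with the request only describes the request body, so it does not change the encoding of the response.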
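To check this from PHP rather than by eye, the charset can be read out of the Content-Type value. The following is a minimal sketch, not a definitive implementation: the helper name charsetFromContentType is hypothetical, and the regular expression is a simplifying assumption that only covers a plain charset=... parameter.

```php
<?php
// Hypothetical helper (not part of cURL or PHP): pull the charset
// parameter out of a Content-Type header value, or return null if absent.
function charsetFromContentType(?string $contentType): ?string
{
    if ($contentType === null) {
        return null;
    }
    // Matches e.g. "charset=windows-1251" or "charset=\"utf-8\"".
    if (preg_match('/charset\s*=\s*"?([^\s;"]+)/i', $contentType, $m)) {
        return strtolower($m[1]);
    }
    return null;
}

// With cURL, the header value can be read before curl_close():
//   $contentType = curl_getinfo($curl, CURLINFO_CONTENT_TYPE);
// Here the value quoted above is used directly.
echo charsetFromContentType('text/html; charset=windows-1251;'), PHP_EOL;
// prints: windows-1251
```

If the helper returns null, the server did not declare a charset in the header and the encoding has to be determined some other way, for example from a meta tag in the HTML.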
Either declare this charset in the Content-Type header when you return the content to the browser yourself, or:
- Use iconv to convert the encoding to UTF-8:
$convertedResponse = iconv('Windows-1251', 'UTF-8', $response);
- Use mb_convert_encoding to convert the encoding to UTF-8:
$convertedResponse = mb_convert_encoding($response, 'UTF-8', 'Windows-1251');
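Either conversion can be wrapped in a small function so that a failed conversion does not go unnoticed. This is a sketch under stated assumptions: the name toUtf8 is hypothetical, the default source charset is the one reported by this particular site, and the sample bytes stand in for part of $response.

```php
<?php
// Hypothetical helper: convert a response body to UTF-8 with iconv,
// throwing instead of silently returning false on failure.
function toUtf8(string $body, string $sourceCharset = 'Windows-1251'): string
{
    // Nothing to do when the source is already UTF-8.
    if (strcasecmp($sourceCharset, 'UTF-8') === 0) {
        return $body;
    }
    $converted = iconv($sourceCharset, 'UTF-8', $body);
    if ($converted === false) {
        throw new RuntimeException("Could not convert from $sourceCharset");
    }
    return $converted;
}

// "Актьори" as windows-1251 bytes, standing in for part of $response.
$sample = "\xC0\xEA\xF2\xFC\xEE\xF0\xE8";
echo toUtf8($sample), PHP_EOL;
// prints: Актьори
```

In the question's script this would replace the final echo with echo toUtf8($response);. For the first option, leaving the bytes untouched, calling header('Content-Type: text/html; charset=windows-1251'); before echo $response; should let the browser decode the page correctly.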