How can I convert URLs like:
https://danielabruña.com/
https://fagualópez.com/
https://laconspiracióndelcastellano.com/
to their UTF-8 encoding variants?
CodePudding user response:
It depends on the encoding your text editor uses, but those URLs are most likely already UTF-8. You probably mean the ASCII form used by internationalized domain names (IDN), also known as Punycode.
First you need to extract the raw domain name with e.g. parse_url(), because IDN only applies to the domain part of a URL. Then you can use idn_to_ascii() (from the intl extension) to get the pure 7-bit ASCII encoding:
$urls = [
    'https://danielabruña.com/',
    'https://fagualópez.com/',
    'https://laconspiracióndelcastellano.com/',
];
foreach ($urls as $url) {
    // Only the host part is subject to IDN conversion.
    $domain = parse_url($url, PHP_URL_HOST);
    var_dump($domain, idn_to_ascii($domain));
}
string(17) "danielabruña.com"
string(23) "xn--danielabrua-beb.com"
string(15) "fagualópez.com"
string(21) "xn--fagualpez-b7a.com"
string(32) "laconspiracióndelcastellano.com"
string(38) "xn--laconspiracindelcastellano-ctc.com"
For this to work, you need to set your text editor to save files as UTF-8. Otherwise, you'll need to do additional conversions with e.g. mb_convert_encoding().
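As a rough illustration of that extra conversion step, the sketch below assumes the domain arrives as ISO-8859-1 bytes (an assumption made for the example; your actual source encoding may differ) and converts it to UTF-8 before calling idn_to_ascii(). The variable names are made up for the example.

```php
<?php
// Assumption: the input is ISO-8859-1, where "ñ" is the single byte 0xF1.
$latin1Domain = "danielabru\xF1a.com";

// Convert to UTF-8 first, because idn_to_ascii() expects UTF-8 input.
$utf8Domain = mb_convert_encoding($latin1Domain, 'UTF-8', 'ISO-8859-1');

// Now the IDN conversion behaves the same as with a UTF-8 source file.
$asciiDomain = idn_to_ascii($utf8Domain);

var_dump($asciiDomain);
```

If the source encoding is unknown, it has to be determined some other way first; passing the wrong encoding here silently produces a different domain.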
Rebuilding the original URL is a bit trickier, but also out of the scope of the question. You can use a dedicated URL handling library or string replace functions.
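If you do want to rebuild the URL by hand, one possible approach is to split it with parse_url(), convert only the host, and glue the pieces back together. This is a minimal sketch, not a full URL library: url_to_ascii_host() is a hypothetical helper name, and it returns the input unchanged when parsing or conversion fails.

```php
<?php
// Hypothetical helper: converts only the host of a URL to its ASCII (IDN)
// form and reassembles the remaining components unchanged.
// Requires the intl extension for idn_to_ascii().
function url_to_ascii_host(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return $url; // Malformed URL or no host: nothing to convert.
    }

    $host = idn_to_ascii($parts['host']);
    if ($host === false) {
        return $url; // Conversion failed: leave the input untouched.
    }

    // A host is present, so the URL starts with "scheme://" or just "//".
    $result = isset($parts['scheme']) ? $parts['scheme'] . '://' : '//';
    if (isset($parts['user'])) {
        $result .= $parts['user'];
        if (isset($parts['pass'])) {
            $result .= ':' . $parts['pass'];
        }
        $result .= '@';
    }
    $result .= $host;
    if (isset($parts['port'])) {
        $result .= ':' . $parts['port'];
    }
    $result .= $parts['path'] ?? '';
    if (isset($parts['query'])) {
        $result .= '?' . $parts['query'];
    }
    if (isset($parts['fragment'])) {
        $result .= '#' . $parts['fragment'];
    }
    return $result;
}

echo url_to_ascii_host('https://danielabruña.com/'), PHP_EOL;
```

The path and query string are passed through as-is; if they contain non-ASCII characters they would need percent-encoding, which is a separate step from IDN conversion.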
CodePudding user response:
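With utf8_encode(), which converts ISO-8859-1 text to UTF-8, so it only helps if your strings are currently ISO-8859-1 (note that the function is deprecated as of PHP 8.2):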
echo(utf8_encode("https://danielabruña.com/"));
echo(utf8_encode("https://fagualópez.com/"));
echo(utf8_encode("https://laconspiracióndelcastellano.com/"));
Result: