I'm trying to convert a Cyrillic string from Windows-1251 (CP1251) to UTF-8.
Given String: Íó è ÿ ñäåëàëà âûâîäû...
Expected String: Ну и я сделала выводы...
What I've tried so far:
echo iconv('CP1251', 'UTF-8', 'Íó è ÿ ñäåëàëà âûâîäû...');
or
echo mb_convert_encoding('Íó è ÿ ñäåëàëà âûâîäû...', 'UTF-8', 'CP1251');
The result I got:
Íó è ÿ ñäåëà ëà âûâîäû...
Any ideas how I could make it work?
Answer:
What you have is a UTF-8 string of CP1252 characters: somewhere upstream, the original CP1251 bytes were misread as CP1252 and then converted to UTF-8.
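To make that diagnosis concrete, here is a minimal sketch that reproduces the corruption starting from the expected string. It only illustrates the mechanism: the actual pipeline that damaged the data is unknown, and the variable names are made up for the example. It assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical reproduction; the real source of the corruption is unknown.
$original = 'Ну и я сделала выводы...';

// Step 1: the text exists as CP1251 bytes (one byte per Cyrillic letter).
$cp1251Bytes = mb_convert_encoding($original, 'CP1251', 'UTF-8');

// Step 2 (the mistake): those bytes are treated as CP1252 and converted to UTF-8.
$mojibake = mb_convert_encoding($cp1251Bytes, 'UTF-8', 'CP1252');

// Compare against the string from the question.
var_dump($mojibake === 'Íó è ÿ ñäåëàëà âûâîäû...');
```

If the comparison holds, it also explains why the attempts in the question did not help: they treat the input as CP1251 bytes, but it is already UTF-8 text.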
The true answer is to fix what produced this mistake so that your data doesn't get corrupted like this.
The worse answer is to repeat the mis-translation in reverse to recover the original string, and then convert it properly.
$input = 'Íó è ÿ ñäåëàëà âûâîäû...';
// convert back to source string via CP1252 single-byte encoding
$out = mb_convert_encoding($input, 'CP1252', 'UTF-8');
// correctly convert source string to UTF8 using CP1251
$out = mb_convert_encoding($out, 'UTF-8', 'CP1251');
var_dump($out);
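If you need to apply this to many strings, the two conversions could be wrapped in a small helper that also checks its own work. This is only a sketch: the function name repairCp1251Mojibake is made up, and the round-trip check is an extra safeguard I added, not part of the steps above. It assumes the mbstring extension is loaded.

```php
<?php
/**
 * Hypothetical helper: undo a CP1251-read-as-CP1252 mix-up in a UTF-8 string.
 * Returns null when re-applying the mistake does not reproduce the input,
 * which suggests the string was not damaged in this particular way.
 */
function repairCp1251Mojibake(string $input): ?string
{
    // Recover the raw single-byte data by reversing the CP1252 decoding.
    $bytes = mb_convert_encoding($input, 'CP1252', 'UTF-8');

    // Decode those bytes with the encoding they were originally written in.
    $fixed = mb_convert_encoding($bytes, 'UTF-8', 'CP1251');

    // Sanity check: corrupting the result again should give back the input.
    $again = mb_convert_encoding(
        mb_convert_encoding($fixed, 'CP1251', 'UTF-8'),
        'UTF-8',
        'CP1252'
    );

    return $again === $input ? $fixed : null;
}

var_dump(repairCp1251Mojibake('Íó è ÿ ñäåëàëà âûâîäû...'));
```

One caveat: CP1252 leaves bytes 0x81, 0x8D, 0x8F, 0x90 and 0x9D undefined, while CP1251 assigns letters to them (used in Macedonian and Serbian, not Russian), so text containing those letters may not survive the round trip. Note also that plain ASCII input passes the check unchanged, so a non-null result does not prove the string was corrupted in the first place.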