Convert cyrillic 1251 to UTF-8

Time:11-18

I'm trying to convert a Cyrillic string from Windows-1251 (CP1251) to UTF-8.

Given String: Íó è ÿ ñäåëàëà âûâîäû...

Expected String: Ну и я сделала выводы...

What I've tried so far:

echo iconv('CP1251', 'UTF-8', 'Íó è ÿ ñäåëàëà âûâîäû...');

or

echo mb_convert_encoding('Íó è ÿ ñäåëàëà âûâîäû...', 'UTF-8', 'CP1251');

The result I got:

Íó è ÿ ñäåëà ëà âûâîäû...

Any ideas how I could make it work?

CodePudding user response:

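What you have is a UTF-8 string made up of CP1252 characters, which are a misrepresentation of the original CP1251 bytes. Somewhere along the way the CP1251 bytes were decoded as CP1252 and then saved as UTF-8, so converting "from CP1251" again cannot help: the string no longer contains CP1251 bytes.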
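To make that concrete, here is a minimal sketch that reproduces the corruption on purpose. It assumes the mbstring extension is loaded and that this script file is itself saved as UTF-8; the variable names are only for illustration.

```php
<?php
// The text as it was originally meant to be (this file is assumed to be UTF-8).
$original = 'Ну и я сделала выводы...';

// Step 1: the text stored as CP1251, one byte per Cyrillic letter.
$cp1251Bytes = mb_convert_encoding($original, 'UTF-8' === 'UTF-8' ? 'CP1251' : 'CP1251', 'UTF-8');

// Step 2: the mistake. The same bytes are read as if they were CP1252
// and re-encoded as UTF-8, so every Cyrillic letter becomes an accented Latin one.
$garbled = mb_convert_encoding($cp1251Bytes, 'UTF-8', 'CP1252');

echo $garbled, PHP_EOL; // Íó è ÿ ñäåëàëà âûâîäû...
```

The output matches the given string from the question, which is what suggests the data went through exactly this CP1251-read-as-CP1252 step.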

The true answer is to fix what produced this mistake so that your data doesn't get corrupted like this.

The worse answer is to repeat the mis-translation in reverse to recover the original string, and then convert it properly.

$input = 'Íó è ÿ ñäåëàëà âûâîäû...';

// convert back to source string via CP1252 single-byte encoding
$out = mb_convert_encoding($input, 'CP1252', 'UTF-8');

// correctly convert source string to UTF8 using CP1251
$out = mb_convert_encoding($out, 'UTF-8', 'CP1251');

var_dump($out);
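The two steps above can also be wrapped in a small helper. This is only a sketch: recoverCp1251() is a hypothetical name, not a PHP or mbstring function, and it assumes the input really is CP1251 text that was mis-decoded as CP1252.

```php
<?php
// Hypothetical helper (not part of PHP): undo a CP1251-read-as-CP1252 mix-up.
function recoverCp1251(string $garbled): string
{
    // The repair only makes sense if the input is valid UTF-8 to begin with.
    if (!mb_check_encoding($garbled, 'UTF-8')) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    // Undo the wrong decoding: map each CP1252 character back to its single byte.
    $bytes = mb_convert_encoding($garbled, 'CP1252', 'UTF-8');

    // Decode those bytes as what they originally were: CP1251.
    return mb_convert_encoding($bytes, 'UTF-8', 'CP1251');
}

$fixed = recoverCp1251('Íó è ÿ ñäåëàëà âûâîäû...');
echo $fixed, PHP_EOL; // Ну и я сделала выводы...
```

One caveat: a few byte values (0x81, 0x8D, 0x8F, 0x90, 0x9D) have no character assigned in CP1252, so Cyrillic letters stored at those positions in CP1251 may already have been lost during the original mis-conversion and cannot be recovered this way.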