Home > Enterprise >  Google Sheets - Convert UTF-8 Text to "normal character"
Google Sheets - Convert UTF-8 Text to "normal character"

Time:12-18

I'm looking for a solution to convert strings formatted in UTF-8? (Avrìl) to their normal appearance.

It is a list of Names that should be readable:

Shivãz
Hôndêmon
Hôndêmon
Revá
Ðreamy
Fåmé
Nobô
Noshì

With a list of codes i want to replace them with:

...
Ò   c3 92   Ò
Ó   c3 93   Ó
Ô   c3 94   Ô
Õ   c3 95   Õ
Ö   c3 96   Ö
×   c3 97   ×
Ø   c3 98   Ø
Ù   c3 99   Ù
Ú   c3 9a   Ú
Û   c3 9b   Û
Ü   c3 9c   Ü
...

I found app-spript snippets, and "hardcoded" =substitute() but since there are not only 5 characters to be searched, i dont want to nest 200 substitutes inside each other.

Any solutions?

CodePudding user response:

Without any substitute, try

function myFunction(){
  var txt = `Shivãz
Hôndêmon
Hôndêmon
Revá
Ðreamy
Fåmé
Nobô
Noshì`
  console.log(utf8decode(unescape(txt)))
}

and add this function

function utf8decode(utftext) {
  var string = "";
  var i = 0;
  var c = c1 = c2 = 0;
  while ( i < utftext.length ) {
    c = utftext.charCodeAt(i);
    if (c < 128) {
      string  = String.fromCharCode(c);
      i  ;
    }
    else if((c > 191) && (c < 224)) {
      c2 = utftext.charCodeAt(i 1);
      string  = String.fromCharCode(((c & 31) << 6) | (c2 & 63));
      i  = 2;
    }
    else {
      c2 = utftext.charCodeAt(i 1);
      c3 = utftext.charCodeAt(i 2);
      string  = String.fromCharCode(((c & 15) << 12) | ((c2 & 63) << 6) | (c3 & 63));
      i  = 3;
    }
  }
  return string;
}

var txt = `Shivãz
Hôndêmon
Hôndêmon
Revá
Ðreamy
Fåmé
Nobô
Noshì`
console.log(utf8decode(unescape(txt)))
function utf8decode(utftext) {
  var string = "";
  var i = 0;
  var c = c1 = c2 = 0;
  while ( i < utftext.length ) {
    c = utftext.charCodeAt(i);
    if (c < 128) {
      string  = String.fromCharCode(c);
      i  ;
    }
    else if((c > 191) && (c < 224)) {
      c2 = utftext.charCodeAt(i 1);
      string  = String.fromCharCode(((c & 31) << 6) | (c2 & 63));
      i  = 2;
    }
    else {
      c2 = utftext.charCodeAt(i 1);
      c3 = utftext.charCodeAt(i 2);
      string  = String.fromCharCode(((c & 15) << 12) | ((c2 & 63) << 6) | (c3 & 63));
      i  = 3;
    }
  }
  return string;
}

CodePudding user response:

if you change your mind with SUBSTITUTE brute force:

={""; INDEX("=INDEX(IFERROR("&REPT("SUBSTITUTE(", COUNTA(E2:E))&"A:A, "&QUERY("D"&
 SEQUENCE(COUNTA(D2:D), 1, ROW(D2))&", "&"E"&
 SEQUENCE(COUNTA(D2:D), 1, ROW(D2))&"),",,9^9)&" ))")}

enter image description here

spreadsheet demo

  • Related