Home > Enterprise >  Fix corrupted characters (e.g. umlauts) in a string using ORACLE SQL and convert it to proper UTF-8
Fix corrupted characters (e.g. umlauts) in a string using ORACLE SQL and convert it to proper UTF-8

Time:03-22

I currently have an ORACLE table which, in one column, contains obviously corrupted strings like the following: Pachtvertrag über eine Gaststätte.

At some point, there probably have been used a wrong encoding for the string. Is there a way of fixing the "wrong" encoding in a string like this even when the string is already corrupted like this?

I tried the following:

SELECT CONVERT('Pachtvertrag über eine Gaststätte', 'UTF8', 'US7ASCII') FROM DUAL;

But this leads to: Pachtvertrag ����ber eine Gastst����tte, while it should actually be Pachtvertrag über eine Gaststätte.

Another idea of mine was to somehow convert the string to bytes first (e.g. by using TO_SINGLE_BYTE) but this didn't lead to the desired result, either.

CodePudding user response:

Character set US7ASCII does not support special characters and you must flip the character sets.

So, correction statement must be like

CONVERT('Pachtvertrag über eine Gaststätte', 'WE8ISO8859P1', 'AL32UTF8')

Just a note, ISO-8859-1, ISO-8859-15 and Windows-CP1252 (WE8MSWIN1252) are very similar. See ISO 8859-15 vs. -1 vs. Windows-1252 vs. Unicode and pick the correct encoding.

  • Related