Non-ASCII strings saved to MySQL in a wrong encoding (MySQL, Java)

Time:10-27

I'm writing a simple insert/select "hello world" application (I'm new to Java, but I'm trying to help my 14-year-old son with his first project) and have run into an issue: non-ASCII (Russian) strings are saved to a MySQL table in the wrong encoding. I have already checked:

  • Schema and table collation: utf8_general_ci
  • Code file encoding is UTF-8 (written in VS Code)

I'm using the official MySQL Connector/J from the Oracle website. The code itself:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class Test {
    public static void main(String[] args) {
        Connection conn = null;
        PreparedStatement stmt1 = null;

        // Load the Connector/J driver class
        try {
            Class.forName("com.mysql.cj.jdbc.Driver").newInstance();
        } catch (Exception ex) {
            System.out.println("Error getting newInstance()" + ex.getMessage());
            return;
        }

        try {
            conn = DriverManager.getConnection("jdbc:mysql://demo.server.ru/project1?user=...&password=...&characterEncoding=utf8&useUnicode=true");

            stmt1 = conn.prepareStatement("INSERT INTO Pers1 (FirstName, LastName, Phone) VALUES (?, ?, ?)");

            // Non-ASCII literals: these depend on how the source file is decoded at compile time
            stmt1.setString(1, "Иван");
            stmt1.setString(2, "Ромашкин");
            stmt1.setString(3, "+79115544788");

            stmt1.executeUpdate();
            stmt1.close();
        } catch (SQLException ex) {
            // handle the error
            System.out.println("MySQL error: " + ex.getMessage());
            System.out.println("SQLState: " + ex.getSQLState());
            System.out.println("VendorError: " + ex.getErrorCode());
        }
    }
}

I have also tried encoding the string data to UTF-8 (well, it must be in UTF-8 from the very beginning...). But I still find something like Антуан in the table! Please tell me what is wrong with all this stuff.

CodePudding user response:

utf8_general_ci

That's not UTF-8. MySQL is a lying liar that lies.

utf8mb4 is actual UTF-8. utf8 is an alias for utf8mb3, which is a ridiculous name, as that is not full UTF-8: it stores at most three bytes per character.

Address that issue. Once that's fixed, your code is no longer the problem. If you're still seeing 'weird' characters, then whatever pipeline takes the data from MySQL and shows it on screen has a problem, not your Java code.
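
To make the utf8 / utf8mb4 distinction concrete, here is a minimal sketch that measures how many bytes the widest character of a string needs in UTF-8. The class and method names (Utf8Width, maxUtf8Bytes) are made up for illustration, and the strings are written with Unicode escapes so the result does not depend on how the file itself is compiled. Anything that reports 4 needs utf8mb4; the Cyrillic sample reports 2.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original question's code.
class Utf8Width {
    // Returns the largest number of bytes any single code point of s needs in UTF-8.
    static int maxUtf8Bytes(String s) {
        return s.codePoints()
                .map(cp -> new String(Character.toChars(cp))
                        .getBytes(StandardCharsets.UTF_8).length)
                .max()
                .orElse(0);
    }

    public static void main(String[] args) {
        String cyrillic = "\u0418\u0432\u0430\u043d"; // "Ivan" in Cyrillic, as escapes
        String emoji = "\uD83D\uDE00";                // U+1F600, outside the BMP
        System.out.println(maxUtf8Bytes(cyrillic));   // prints 2
        System.out.println(maxUtf8Bytes(emoji));      // prints 4
    }
}
```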

CodePudding user response:

Thanks to Jon Skeet, the right answer is this: Java source code containing non-ASCII strings should be compiled with the extra -encoding flag:

javac -encoding utf8 Test.java
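
Why the source encoding matters can be sketched without a database. The snippet below simulates a compiler reading a UTF-8 source file with a single-byte charset: each Cyrillic letter turns into two junk characters before the string ever reaches JDBC. ISO-8859-1 is only an assumption standing in for whatever platform default was actually used, and the names SourceEncodingDemo, misdecode and repair are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question's code.
class SourceEncodingDemo {
    // Simulates reading UTF-8 bytes as if they were ISO-8859-1 (assumed wrong charset).
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses misdecode; lossless because ISO-8859-1 maps every byte value.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "\u0418\u0432\u0430\u043d";          // "Ivan" in Cyrillic, as escapes
        String garbled = misdecode(name);
        System.out.println(name.length());                 // prints 4
        System.out.println(garbled.length());              // prints 8
        System.out.println(repair(garbled).equals(name));  // prints true
    }
}
```

A quick check in the real program is to print the length of a Cyrillic literal: if a four-letter name reports 8, the literal was already garbled at compile time and the database settings are not to blame.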