Home > Blockchain >  Informix db clob with utf-8 characters stores incomplete data
Informix db clob with utf-8 characters stores incomplete data

Time:12-08

I am facing a problem with my Informix db while storing CLOB data with UTF-8 charset data. The regular varchar datatype with UTF-8 characters is storing without any issue; however, when I store the data as a CLOB and data (in my case XML) contains utf-8 characters, only truncated XML is stored.

Example Characters: german umlaut
Informix db character encoding is set to utf-8.

Below is my hbm file

<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC
       "-//Hibernate/Hibernate Mapping DTD 3.0//EN"
        "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">
<hibernate-mapping package="com.test.adapter.persistence.bean">
    <class name="MediaLayoutConfigPojo" table="medialayoutconfig">
        <id name="dbId" type="int" column="id">
            <generator class="native" />
        </id>
        <property name="layoutxml" type="clob" column="layoutxml" /> 
        <property name="name" type="string" column="name" />
        <property name="description" type="string" column="description" />
        <property name="layouttype" type="string" column="layouttype" />    
    </class>
</hibernate-mapping>

SQL driver class is

<property name="connection.driver_class">com.informix.jdbc.IfxDriver</property>
<property name="dialect">org.hibernate.dialect.InformixDialect</property>

I am using the ifxjdbc-3.50.JC9.jar JDBC driver (unfortunately, I'm stuck with it; upgrading it may cause additional breakages since it's a legacy application).

Hibernate version used is: hibernate-core-3.6.3.Final.jar

I went through this link https://hibernate.atlassian.net/browse/HHH-6127. Not sure If I am hitting the same issue in driver class.

My table definition and sample data I am trying store and what exactly stored in given below:

Here is my table definition:

Column name          Type                                    Nulls

id                   serial                                  no
layouttype           varchar(32,0)                           no
name                 lvarchar(256)                           no
description          lvarchar(512)                           yes
layoutxml            clob                                    no

This the data I am trying to insert into CLOB layoutxml:

<MediaPropertiesLayout><name>Система</name><description></description><header><entry><showInPopOver/><displayName>Система</displayName><mediaProperty>callVariable1</mediaProperty><uiEditable>false</uiEditable></entry></header><column></column><column></column><uri></uri></MediaPropertiesLayout>

Actual data stored in layoutxml (CLOB)

<MediaPropertiesLayout><name>Система</name><description></description><header><entry><showInPopOver/><displayName>Система</displayName><mediaProperty>callVariable1</mediaProperty><uiEditable>false</uiEditable></entry></header><column></column><column></column><uri>

i.e. Closing tags for uri and MediaPropertiesLayout is truncated. This will vary depending upon the UTF-8 characters I use. If I use larger string instead of "Система", then truncation will be at different place.

Any pointer will be great help.

CodePudding user response:

Are you sure you are passing a correct UTF8 value? Usually when you get truncation is because the encoding is wrong.

I tried inserting the UTF8 string 'Füße' (46 C3BC C39F 65) with a simple Java console app and it behaves as expected.

D:\Infx\ids1410>chcp 65001
Active code page: 65001

D:\Infx\ids1410>java com.informix.jdbc.Version
IBM Informix JDBC Driver Version 3.50.JC9X3

D:\Infx\ids1410>javac -encoding utf8 clob.java

D:\Infx\ids1410>od -tcx1 clob.txt
0000000000     F               e
              46  C3  BC  C3  9F  65
0000000006
      
D:\Infx\ids1410>java -Dfile.encoding=UTF-8 clob

--Füße--Füße--

D:\Infx\ids1410>oncheck -pp utf8:t1 256
addr             stamp    chksum nslots flag type         frptr frcnt next     prev
2:4566           36224780 acf0   1      1    DATA         106   3982  0        0
        slot ptr   len   flg
        1    24    82    0
slot   1:
    0:  6 46 c3 bc c3 9f 65  0  0  0  1  0  0  0 d9 c8   .FC<C.e.......YH
   16: b7 a6  3  0  0  0  3  0  0  0  3  0  0  0 a1 f8   7&............!x
   32: 10 61  0  0  0  0  1  0  0  0  0  0  0  0  0  0   .a..............
   48:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0   ................
   64:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0   ................
   80:  0  0                                             ................

D:\Infx\ids1410>dbaccess utf8 -

Database selected.

> unload to t1.unl select * from t1;

1 row(s) unloaded.

> !od -x t1.unl
0000000000      C346    C3BC    659F    307C    362C    632C    6F6C    3162
0000000020      3031    2E66    6138    7C31    000A
0000000031
> !cat t1.unl
Füße|0,6,clob110f.8a1|
> !od -x clob110f.8a1
0000000000      C346    C3BC    659F
0000000006
> !cat clob110f.8a1
Füße>
>

I don't have 3.50.JC9 at hand, but 3.50.JC8 and 3.50.JC9X3 did the same so I'm sure the 'vanilla' 3.50.JC9 will work too.

Maybe you can try the console app with your xml data (make sure it is in UTF8) and see if it still gets corrupted or not.

// ----- clob.java -----
import java.sql.*;
import java.io.*;
import com.informix.jdbc.*; 

public class clob {
  
  public static void main(String[] args) {

    Connection con = null; 

    try {

        InputStreamReader isr = new InputStreamReader( System.in );
        BufferedReader stdin = new BufferedReader( isr );

        Class.forName("com.informix.jdbc.IfxDriver");
        con = DriverManager.getConnection("jdbc:informix-sqli://420ito:9091/utf8:INFORMIXSERVER=ids1410;user=informix;password=mypassw;DB_LOCALE=en_US.utf8;SQLIDEBUG=pp");

        System.out.println ("");

        Statement stm1 = con.createStatement();
        try {
                stm1.executeUpdate ("drop table  t1;"); 
        } catch (Exception ex) {}
        
        stm1.executeUpdate ("create table t1  (c1 varchar(20), c2 clob);");
          
        PreparedStatement ps = con.prepareStatement("insert into  t1 values ('Füße',?);");
        
        FileInputStream fis = new FileInputStream("clob.txt");
        BufferedInputStream dataStream = new BufferedInputStream(fis);
        ps.setAsciiStream(1, dataStream, dataStream.available());
        ps.execute();
        ps.close();

        PreparedStatement ps2 = con.prepareStatement("select * from t1;");
        IfxResultSet resultSet = (IfxResultSet) ps2.executeQuery();
        while (resultSet.next()) {
            System.out.format("--%s--%s--\n",resultSet.getString(1),resultSet.getString(2));
            try {
            } catch ( Exception ex ) {};
        }

        resultSet.releaseBlob();
        ps2.close();
        
        ((IfxSqliConnect) con).releaseBlob();
      
    } catch (Exception ex) {
      ex.printStackTrace();
      if (ex instanceof SQLException) {
        SQLException sqlex = (SQLException) ex;
        System.out.println ("Message: "   sqlex.getMessage());
    }
   }
  }
}
// ----- clob.java -----

If it works the issue maybe in how Hibernate handles that data.

CodePudding user response:

Issue still exists in creating Clob in Hibernate. Somehow, Hibernate not able to calculate the string length based on UTF-8 characters. When you have these kind of special character, you need layoutXML.getBytes("utf-8").length instead layoutXML.length(). This difference is actually causing the truncation.

However, I skipped the entire conversion overhead all together, just by declaring 'layoutxml' as String in POJO instead of java.sql.Clob.

  • Related