I am facing a problem with my Informix db while storing CLOB data with UTF-8 charset data. The regular varchar datatype with UTF-8 characters is storing without any issue; however, when I store the data as a CLOB and data (in my case XML) contains utf-8 characters, only truncated XML is stored.
Example Characters: german umlaut
Informix db character encoding is set to utf-8.
Below is my hbm file
<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC
"-//Hibernate/Hibernate Mapping DTD 3.0//EN"
"http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">
<hibernate-mapping package="com.test.adapter.persistence.bean">
<class name="MediaLayoutConfigPojo" table="medialayoutconfig">
<id name="dbId" type="int" column="id">
<generator class="native" />
</id>
<property name="layoutxml" type="clob" column="layoutxml" />
<property name="name" type="string" column="name" />
<property name="description" type="string" column="description" />
<property name="layouttype" type="string" column="layouttype" />
</class>
</hibernate-mapping>
SQL driver class is
<property name="connection.driver_class">com.informix.jdbc.IfxDriver</property>
<property name="dialect">org.hibernate.dialect.InformixDialect</property>
I am using the ifxjdbc-3.50.JC9.jar JDBC driver (unfortunately, I'm stuck with it; upgrading it may cause additional breakages since it's a legacy application).
Hibernate version used is: hibernate-core-3.6.3.Final.jar
I went through this link https://hibernate.atlassian.net/browse/HHH-6127. Not sure If I am hitting the same issue in driver class.
My table definition and sample data I am trying store and what exactly stored in given below:
Here is my table definition:
Column name Type Nulls
id serial no
layouttype varchar(32,0) no
name lvarchar(256) no
description lvarchar(512) yes
layoutxml clob no
This the data I am trying to insert into CLOB layoutxml:
<MediaPropertiesLayout><name>Система</name><description></description><header><entry><showInPopOver/><displayName>Система</displayName><mediaProperty>callVariable1</mediaProperty><uiEditable>false</uiEditable></entry></header><column></column><column></column><uri></uri></MediaPropertiesLayout>
Actual data stored in layoutxml (CLOB)
<MediaPropertiesLayout><name>Система</name><description></description><header><entry><showInPopOver/><displayName>Система</displayName><mediaProperty>callVariable1</mediaProperty><uiEditable>false</uiEditable></entry></header><column></column><column></column><uri>
i.e. Closing tags for uri
and MediaPropertiesLayout
is truncated. This will vary depending upon the UTF-8 characters I use. If I use larger string instead of "Система", then truncation will be at different place.
Any pointer will be great help.
CodePudding user response:
Are you sure you are passing a correct UTF8 value? Usually when you get truncation is because the encoding is wrong.
I tried inserting the UTF8 string 'Füße' (46 C3BC C39F 65) with a simple Java console app and it behaves as expected.
D:\Infx\ids1410>chcp 65001
Active code page: 65001
D:\Infx\ids1410>java com.informix.jdbc.Version
IBM Informix JDBC Driver Version 3.50.JC9X3
D:\Infx\ids1410>javac -encoding utf8 clob.java
D:\Infx\ids1410>od -tcx1 clob.txt
0000000000 F e
46 C3 BC C3 9F 65
0000000006
D:\Infx\ids1410>java -Dfile.encoding=UTF-8 clob
--Füße--Füße--
D:\Infx\ids1410>oncheck -pp utf8:t1 256
addr stamp chksum nslots flag type frptr frcnt next prev
2:4566 36224780 acf0 1 1 DATA 106 3982 0 0
slot ptr len flg
1 24 82 0
slot 1:
0: 6 46 c3 bc c3 9f 65 0 0 0 1 0 0 0 d9 c8 .FC<C.e.......YH
16: b7 a6 3 0 0 0 3 0 0 0 3 0 0 0 a1 f8 7&............!x
32: 10 61 0 0 0 0 1 0 0 0 0 0 0 0 0 0 .a..............
48: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ................
64: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ................
80: 0 0 ................
D:\Infx\ids1410>dbaccess utf8 -
Database selected.
> unload to t1.unl select * from t1;
1 row(s) unloaded.
> !od -x t1.unl
0000000000 C346 C3BC 659F 307C 362C 632C 6F6C 3162
0000000020 3031 2E66 6138 7C31 000A
0000000031
> !cat t1.unl
Füße|0,6,clob110f.8a1|
> !od -x clob110f.8a1
0000000000 C346 C3BC 659F
0000000006
> !cat clob110f.8a1
Füße>
>
I don't have 3.50.JC9 at hand, but 3.50.JC8 and 3.50.JC9X3 did the same so I'm sure the 'vanilla' 3.50.JC9 will work too.
Maybe you can try the console app with your xml data (make sure it is in UTF8) and see if it still gets corrupted or not.
// ----- clob.java -----
import java.sql.*;
import java.io.*;
import com.informix.jdbc.*;
public class clob {
public static void main(String[] args) {
Connection con = null;
try {
InputStreamReader isr = new InputStreamReader( System.in );
BufferedReader stdin = new BufferedReader( isr );
Class.forName("com.informix.jdbc.IfxDriver");
con = DriverManager.getConnection("jdbc:informix-sqli://420ito:9091/utf8:INFORMIXSERVER=ids1410;user=informix;password=mypassw;DB_LOCALE=en_US.utf8;SQLIDEBUG=pp");
System.out.println ("");
Statement stm1 = con.createStatement();
try {
stm1.executeUpdate ("drop table t1;");
} catch (Exception ex) {}
stm1.executeUpdate ("create table t1 (c1 varchar(20), c2 clob);");
PreparedStatement ps = con.prepareStatement("insert into t1 values ('Füße',?);");
FileInputStream fis = new FileInputStream("clob.txt");
BufferedInputStream dataStream = new BufferedInputStream(fis);
ps.setAsciiStream(1, dataStream, dataStream.available());
ps.execute();
ps.close();
PreparedStatement ps2 = con.prepareStatement("select * from t1;");
IfxResultSet resultSet = (IfxResultSet) ps2.executeQuery();
while (resultSet.next()) {
System.out.format("--%s--%s--\n",resultSet.getString(1),resultSet.getString(2));
try {
} catch ( Exception ex ) {};
}
resultSet.releaseBlob();
ps2.close();
((IfxSqliConnect) con).releaseBlob();
} catch (Exception ex) {
ex.printStackTrace();
if (ex instanceof SQLException) {
SQLException sqlex = (SQLException) ex;
System.out.println ("Message: " sqlex.getMessage());
}
}
}
}
// ----- clob.java -----
If it works the issue maybe in how Hibernate handles that data.
CodePudding user response:
Issue still exists in creating Clob in Hibernate. Somehow, Hibernate not able to calculate the string length based on UTF-8 characters. When you have these kind of special character, you need layoutXML.getBytes("utf-8").length
instead layoutXML.length()
. This difference is actually causing the truncation.
However, I skipped the entire conversion overhead all together, just by declaring 'layoutxml' as String in POJO instead of java.sql.Clob.