How do I find the number of bytes of a character queried from the database? Thank you

Time:05-14

drop table test;
create table test
(
name varchar(1)
) engine=myisam default charset=utf8mb4;

insert into test values ('yan');
commit;

Then I ran the following SQL and found that the Chinese character "yan" occupies 3 bytes:
mysql> select name, length(name) from test;
+------+--------------+
| name | length(name) |
+------+--------------+
| yan  |            3 |
+------+--------------+

This shows that the Chinese character is stored as UTF-8.

Now I have:
static HashMap<String, String> h_test = new HashMap<>();
String str = rs.getString("name");
h_test.put(str, "");
Is the "yan" in str and in h_test encoded as UTF-8?

How can I find out the number of bytes in str?

Thank you

CodePudding user response:

It also depends on the default encoding of the system the program runs on. For example, the default encoding on my machine is GBK, so the result is 2 bytes, but on a Mac, where the default is UTF-8, the result is 3 bytes.

System.out.println(cc.getBytes().length);
System.out.println(Charset.defaultCharset());
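
Because getBytes() without an argument uses the platform default, the count changes from machine to machine. A minimal sketch that passes the charset explicitly instead; the class name ByteCounts and the sample character U+4E25 are assumptions for illustration, not taken from the thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: counts encoded bytes with an explicitly chosen charset,
// so the result does not depend on the platform default.
final class ByteCounts {
    static int byteLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        String s = "\u4e25"; // assumed sample: a single Chinese character
        System.out.println(byteLength(s, StandardCharsets.UTF_8)); // bytes when encoded as UTF-8
        System.out.println(byteLength(s, Charset.forName("GBK"))); // bytes when encoded as GBK
        System.out.println(Charset.defaultCharset());              // what getBytes() would use
    }
}
```

The String itself has no byte count of its own; a byte count only exists once a target encoding is chosen.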

CodePudding user response:

Quoting reply #1 by PNZ. BeijingL:
It also depends on the default encoding of the system the program runs on. For example, the default encoding on my machine is GBK, so the result is 2 bytes, but on a Mac, where the default is UTF-8, the result is 3 bytes.

System.out.println(cc.getBytes().length);
System.out.println(Charset.defaultCharset());

Thank you, expert.
This feels very complicated.
1. What does the compile-time -encoding option mean?
javac -encoding GBK me.java

2. What does the runtime -Dfile.encoding option mean?
java -Dfile.encoding=utf-8 me

3. What does characterEncoding mean when connecting to the database?
characterEncoding=utf-8

So complicated! What is the difference between these three encodings, and what does each one do?
Thank you


As I understand it, encoding comes down to:
1. Which encoding rules are used to store the data
2. Which encoding rules are used to read it
3. Which encoding rules are used to display it

All I want to achieve is:
1. The characters stored in a database table may be UTF-8, or may be GBK.
Whether they are UTF-8 or GBK,
when I query the table, the data should end up as UTF-8 in a String or a HashMap.

2. The characters stored in a file may be UTF-8, or may be GBK.
Whether they are UTF-8 or GBK,
when I read the file, the data should end up as UTF-8 in a String or a HashMap.

How should I do this? Thank you

CodePudding user response:

In memory, Java holds string data as Unicode in UTF-16 form: each char is a 16-bit (two-byte) code unit.
With that in place, let's look at the other encodings involved.
Take Java source code: it is usually saved as UTF-8, and sometimes as GBK. Either is fine, because the compiler converts the string encoding when it compiles the source files. In the resulting class file, strings are stored uniformly in a (modified) UTF-8 form. That is why many people save their Java source files as UTF-8 from the start, which saves the compiler the conversion.

String encoding in the database also has to be analyzed case by case, because encoding is involved in several places:
1. The character encoding in memory during data processing: generally Unicode (UTF-16), because it is treated as a fixed-length encoding, which makes processing more convenient.
2. The character encoding in the data files on disk (it can be set when the table is created): generally UTF-8, GBK, or another encoding for the local language (use UTF-8 to store multiple languages).
3. The character encoding used in transmission: after JDBC connects to the database and your query returns strings, this is the encoding used to send them from the server to your application, generally UTF-8.
What the original poster checked should be the encoding used to store the data in the table.
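
The three places listed above can be sketched without a database. In this rough illustration, bytes as they would sit in a utf8mb4 column are decoded into a Java String (a JDBC driver does this decoding before getString returns), and the String is then re-encoded to another charset. The class name EncodingRoundTrip and the byte values (the UTF-8 form of U+4E25) are assumptions for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: stored bytes -> in-memory String -> bytes in another encoding.
final class EncodingRoundTrip {
    // Decode bytes that are known to be UTF-8 into an in-memory String.
    static String decodeUtf8(byte[] stored) {
        return new String(stored, StandardCharsets.UTF_8);
    }

    // Re-encode an in-memory String into the requested charset.
    static byte[] encode(String s, Charset target) {
        return s.getBytes(target);
    }

    public static void main(String[] args) {
        byte[] stored = {(byte) 0xE4, (byte) 0xB8, (byte) 0xA5}; // assumed: 3 stored UTF-8 bytes
        String s = decodeUtf8(stored);
        System.out.println(s.length());                               // chars held in memory
        System.out.println(encode(s, Charset.forName("GBK")).length); // bytes after re-encoding as GBK
    }
}
```

Once the value is a String, the encoding it had on disk or on the wire no longer matters; an encoding only has to be named again when the String is turned back into bytes.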


Unicode (UTF-16) and UTF-8 can both represent characters from many languages. One is effectively a fixed-length encoding (UTF-16 uses two bytes for most commonly used characters, and four bytes only for supplementary characters), the other is a variable-length encoding, and they suit different scenarios:
Fixed-length encoding is convenient for data processing: the character at a given position in a string can be located quickly.
Variable-length encoding is convenient for data transmission and storage: Chinese characters take more space (3 bytes each), but it supports multiple languages and saves space for English text.
If the strings in your table are Chinese only, setting the table's encoding to GBK or GB2312 is enough.
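
The size difference can be checked directly. A small sketch, with the class name WidthDemo and the three sample strings chosen purely for illustration: it compares the number of UTF-16 code units a String holds in memory with the number of bytes the same text needs in UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo comparing in-memory UTF-16 code units with UTF-8 byte counts.
final class WidthDemo {
    // Number of 16-bit code units (Java chars) in the string.
    static int utf16Units(String s) {
        return s.length();
    }

    // Number of bytes needed to encode the string as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of Unicode code points (actual characters).
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // Assumed samples: an ASCII letter, a Chinese character, a supplementary character (U+1F600).
        String[] samples = {"A", "\u4e25", "\uD83D\uDE00"};
        for (String s : samples) {
            System.out.println(codePoints(s) + " code point(s), "
                    + utf16Units(s) + " UTF-16 unit(s), "
                    + utf8Bytes(s) + " UTF-8 byte(s)");
        }
    }
}
```

The supplementary character is the case where UTF-16 stops being fixed length: it takes two chars, so length() and the number of characters differ.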

CodePudding user response:

The strings inside the original poster's HashMap are Unicode (UTF-16) encoded.
You need to transcode them when you write them to a file.
There are two ways:
1. Create a FileWriter (or an OutputStreamWriter) with the file encoding set to UTF-8; strings written through it are then transcoded automatically.
2. Create a FileOutputStream and write binary data (a byte array): call String.getBytes("utf-8") to get the UTF-8 encoded bytes, then write them to the output stream.
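
Both ways can be sketched as follows. The class name Utf8FileWrite, the temp-file names, and the sample character are assumptions for illustration. The sketch uses OutputStreamWriter for the first way because the FileWriter constructors that accept a Charset only exist from Java 11 onward.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of the two ways to write a String to a file as UTF-8.
final class Utf8FileWrite {
    // Way 1: a Writer with an explicit charset transcodes on write.
    static void writeWithWriter(Path file, String text) throws IOException {
        try (Writer w = new OutputStreamWriter(
                new FileOutputStream(file.toFile()), StandardCharsets.UTF_8)) {
            w.write(text);
        }
    }

    // Way 2: encode to bytes first, then write the raw bytes.
    static void writeWithStream(Path file, String text) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file.toFile())) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
    }

    // Writes text to a temp file with the chosen way and returns the file size in bytes.
    static long writtenSize(String text, boolean useWriter) {
        try {
            Path tmp = Files.createTempFile("utf8-demo", ".txt");
            try {
                if (useWriter) {
                    writeWithWriter(tmp, text);
                } else {
                    writeWithStream(tmp, text);
                }
                return Files.size(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "\u4e25"; // assumed sample: a single Chinese character
        System.out.println(writtenSize(text, true));  // size via the Writer
        System.out.println(writtenSize(text, false)); // size via raw bytes
    }
}
```

Either way produces the same bytes on disk; the Writer is usually more convenient for text, while the byte-array route is handy when the data is already being handled as binary.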

CodePudding user response:

Quoting reply #4 by ice, rain:

Thank you for the explanation.
I would like to sort this out one question at a time.
1. First, at compile time, -encoding specifies the encoding of the .java file.
The following specifies that test.java is encoded as GBK:
javac -encoding GBK test.java
The following specifies that test.java is encoded as UTF-8:
javac -encoding utf-8 test.java
If -encoding does not match the actual encoding of test.java, compilation may fail.

2. As for the source code, whether it is GBK or UTF-8 encoded, as long as it compiles successfully
it can be used on both Windows and Linux, and the encoding of the source itself no longer matters, right?


3. I have a MySQL database with a table whose data is encoded as UTF-8 (utf8mb4),
so a Chinese character takes 3 bytes.
Now I connect to the database, query the table,
and save a field value from a record into a String or a HashMap.
Assume that select name from t limit 1 returns the Chinese character for "good" as the value of name.

First, what effect do characterEncoding=utf-8 and characterEncoding=GBK below have?
DBURL = "jdbc:mysql://localhost:3306/" + dbname + "?useSSL=false&serverTimezone=Asia/Shanghai&characterEncoding=utf-8";
DBURL = "jdbc:mysql://localhost:3306/" + dbname + "?useSSL=false&serverTimezone=Asia/Shanghai&characterEncoding=GBK";

Second, since the table data is UTF-8,
what encoding is used in transit when the value is sent from the database server to a Java variable?

Finally, "good" is saved with String str = rs.getString("name").
What encoding does "good" have inside str at that point? Suppose it is UTF-16:
does Java then automatically convert the database's UTF-8 to UTF-16 for str?

Thank you, expert.