Multi-byte character sets, Unicode conversion, and truncating a string to a fixed number of characters

Case 1: multi-byte character set (GB2312)
When printing a string, the requirement is to first truncate it so that only 92 characters are kept, and then print it into a designated (fixed) area.
In GB2312 an English letter counts as one character (one byte) and a Chinese character counts as two (two bytes), so the cut point is easy to find.

A simple example, truncating to 10 characters (if the cut would fall in the middle of a Chinese character, step back and cut before it):
SRC="https://bbs.csdn.net/topics/123456789" (before truncation)
Des="123456789" (after truncation)
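The byte-level rule described in case 1 can be sketched as follows. This is only a minimal illustration, written in Python rather than the C/C++ the question seems to use; the function name `truncate_gbk_bytes` and the sample string are assumptions made for the example, not taken from the post.

```python
def truncate_gbk_bytes(data: bytes, limit: int) -> bytes:
    """Keep at most `limit` bytes of GB2312/GBK data without splitting a character."""
    i = 0
    n = len(data)
    while i < n:
        # A byte below 0x80 is a one-byte ASCII character;
        # anything else is treated as the lead byte of a two-byte character.
        step = 1 if data[i] < 0x80 else 2
        # Stop if this character would exceed the limit or run past the data.
        if i + step > limit or i + step > n:
            break
        i += step
    return data[:i]


# Hypothetical sample: nine ASCII digits followed by one Chinese character (11 bytes).
src = "123456789中".encode("gbk")
des = truncate_gbk_bytes(src, 10)
print(des.decode("gbk"))
```

With a limit of 10, the Chinese character would occupy bytes 10 and 11, so the cut steps back and only the nine digits are kept.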

The requirement has now changed to use UTF-16.
Case 2: Unicode (UTF-16)
When printing a string, the requirement is to first truncate it so that only 92 characters are kept, and then print it into a designated (fixed) area.
SRC=https://bbs.csdn.net/topics/L "0" in 123456789 (before truncation)
Des=L"123456789" (after truncation)

Goal: the string truncated from UTF-16 should be the same as the one produced in the GB2312 case.
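One way to get the same result from a Unicode string is to charge each character the number of bytes it would occupy in GBK and stop before the budget is exceeded. The sketch below assumes that approach; `truncate_by_gbk_width` is a hypothetical helper name. Python strings iterate by code point, so in C/C++ with UTF-16 `wchar_t` a surrogate pair would need to be handled as one unit.

```python
def truncate_by_gbk_width(text: str, limit: int) -> str:
    """Truncate a Unicode string so its GBK encoding fits in `limit` bytes."""
    used = 0
    kept = []
    for ch in text:
        # Width = number of bytes this character takes in GBK.
        # errors="replace" turns unencodable characters into "?" (one byte).
        width = len(ch.encode("gbk", errors="replace"))
        if used + width > limit:
            break
        used += width
        kept.append(ch)
    return "".join(kept)


# Hypothetical sample: same text as a GBK byte string would hold.
print(truncate_by_gbk_width("123456789中", 10))
```

Because the width comes from the GBK encoder itself, no hand-written table of Unicode ranges is needed for this check.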

Question:
1. Apart from codes 0-127, does every other code in GBK (and especially in GB18030) count as two characters?
2. For Unicode code points from 0x0080 upward that also exist in GB18030, do they all count as two characters?
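Both questions can be checked empirically with an encoder. GBK uses one or two bytes per character, while GB18030 adds four-byte codes, so "everything above 127 is two bytes" does not hold for GB18030. The sketch below uses Python's built-in `gbk` and `gb18030` codecs; the helper names and the sample characters are chosen only for illustration.

```python
from typing import Optional


def gb18030_len(ch: str) -> int:
    """Number of bytes one character occupies in GB18030."""
    return len(ch.encode("gb18030"))


def gbk_len(ch: str) -> Optional[int]:
    """Number of bytes in GBK, or None if GBK cannot encode the character."""
    try:
        return len(ch.encode("gbk"))
    except UnicodeEncodeError:
        return None


# ASCII letter, Chinese character, degree sign, U+0080, a CJK Extension B character.
for ch in ["A", "中", "°", "\u0080", "\U00020000"]:
    print(f"U+{ord(ch):04X}  gbk={gbk_len(ch)}  gb18030={gb18030_len(ch)}")
```

U+0080 itself and characters outside the Basic Multilingual Plane are examples of four-byte GB18030 codes.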

CodePudding user response:

Reference: https://blog.csdn.net/linyuanxing/article/details/3453650

CodePudding user response:

My question is about converting between double-byte and single-byte characters: I want to know how they correspond to each other.
For example, the symbol ° is A1E3 in GBK, a double-byte character, but 0x00B0 in Unicode, a single character.

The usual approach is to treat any code point inside the Unicode Chinese character range as two characters directly, but how should the conversion be decided for characters outside that range?
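The ° example shows why a range check on the code point is not enough: the mapping between GBK and Unicode is a table lookup, and the Unicode value says nothing about the GBK byte count. A small sketch, assuming Python's `gbk` codec as the conversion table; `is_cjk_ideograph` and `gbk_width` are hypothetical helpers, and the range U+4E00 to U+9FFF is only the basic CJK Unified Ideographs block.

```python
def is_cjk_ideograph(ch: str) -> bool:
    """Range check against the basic CJK Unified Ideographs block only."""
    return "\u4e00" <= ch <= "\u9fff"


def gbk_width(ch: str) -> int:
    """Bytes the character takes in GBK ("?" is substituted if unencodable)."""
    return len(ch.encode("gbk", errors="replace"))


degree = bytes.fromhex("a1e3").decode("gbk")   # GBK A1E3 -> Unicode
print(hex(ord(degree)))                         # code point of the decoded character
print(is_cjk_ideograph(degree), gbk_width(degree))
```

For ° the range check reports that it is not a Chinese character, yet it still takes two bytes in GBK, so asking the encoder is the safer way to decide.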

CodePudding user response:

https://m.baidu.com/from=1012852q/bd_page_type=1/ssid=0/uid=0/pu=usm%401%2Csz%40224_220%2Cta%40iphone___11_13.9/baiduid=4EB47D13416CC999D38774F15458E460/w=0_10_/t=iphone/l=1/tc? Ref=www_iphone & amp; Lid=10497446139898289207 & amp; The order=1 & amp; FM=alop& IsAtom=1 & amp; Is_baidu=0 & amp; Dict=1 & amp; Tj=bk_polysemy_1_0_10_lNaN & amp; Clk_info=% 7 b % 22 tplname % 3 a % 22 bk_polysemy 22% % 22% 2 c % 22 srcid % 22% % 3 a1547 7 d & amp; Wd=& amp; Eqid=91 ae6ba5796edc371000000160222225 & amp; W_qd=IlPT2AEptyoA_yk66wEaqwK64lxSbXjioUdnse7 - & amp; Bdver=2 & amp; Tcplug=1 & amp; The SEC=10013 & amp; Di=f22a097a700c47d2 & amp; Bdenc=1 & amp; TCH=124.0.246.204.0.0 & amp; nsrc=https://bbs.csdn.net/topics/FydYV5L2%2FeGw6EM0C%2BbozP%2BEFpMajWLBTNRq2ogTUyRV8Rf2c5CLm4zwwW1OQtMrERScOGdFDJqOC3dyOioI9UR80W%2F1t%2FXjpxuc0Oxo2YtCUCxlNvVPShpym%2FGFHJ3pzoAY6jV5J%2F%2FKeZuFaScj%2FQ%3D%3D

CodePudding user response:

Well, let me put the question this way:
In GB18030, does anything Chinese count as two characters, no matter how many bytes it takes?
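If "two characters" here means two columns in the fixed print area, one possible approach is to look at the Unicode East Asian Width property instead of the byte count. The sketch below uses Python's standard `unicodedata` module; the function `cell_width` and the choice to optionally treat "Ambiguous" characters (such as °) as wide are assumptions for illustration, not something stated in the thread.

```python
import unicodedata


def cell_width(ch: str, ambiguous_wide: bool = False) -> int:
    """Display columns for one character, based on East Asian Width."""
    kind = unicodedata.east_asian_width(ch)
    if kind in ("W", "F"):          # Wide or Fullwidth
        return 2
    if kind == "A" and ambiguous_wide:
        return 2                    # legacy CJK contexts often render these wide
    return 1


for ch in ["A", "中", "\U00020000", "°"]:
    print(f"U+{ord(ch):04X}", len(ch.encode("gb18030")), cell_width(ch, ambiguous_wide=True))
```

A CJK Extension B character takes four bytes in GB18030 but is still classified as wide, so by this measure it occupies two columns regardless of its byte length.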