I would like to make my own Charset in Java and then use it for encoding. It needs to include some particular symbols as well as all of the digits and the characters of 4 languages (Traditional Chinese, US English, Polish and Russian).
I looked through the Charset class but didn't really find a solution.
CodePudding user response:
Private Use Areas within Unicode
You’ve not really explained what goal you are trying to achieve, but you likely do not need to invent either of these:
- a character set (a collection of numbers each assigned to a particular character)
- a character encoding (a way to represent instances of those numbers as bits and bytes).
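To make the distinction between the two concrete, here is a minimal sketch using only the standard library. The class name `CodePointVsBytes` and its helper methods are made up for illustration. It shows that a character has one number in the character set, while the bytes used to represent that number depend on the encoding chosen.

```java
import java.nio.charset.StandardCharsets;

public class CodePointVsBytes {
    // The character-set side: the number assigned to the first character.
    static int codePointOf(String s) {
        return s.codePointAt(0);
    }

    // The encoding side: how many bytes UTF-8 needs for the text.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // The encoding side again, this time with UTF-16 (big-endian, no BOM).
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file pure ASCII.
        String cyrillic = "\u0416"; // CYRILLIC CAPITAL LETTER ZHE
        String chinese = "\u4F60";  // first character of the Chinese greeting "ni hao"

        System.out.printf("U+%04X: %d bytes in UTF-8, %d bytes in UTF-16%n",
                codePointOf(cyrillic), utf8Length(cyrillic), utf16Length(cyrillic));
        System.out.printf("U+%04X: %d bytes in UTF-8, %d bytes in UTF-16%n",
                codePointOf(chinese), utf8Length(chinese), utf16Length(chinese));
    }
}
```

The same code point yields different byte sequences under different encodings, which is why the two concepts are listed separately.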
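Unicode defines over 144,000 characters, each assigned a number (a code point) in the range from zero to just over a million (U+0000 to U+10FFFF). That leaves large gaps of numbers unassigned. Some of those empty sub-ranges are reserved for future use. But, of interest to you, some sub-ranges are set aside for “private use” and will never be assigned to a character by the Unicode Consortium: U+E000 to U+F8FF in the Basic Multilingual Plane, plus planes 15 and 16. You can assign your own symbols to those code points, and any standard Unicode encoding such as UTF-8 will carry them alongside Chinese, English, Polish and Russian text. See Wikipedia.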
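As a rough sketch of how that could look in Java, assuming you decide that U+E000 stands for one of your custom symbols (that assignment, the class name `PrivateUseDemo` and the helper methods are hypothetical, chosen only for this example), you can mix a private-use code point with ordinary text and encode it with the built-in UTF-8 charset:

```java
import java.nio.charset.StandardCharsets;

public class PrivateUseDemo {
    // Hypothetical assignment: U+E000 represents one of your custom symbols.
    static final int MY_SYMBOL = 0xE000;

    // True if Unicode classifies the code point as private use.
    static boolean isPrivateUse(int codePoint) {
        return Character.getType(codePoint) == Character.PRIVATE_USE;
    }

    // Encode the text to UTF-8 bytes, then decode those bytes back to a String.
    static String roundTrip(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file pure ASCII.
        String text = "Hello, "
                + "cze\u015B\u0107, "                       // Polish greeting
                + "\u043F\u0440\u0438\u0432\u0435\u0442, "  // Russian greeting
                + "\u4F60\u597D, "                          // Chinese greeting
                + "123 "
                + new String(Character.toChars(MY_SYMBOL)); // custom symbol

        System.out.println("Private use: " + isPrivateUse(MY_SYMBOL));
        System.out.println("Survives UTF-8 round trip: " + text.equals(roundTrip(text)));
    }
}
```

Note that the meaning of a private-use code point exists only by agreement: every program that reads the data, and any font used to display it, has to share your mapping.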