I am using Python 3 and I know all about hex, int, chr, ord, the '\uxxxx' escape and the '\U00xxxxxx' escape, and that Unicode codepoints run from 0 to 1114111 (0x10FFFF).
My question is very simple: how can I check whether a Unicode codepoint is valid? By valid I mean that the codepoint is unambiguously mapped to an authoritatively defined character.
For example, codepoint 720 is a valid Unicode codepoint: it is 0x2d0 in hex, and U+02D0 points to ː:
In [135]: hex(720)
Out[135]: '0x2d0'
In [136]: '\u02d0'
Out[136]: 'ː'
And 888 is not a valid Unicode codepoint:
In [137]: hex(888)
Out[137]: '0x378'
In [138]: '\u0378'
Out[138]: '\u0378'
And 127744 is valid:
In [139]: chr(127744)
Out[139]: '🌀'
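To make the three examples above concrete, here is a minimal sketch of the kind of check I am after. It assumes that "valid" can be approximated as "assigned in the Unicode database bundled with the running Python", using the standard unicodedata module. The function name is_assigned is hypothetical, and the result depends on which Unicode version that Python ships with.

```python
import unicodedata


def is_assigned(codepoint: int) -> bool:
    """Return True if the codepoint is assigned in Python's bundled Unicode database.

    Hypothetical helper: treats general category 'Cn' (unassigned) as invalid.
    """
    # chr() raises ValueError outside 0..0x10FFFF, so reject those up front.
    if not 0 <= codepoint <= 0x10FFFF:
        return False
    # 'Cn' is the general category reported for unassigned codepoints.
    return unicodedata.category(chr(codepoint)) != 'Cn'


# The three codepoints from the examples above.
for cp in (720, 888, 127744):
    print(cp, hex(cp), is_assigned(cp))
```

Note that this sketch still counts surrogates (category 'Cs') and private-use codepoints (category 'Co') as assigned. Whether those should count as valid depends on how strictly "authoritatively defined character" is read.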