Question about detecting a file's encoding in Python 3

Time:09-24

Win10
PyCharm Community Edition
Python 3
Latest chardet, installed with pip
The attached TXT file is just random keyboard mashing
The source program is as follows:

import chardet
text_line = open("fp1.txt").read()
print(type(text_line))
print(chardet.detect(text_line))

The result:
Traceback (most recent call last):
  File "text.py", line 4, in <module>
    print(chardet.detect(text_line))
  File "D:\python\lib\site-packages\chardet\__init__.py", line 34, in detect
    '{0}'.format(type(byte_str)))
TypeError: Expected object of type bytes or bytearray, got: <class 'str'>

Process finished with exit code 1

But if I write the second line as
text_line = open("fp1.txt").read().encode()
it runs without error. What is the point of detecting the encoding then?
Does chardet.detect really not accept the str type? How am I supposed to detect the encoding in that case?

CodePudding user response:

It works fine when I test it in my environment.

CodePudding user response:

Opening the file in 'rb' mode solves the problem.

CodePudding user response:

First, be clear that chardet detects the encoding, not the data type. Second, encode() just converts the string into bytes.
The detection can only accept bytes, not str. For example:

import json
import chardet


# A non-ASCII value, so that the two results below can differ
dict1 = {'city': '北京', 'name': 'xaio'}


# json.dumps escapes non-ASCII characters by default, so the output is pure ASCII
jsondict = json.dumps(dict1)

print('jsondict=', jsondict)


# ensure_ascii=False disables the escaping, so non-ASCII characters are kept as-is
jsondict1 = json.dumps(dict1, ensure_ascii=False)
print(jsondict1)

# ASCII
ss = chardet.detect(json.dumps(dict1).encode())
print(ss)


# utf-8
ss = chardet.detect(json.dumps(dict1, ensure_ascii=False).encode())
print(ss)
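
To illustrate why calling encode() on text that has already been read does not help, here is a minimal sketch using only the standard library. The temporary file name sample.txt and the choice of GBK are assumptions made for the demonstration; they are not from the original post.

```python
import os
import tempfile

original = "北京"
raw_gbk = original.encode("gbk")  # the bytes a GBK-encoded file would hold on disk

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(raw_gbk)

    # Text mode: Python decodes the bytes and returns a str.
    with open(path, encoding="gbk") as f:
        text = f.read()

    # Binary mode: the bytes come back exactly as stored on disk.
    with open(path, "rb") as f:
        raw = f.read()

# str.encode() with no argument uses UTF-8, whatever the file's encoding was.
reencoded = text.encode()

print(type(text), type(raw))
print(raw == raw_gbk)    # the binary read preserves the file's real bytes
print(reencoded == raw)  # the re-encoded bytes are not the file's bytes
```

Only the bytes from the 'rb' read still carry the file's real encoding, so those are the bytes worth passing to a detector.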



CodePudding user response:

In the second line, text_line = open("fp1.txt").read().encode(), the call to encode() converts the str back into bytes using Python's default encoding (UTF-8 in Python 3, not ASCII), so chardet ends up detecting the encoding of those re-encoded bytes rather than of the file itself. Read the file in 'rb' mode instead:
encoding = chardet.detect(open(input_path, 'rb').read())['encoding']
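
The binary-mode approach could be wrapped in a small helper along the following lines. This is only a sketch: the function name read_text_detecting_encoding, the detect parameter, and the fallback default are hypothetical and not part of chardet. It relies on chardet.detect returning a dict whose 'encoding' entry may be None when no guess can be made.

```python
def read_text_detecting_encoding(path, detect=None, fallback="utf-8"):
    """Read a file as bytes, guess its encoding, and return (text, encoding).

    Hypothetical helper. `detect` is any callable that takes bytes and
    returns a dict with an 'encoding' key; it defaults to chardet.detect.
    """
    if detect is None:
        import chardet  # third-party package, imported only when needed
        detect = chardet.detect

    # Binary mode gives the detector the raw bytes it requires,
    # and the with-block closes the file afterwards.
    with open(path, "rb") as f:
        raw = f.read()

    guess = detect(raw)
    # The detector may report None; fall back to an assumed default then.
    encoding = guess.get("encoding") or fallback
    return raw.decode(encoding), encoding
```

A call such as text, encoding = read_text_detecting_encoding("fp1.txt") would then return the decoded text together with the guessed encoding. Detection is a statistical guess, so very short or random input may still be misidentified.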