I'm following a YouTube tutorial. The author uses Python 3.8.2, and I have 3.10.4 installed. He types something like this and it works just fine:
r = open('file.txt', 'a')
r.write('something' '\n')
r.write('что-то')
r.close()
If I do the same, I get a UnicodeEncodeError:
Traceback (most recent call last):
  File "C:\Users\small\Desktop\test.py", line 3, in <module>
    r.write('что-то')
  File "C:\Python310\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-2: character maps to <undefined>
and I'm forced to declare the encoding when opening the file, like this:
r = open('file.txt', 'a', encoding='utf-8')
r.write('something' '\n')
r.write('что-то')
r.close()
I'm mainly interested in two questions:
- Is this happening because of a difference in OS versions (I have the latest Windows 10), in Python versions, or something else?
- Is there a way to fix this permanently? I thought about declaring the encoding once at the start of the program, but then it becomes inflexible when strings come from different sources that are not in that base encoding. In that case I would be forced to add lots of checks for the encoding and convert everything to UTF-8, for example. That doesn't look like the right solution.
CodePudding user response:
The default encoding for the open function is platform dependent:
On Unix, it is the encoding of the LC_CTYPE locale. It can be set with locale.setlocale(locale.LC_CTYPE, new_locale).
On Windows, it is the ANSI code page (e.g. cp1252).
So yes, it's because of the OS differences, not the Python version. It is a good habit to always specify the encoding when writing platform-independent code.
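To see which encoding your own machine picks, and why cp1252 fails on the Cyrillic string, something like the following sketch may help. It only uses the standard library; the variable names are made up for illustration, and the first printed value will differ between systems.

```python
import locale

# Encoding that open() falls back to when no encoding argument is passed.
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)

text = 'что-то'

# cp1252 has no Cyrillic letters, so encoding fails as in the traceback above.
try:
    text.encode('cp1252')
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

# UTF-8 can represent every Unicode character, so this always succeeds.
utf8_bytes = text.encode('utf-8')
print(cp1252_ok, len(utf8_bytes))
```

If the first line prints cp1252 (or another single-byte code page), any open call without an explicit encoding will hit the same error for characters outside that code page.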
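You can also make it permanent by enabling the Python UTF-8 mode (the -X utf8 command-line option or the PYTHONUTF8=1 environment variable, available since Python 3.7).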
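A rough way to check whether UTF-8 mode is active is to look at sys.flags.utf8_mode. The sketch below checks the current interpreter and then starts a child interpreter with -X utf8 to show the flag switching on; the variable names are hypothetical, and it assumes Python 3.7 or newer.

```python
import subprocess
import sys

# 1 if this interpreter runs in UTF-8 mode, 0 otherwise.
current_mode = sys.flags.utf8_mode
print(current_mode)

# Start a child interpreter with UTF-8 mode forced on and read its flag back.
result = subprocess.run(
    [sys.executable, '-X', 'utf8', '-c', 'import sys; print(sys.flags.utf8_mode)'],
    capture_output=True,
    text=True,
    check=True,
)
child_output = result.stdout.strip()
print(child_output)
```

On Windows, setting PYTHONUTF8=1 as a user environment variable should have the same effect for every script without changing how they are launched.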