Convert UTF-16 to UTF-8 using Python

Time:03-17

I am trying to convert a huge CSV file from UTF-16 to UTF-8 using Python. Here is the code:

with open(r'D:\_apps\aaa\output\srcfile', 'rb') as source_file:
    with open(r'D:\_apps\aaa\output\destfile', 'w+b') as dest_file:
        # Reads the entire file into memory at once
        contents = source_file.read()
        dest_file.write(contents.decode('utf-16').encode('utf-8'))

But this code uses a lot of memory and fails with a MemoryError. Please suggest an alternative method.

CodePudding user response:

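One option is to convert the file in fixed-size chunks. Iterating over a binary file line by line does not work for UTF-16, because the split lands in the middle of the two-byte newline; an incremental decoder handles chunk boundaries correctly:

import codecs

decoder = codecs.getincrementaldecoder('utf-16')()
with open(r'D:\_apps\aaa\output\srcfile', 'rb') as source_file, \
        open(r'D:\_apps\aaa\output\destfile', 'wb') as dest_file:
    # Read 1 MiB at a time; the decoder buffers any incomplete character
    for chunk in iter(lambda: source_file.read(1024 * 1024), b''):
        dest_file.write(decoder.decode(chunk).encode('utf-8'))
    dest_file.write(decoder.decode(b'', final=True).encode('utf-8'))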
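
To see how binary-mode handling interacts with UTF-16, here is a small self-contained check. It is only a sketch: the sample text, the 3-byte slice size, and the little-endian byte order are assumptions chosen for illustration. It compares decoding one line obtained by binary iteration with feeding arbitrary slices to an incremental decoder.

```python
import codecs
import io

text = "a,b\nc,d\n"
# UTF-16 little-endian with a byte order mark (assumed layout for this demo).
data = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Binary-mode iteration splits after each 0x0A byte, which is only the
# first half of the two-byte newline, so every line ends mid-character.
lines = list(io.BytesIO(data))
try:
    lines[0].decode("utf-16")
    first_line_ok = True
except UnicodeDecodeError:
    first_line_ok = False

# An incremental decoder buffers incomplete code units between calls,
# so it tolerates arbitrary split points (3-byte slices here).
decoder = codecs.getincrementaldecoder("utf-16")()
decoded = "".join(decoder.decode(data[i:i + 3]) for i in range(0, len(data), 3))
decoded += decoder.decode(b"", final=True)
```

The same decoder object has to be reused for the whole file, since it also remembers the byte order read from the BOM.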

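Or you could open both files in text mode with the desired encodings and let Python do the conversion. Note that the encoding argument is only accepted in text mode, not with 'rb' or 'wb':

with open(r'D:\_apps\aaa\output\srcfile', 'r', encoding='utf-16', newline='') as source_file, \
        open(r'D:\_apps\aaa\output\destfile', 'w', encoding='utf-8', newline='') as dest_file:
    # newline='' keeps the original line endings unchanged
    for line in source_file:
        dest_file.write(line)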
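
If the file might contain very long lines, reading a fixed number of characters at a time keeps memory use bounded regardless of where the newlines fall. The following is a minimal sketch of that idea; the function name convert_utf16_to_utf8 and the chunk_chars parameter are hypothetical, not part of any library.

```python
def convert_utf16_to_utf8(src_path, dst_path, chunk_chars=1024 * 1024):
    """Re-encode a UTF-16 text file as UTF-8, one chunk in memory at a time."""
    # newline="" turns off newline translation so line endings are copied as-is.
    with open(src_path, "r", encoding="utf-16", newline="") as src, \
            open(dst_path, "w", encoding="utf-8", newline="") as dst:
        while True:
            chunk = src.read(chunk_chars)  # at most chunk_chars characters
            if not chunk:
                break
            dst.write(chunk)
```

It would be called with the two paths from the question, for example convert_utf16_to_utf8(r'D:\_apps\aaa\output\srcfile', r'D:\_apps\aaa\output\destfile'). The 'utf-16' codec reads the byte order from the BOM, and the output is written without a BOM.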