Help: batch-converting CSV files to XLSX in Python, output is garbled when opened

Time:11-14

When I run the code below and open the resulting Excel file, the text is garbled. The CSV contains Chinese characters, so I assume it is an encoding problem, but I have searched for a long time without finding a solution. Any help is appreciated.
Without # coding=GBK, I get this error:
(unicode error) 'utf-8' codec can't decode byte 0xc0 in position 17: invalid start byte
After adding it the conversion runs, but the file is still gibberish when opened. Please advise, thank you!

# coding=GBK
# -*- coding: utf-8 -*-
# import pandas
import pandas as pd
import os

# Convert a single CSV file to Excel: file is the CSV file name, to_file is the Excel file; sep=';' reads a semicolon-delimited CSV; error_bad_lines=False skips malformed rows
def csv_to_xlsx(file, to_file):

    data_csv = pd.read_csv(file, encoding='latin1', error_bad_lines=False, sep=';')  # sep sets the delimiter (here a semicolon); the default is a comma
    data_csv.to_excel(to_file, sheet_name='data')

# Read all files in a directory:
def read_path(path):
    dirs = os.listdir(path)
    return dirs


# Main function
def main():
    # Source file path
    source = r'C:\Users\Desktop\CSV to excel'

    # Target file path
    ob = r'C:\Users\Desktop\CSV to excel'


    # Put the full path of every file in the source folder into file_list
    file_list = [source + '\\' + i for i in read_path(source)]

    a = 0  # index into j_list; index 0 is the first CSV file name
    j_list = read_path(source)  # names of all CSV files in the folder, in order
    print("-->", read_path(source))  # read_path(source) is itself a list
    print("read_path(source) type:", type(read_path(source)))
    # Loop over the files, calling csv_to_xlsx() for each one
    for it in file_list:
        j = j_list[a]  # CSV file name at the current index
        # Build the new name for the target file
        j_mid = str(j).replace(".csv", "")  # strip the .csv suffix from the file name
        print("====", j_mid)
        j_xlsx = ob + '\\' + j_mid + ".xlsx"
        csv_to_xlsx(it, j_xlsx)
        print("######", it)
        a = a + 1


if __name__ == '__main__':
    main()
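
A side note on the file-name handling in the loop above: the script converts every entry in the folder, and str.replace(".csv", "") removes ".csv" wherever it occurs in a name. Because source and target are the same folder, a second run would also pick up the .xlsx files written by the first. The sketch below shows one way to build the target names with pathlib instead. The helper names xlsx_target and csv_files are hypothetical, not part of the original script.

```python
from pathlib import Path


def xlsx_target(csv_path, out_dir):
    """Return out_dir/<name>.xlsx for a given CSV path (hypothetical helper)."""
    # .stem drops only the final suffix, unlike str.replace(".csv", ""),
    # which would also remove ".csv" from the middle of a name.
    return Path(out_dir) / (Path(csv_path).stem + ".xlsx")


def csv_files(source_dir):
    """List the .csv files directly inside source_dir, sorted by name."""
    # Filtering on the suffix skips .xlsx files left over from an earlier run.
    return sorted(p for p in Path(source_dir).iterdir() if p.suffix.lower() == ".csv")
```

In main(), each path returned by csv_files(source) could then be passed to csv_to_xlsx together with xlsx_target(path, ob), which would make the index variable a and the second directory listing unnecessary.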

CodePudding user response:

Print data_csv and look at its contents. Isn't it already garbled at that point?
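
To illustrate why the data would already be garbled right after reading, here is a minimal sketch using only the standard library. It assumes the CSV bytes are GBK-encoded, which is common for files saved on Chinese-locale Windows but is not confirmed in this thread. The sample text is made up.

```python
# Hypothetical sample text standing in for one cell of the real CSV.
original = "中文数据"

# Assumption: the CSV was written as GBK bytes.
raw = original.encode("gbk")

# latin1 maps every byte 0x00-0xFF to exactly one character, so decoding
# never raises an error, but each Chinese character (two GBK bytes)
# turns into two unrelated Western characters.
as_latin1 = raw.decode("latin1")

# Decoding with the encoding the bytes were written in restores the text.
as_gbk = raw.decode("gbk")

print(repr(as_latin1))  # mojibake, twice as many characters as the original
print(as_gbk)           # the original text

# The damage is reversible as long as the latin1 string has not been altered.
recovered = as_latin1.encode("latin1").decode("gbk")
```

This is why encoding='latin1' makes the UnicodeDecodeError disappear without fixing anything: the read succeeds, but the characters are wrong before to_excel is ever called.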

CodePudding user response:

Quoting the 1st-floor reply from weixin_45903952:
Print data_csv and look at its contents. Is it already garbled at that point?

Yes. I added print(data_csv) after the code below, and the output is garbled:
data_csv = pd.read_csv(file, encoding='latin1', error_bad_lines=False, sep=';')  # sep sets the delimiter (here a semicolon); the default is a comma
data_csv.to_excel(to_file, sheet_name='data')
print(data_csv)  # the output is garbled

How should I change this? Thank you.
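
One possible change, sketched under assumptions rather than confirmed by anyone in this thread: pick the encoding by trying strict decodes instead of hard-coding 'latin1'. The helper name pick_encoding and the candidate list ("utf-8-sig", "gbk") are hypothetical. The script's # coding=GBK line only tells Python how to read the .py file itself and has no effect on how pandas reads the CSV. on_bad_lines="skip" is the pandas 1.3+ replacement for error_bad_lines=False, and writing .xlsx needs an Excel writer engine such as openpyxl to be installed.

```python
from pathlib import Path


def pick_encoding(path, candidates=("utf-8-sig", "gbk")):
    """Return the first candidate encoding that decodes the whole file strictly.

    The candidate list is an assumption; extend it if your files use
    something else (for example "gb18030" or "big5").
    """
    raw = Path(path).read_bytes()
    for enc in candidates:
        try:
            raw.decode(enc)  # strict decoding: raises on invalid bytes
        except UnicodeDecodeError:
            continue
        return enc
    raise ValueError(f"none of {candidates} can decode {path}")


def csv_to_xlsx(file, to_file):
    # pandas is imported here so pick_encoding can be used without it.
    import pandas as pd

    data_csv = pd.read_csv(
        file,
        encoding=pick_encoding(file),  # instead of the hard-coded 'latin1'
        sep=";",
        on_bad_lines="skip",  # pandas >= 1.3 replacement for error_bad_lines=False
    )
    data_csv.to_excel(to_file, sheet_name="data")
```

Trying UTF-8 first is deliberate: GBK-encoded Chinese text almost never happens to be valid UTF-8, whereas the reverse check is much less reliable. If print(data_csv) shows readable Chinese after this change, the Excel output should be readable as well.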