Import CSV file error: pymysql.err.InternalError: (1054, "Unknown column 'nan' in ...

Time:09-24

import os
import pymysql
import pandas as pd

# 1. Connect to the MySQL database
try:
    conn = pymysql.connect(host='localhost', user='root', password='123', db='newbigdata', charset='utf8')
    cur = conn.cursor()
    print('Database connection succeeded!')
    print()
except pymysql.MySQLError:
    print('Database connection failed!')
    raise  # nothing below can run without a connection

# 2. Read the CSV files in a folder
# Get the directory the program runs in and the names of all files in it
path = os.getcwd()
files = os.listdir(path)
# Iterate over all the files
i = 0
for file in files:
    # Check whether the file is a CSV file
    if file.split('.')[-1] in ['csv']:
        i += 1
        # Build the table name used in the SQL statements later on
        filename = file.split('.')[0]

        # Read the whole CSV file with pandas; the result f is a DataFrame,
        # the tabular data structure pandas uses to store data
        f = pd.read_csv(file, encoding="utf-8")  # note: if this raises an encoding error, try another encoding
        # print(f)

        # 3. Work out the field names and field types for the CREATE TABLE fragment

        # 3.1 Take the DataFrame's header row (the column names) as the field names in the SQL statements
        columns = f.columns.tolist()
        # print(columns)

        # 3.2 Map each column's pandas dtype to a MySQL field type
        types = f.dtypes
        field = []  # receives the field names
        table = []  # receives "field name + field type" entries
        for item in columns:
            if 'int' in str(types[item]):
                char = item + ' INT'
            elif 'float' in str(types[item]):
                char = item + ' FLOAT'
            elif 'object' in str(types[item]):
                char = item + ' VARCHAR(255)'
            elif 'datetime' in str(types[item]):
                char = item + ' DATETIME'
            else:
                char = item + ' VARCHAR(255)'
            table.append(char)
            field.append(item)

        # 3.3 Build the SQL statement fragments
        # 3.3.1 Join the table list with commas: "field name + field type" pairs for the CREATE TABLE statement
        tables = ', '.join(table)
        # print(tables)

        # 3.3.2 Join the field list with commas: the field names for the INSERT statement
        fields = ', '.join(field)  # field names
        # print(fields)

        # 4. Create the database table
        # 4.1 If the table already exists, drop it first
        cur.execute('drop table if exists {};'.format(filename))
        conn.commit()

        # 4.2 Build the CREATE TABLE statement
        # table_sql = 'CREATE TABLE IF NOT EXISTS ' + filename + '(' + 'id0 int PRIMARY KEY NOT NULL auto_increment, ' + tables + ');'
        table_sql = 'CREATE TABLE IF NOT EXISTS ' + filename + '(' + tables + ');'
        # print('table_sql is: ' + table_sql)

        # 4.3 Create the table
        print('Table: ' + filename + ', starting to create...')
        cur.execute(table_sql)
        conn.commit()
        print('Table: ' + filename + ', created successfully!')

        # 5. Insert the data into the database table
        print('Table: ' + filename + ', starting to insert data...')

        # 5.1 Convert the DataFrame to a list of lists (one inner list per row)
        # so that it can be bulk-inserted into the table
        values = f.values.tolist()  # all the data
        # print(values)

        # 5.2 Count the DataFrame's fields and use one %s placeholder per field
        s = ', '.join(['%s' for _ in range(len(f.columns))])
        # print(s)

        # 5.3 Build the INSERT statement
        insert_sql = 'insert into {}({}) values({})'.format(filename, fields, s)
        # print('insert_sql is: ' + insert_sql)

        # 5.4 Insert the data
        cur.executemany(insert_sql, values)  # executemany bulk-inserts the rows
        conn.commit()
        print('Table: ' + filename + ', insert complete!')
        print()
print('Mission accomplished! Imported {} CSV file(s) in total.'.format(i))
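
A likely cause of the 1054 error: pandas reads empty CSV cells as float NaN, and when such a value is passed to executemany it can end up in the SQL text as a bare nan, which MySQL then interprets as a column name. A minimal sketch of one possible fix, assuming the rows come from f.values.tolist() as in step 5.1, is to replace every NaN with None so that the driver sends NULL instead. The helper name nan_to_none and the sample rows are hypothetical and only illustrate the idea.

```python
import math


def nan_to_none(rows):
    """Return a copy of rows with every float NaN replaced by None."""
    return [
        [None if isinstance(v, float) and math.isnan(v) else v for v in row]
        for row in rows
    ]


# Hypothetical rows, shaped like the output of f.values.tolist()
sample_rows = [["a", 1.5], ["b", float("nan")]]
cleaned_rows = nan_to_none(sample_rows)
print(cleaned_rows)  # [['a', 1.5], ['b', None]]
```

In the script this would mean building the rows in step 5.1 with values = nan_to_none(f.values.tolist()) before calling cur.executemany. The columns involved must allow NULL for this to work, which the table created in step 4 does because its fields carry no NOT NULL constraint.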