I have 999 playlists in one file, each with a list of songs. The file was flattened and converted to a pandas DataFrame. I need everything in a database, so I spun up a SQLite3 database like so:
import sqlite3

conn = sqlite3.connect('music.db')
cur = conn.cursor()
cur.execute("""CREATE TABLE entries (
    name text,
    collaborative text,
    end_date text,
    pid integer,
    modified_at integer,
    num_tracks integer,
    num_albums integer,
    num_edits integer,
    num_artists integer,
    description text,
    pos integer,
    artist_name text,
    track_uri text,
    artist_uri text,
    track_name text,
    album_uri text,
    album_name text
)""")
I need to populate the table with data from the DataFrame. I figured a for loop was the way to go, consulted the docs for the insertion syntax, and came up with the following code:
for row in df.itertuples():
    cur.executemany("INSERT INTO entries VALUES(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", [row[1], row[2], row[3], row[4], row[5], row[6], row[7], row[8], row[9], row[10], row[11], row[12], row[13], row[14], row[15], row[16], row[17]])
I used print statements to check if row data were accessible using the indexes provided; they were. Then I ran the whole thing and got this gem of an error message:
ProgrammingError: Incorrect number of bindings supplied. The current statement uses 17, and there are 9 supplied.
I'm confident in my ability to count to 17. Does anyone see what's going wrong here?
CodePudding user response:
As ahmad said, your issue is probably bad data. I suggest printing a row counter, then checking the last number printed before the error (added below). You can also check data types and skip any row with an unexpected type (also added below).
Also, don't run the SQL execution itself inside the for loop; executing one query per row hurts performance. Instead, use the for loop to build the VALUES part of your query, then execute a single multirow INSERT with the collected data:
insertRows = []
rowcount = 0  # for debugging
for row in df.itertuples():
    rowcount += 1
    print(rowcount)  # the last number printed before the error marks the bad row
    vals = []
    ignoreRow = False  # reset for every row
    for i in range(1, 18):
        if type(row[i]) != int:  # adjust the expected type to match each column
            ignoreRow = True
            break
        vals.append(str(row[i]))  # join() below needs strings
    if not ignoreRow:
        insertRows.append(f"({', '.join(vals)})")
cur.execute(f"INSERT INTO entries VALUES {', '.join(insertRows)}")
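For completeness: the original error most likely comes from executemany itself, which treats each element of the supplied list as a separate parameter set, so a 9-character string counts as 9 bindings. A minimal sketch of the parameterized route, assuming all 17 DataFrame columns line up with the table; it also avoids the quoting problems a string-built query runs into when a value contains a quote or comma:

# each tuple is one parameter set of 17 values; the DataFrame index is dropped
params = [tuple(row[1:]) for row in df.itertuples()]

# a single executemany call performs the whole batch insert
cur.executemany(
    "INSERT INTO entries VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    params,
)
conn.commit()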
CodePudding user response:
You can try this. Here you don't need to create the table yourself; just convert your DataFrame to SQL directly:
# import the required packages
import pymysql
from sqlalchemy import create_engine

# establish the connection; replace user_name, password, and database_name with your own details
engine = create_engine("mysql+pymysql://user_name:password@localhost/database_name")

# now convert your df to sql
df.to_sql('file_name_to_sql', con=engine, if_exists='append', index=False, chunksize=1000)

# check if done or not
engine.execute('select * from file_name_to_sql limit 5').fetchall()
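If you'd rather stay with SQLite instead of MySQL, to_sql also accepts a plain sqlite3 connection; a minimal sketch using the connection from the question, assuming the DataFrame column names match the entries table:

import sqlite3

conn = sqlite3.connect('music.db')

# append the DataFrame rows to the existing entries table;
# the df column names must match the table's columns
df.to_sql('entries', con=conn, if_exists='append', index=False)

# spot-check that the rows landed
print(conn.execute('SELECT * FROM entries LIMIT 5').fetchall())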