How to update values in a MySQL database column with data from a CSV file in Python?

Time:05-09

I have a Python script that stores information in a CSV file. The columns are Name, Time Left, Date, ID. I have a MySQL database with the following columns: Name, Time Entered, Time Left, Date, ID.

Time Left is initially set to NULL. I am using one Python script with pandas to populate the rest of the columns. In a different script, I want to update the Time Left column with the data in my CSV file wherever the IDs match (ID is the primary key).

I am unable to figure out how to do this. I have attached the code below for what I essentially want:

#loop through the data frame
        for i,row in empdata.iterrows():
            rowValue = row['ID']
            leaveTime = row['Time Left']
            sql = "UPDATE studentInfo SET `Time Left` = rowValue WHERE ID = leaveTime;"
            cursor.execute(sql)
            print("Record inserted")
            conn.commit()
        print("Completed!")
except Error as e:
    print("Error while connecting to MySQL", e)

The data in rowValue and leaveTime are in my CSV file and I want to use them to update my database table after iterating through the values.

Any advice will be really appreciated!

CodePudding user response:

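Most likely you need to format your string so that it actually uses the values of rowValue (row['ID']) and leaveTime (row['Time Left']). Note that the two names are also swapped in your query: leaveTime belongs in the SET clause and rowValue in the WHERE clause. The time value is quoted because it is a string literal in SQL.

sql = f"UPDATE studentInfo SET `Time Left` = '{leaveTime}' WHERE ID = {rowValue};"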

You should also describe how it isn't working: include the error message, or explain the logic error you are seeing.

CodePudding user response:

Consider iterating through csv.DictReader instead of pandas, and running executemany on a list of tuples. (Also, note that the row values must be flipped so that they map positionally to the %s placeholders: Time Left first, then ID.)

import csv
...

# PREPARED STATEMENT  WITH PARAM PLACEHOLDERS
sql = "UPDATE studentInfo SET `Time Left` = %s WHERE ID = %s;"

# READ CSV WITH HEADERS TO LIST OF TUPLES
with open("mydata.csv", "r", newline="") as f:
    dr = csv.DictReader(f)
    data = [(row['Time Left'], row['ID']) for row in dr]

# EXECUTE QUERY AND BIND PARAMS TO PLACEHOLDERS
cursor.executemany(sql, data)
conn.commit()
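
For what it's worth, here is a self-contained sketch of the same executemany pattern that can be run without a MySQL server. It assumes an in-memory sqlite3 database as a stand-in (so the placeholders are ? rather than %s and the column name is double-quoted rather than backticked). The sample CSV text, the build_params helper, and the table contents are all hypothetical and only there for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for the real CSV file.
CSV_TEXT = (
    "Name,Time Left,Date,ID\n"
    "Ann,2022-05-09 15:30:00,2022-05-09,1\n"
    "Bob,2022-05-09 16:00:00,2022-05-09,2\n"
)


def build_params(csv_file):
    """Return (time_left, id) tuples in the order the UPDATE placeholders expect."""
    reader = csv.DictReader(csv_file)
    return [(row["Time Left"], int(row["ID"])) for row in reader]


# sqlite3 is only a stand-in so the sketch runs anywhere;
# with a MySQL driver the placeholders would be %s instead of ?.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    'CREATE TABLE studentInfo (Name TEXT, "Time Left" TEXT, ID INTEGER PRIMARY KEY)'
)
# Three existing students whose "Time Left" is still NULL.
cursor.executemany(
    "INSERT INTO studentInfo (Name, ID) VALUES (?, ?)",
    [("Ann", 1), ("Bob", 2), ("Cy", 3)],
)

# Bind every CSV row to the placeholders in one call.
params = build_params(io.StringIO(CSV_TEXT))
cursor.executemany('UPDATE studentInfo SET "Time Left" = ? WHERE ID = ?', params)
conn.commit()

rows = cursor.execute(
    'SELECT ID, "Time Left" FROM studentInfo ORDER BY ID'
).fetchall()
print(rows)
```

Only the rows whose ID appears in the CSV are updated; the student with ID 3 keeps a NULL Time Left, which is the behavior asked for in the question.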

To avoid iterating over the CSV rows, especially if the file is very large (i.e., hundreds of thousands to millions of rows), consider using LOAD DATA to load the CSV into a temporary table (which persists only for the session) and then running a single UPDATE JOIN query without parameters. Note: Adjust the data types and LOAD DATA settings to match the actual data.

sql = """CREATE TEMPORARY TABLE temp_data (
             Id INTEGER,
             `Time Left` DATETIME
         )
      """
cursor.execute(sql)

# FIELDS AND LINES CLAUSES MUST COME BEFORE IGNORE n LINES
sql = """LOAD DATA INFILE '/path/to/test.txt'
         INTO TABLE temp_data
         FIELDS TERMINATED BY ','
         LINES TERMINATED BY '\r\n'
         IGNORE 1 LINES
      """
cursor.execute(sql)

# QUALIFY BOTH COLUMNS SO THE TARGET TAKES THE TEMP TABLE VALUE
sql = """UPDATE studentInfo s
         INNER JOIN temp_data t
            ON s.Id = t.Id
         SET s.`Time Left` = t.`Time Left`
      """
cursor.execute(sql)
conn.commit()