Home > Back-end >  Dates not importing to SQL database correctly from Python
Dates not importing to SQL database correctly from Python

Time:05-21

I have the following (subset of a) Pandas dataframe that I am trying to upload to a MySQL database: enter image description here

Here are details of the data using the info() function:

enter image description here

For added context, I previously converted both the tourn_date and date column to a datetime format using this code:

pd.to_datetime(all_data['tourn_date']).dt.date

Finally, here is the code that I'm using to import the data into the MySQL database:

for index, row in all_data.iterrows():
    inserts = [row.tourn_id, row.year, row.round_num, row.tourn_date, row.date]

    cursor.execute("""INSERT INTO AllPlayerStats 
                    (TourneyID, Year, RoundNum, TourneyDate, TDate)
                    values(%s,%s,%s,%s,%s)""", inserts)
db.commit()
cursor.close()

However, when I run this, some of the dates (like the ones shown above), are appearing as NULL in the database (see below) despite other tournaments working just fine and the format of the TourneyDate and Date columns in the database being of date type. In some instances, the tourn_date uploads correctly but the date doesn't.

enter image description here

Is there a way to troubleshoot this and understand why this is occurring? I'm very confused as to what's going on here.

CodePudding user response:

Python is claiming the dates are datatype Object, you will need them to be Cast as a Date type first. Information about how to Cast as a Date found here:

https://www.w3schools.com/sql/func_mysql_cast.asp

Also, make sure that the MySQL column you are inserting into has the correct data type format as well. Most problems that make no sense are usually data type issues, all too easy to compare x = 1 to y = "1" and not understand why x is NOT = to y for example.

UPDATE - a second answer appeared, the concepts are the same, I guess the difference is if you want to do it with Python or with MySQL.

CodePudding user response:

Consider either converting your date values to StringDtype (not object) or to datetime64 without the dt.date attribute. MySQL may map Python datetimes to its date type (i.e., minus time component). Consider also executemany from DataFrame.to_numpy().

Do note: astype("string") renders the StringDtype and is different than astype(str) or astype("str") which render object type.

sql = """INSERT INTO AllPlayerStats (TourneyID, Year, RoundNum, TourneyDate, TDate)
         VALUES (%s, %s, %s, %s, %s)"""

# StringDtype
all_data["tourn_date"], all_data["date"] = (
    all_data["tourn_date"].astype("string"), 
    all_data["date"].astype("string")
)

# datetime
all_data["tourn_date"], all_data["date"] = (
    pd.to_datetime(all_data["tourn_date"]), 
    pd.to_datetime(all_data["date"])
)

vals = all_data[["tourn_id", "year", "round_num", "tourn_date", "date"]].to_numpy()

cursor.executemany(sql, vals) 
db.commit() 
cursor.close()
  • Related