I have the following (subset of a) Pandas dataframe that I am trying to upload to a MySQL database:
Here are details of the data using the info() function:
For added context, I previously converted both the tourn_date
and date
column to a datetime format using this code:
pd.to_datetime(all_data['tourn_date']).dt.date
Finally, here is the code that I'm using to import the data into the MySQL database:
for index, row in all_data.iterrows():
inserts = [row.tourn_id, row.year, row.round_num, row.tourn_date, row.date]
cursor.execute("""INSERT INTO AllPlayerStats
(TourneyID, Year, RoundNum, TourneyDate, TDate)
values(%s,%s,%s,%s,%s)""", inserts)
db.commit()
cursor.close()
However, when I run this, some of the dates (like the ones shown above), are appearing as NULL in the database (see below) despite other tournaments working just fine and the format of the TourneyDate
and Date
columns in the database being of date type. In some instances, the tourn_date
uploads correctly but the date
doesn't.
Is there a way to troubleshoot this and understand why this is occurring? I'm very confused as to what's going on here.
CodePudding user response:
Python is claiming the dates are datatype Object, you will need them to be Cast as a Date type first. Information about how to Cast as a Date found here:
https://www.w3schools.com/sql/func_mysql_cast.asp
Also, make sure that the MySQL column you are inserting into has the correct data type format as well. Most problems that make no sense are usually data type issues, all too easy to compare x = 1 to y = "1" and not understand why x is NOT = to y for example.
UPDATE - a second answer appeared, the concepts are the same, I guess the difference is if you want to do it with Python or with MySQL.
CodePudding user response:
Consider either converting your date values to StringDtype
(not object
) or to datetime64
without the dt.date
attribute. MySQL may map Python datetimes to its date
type (i.e., minus time component). Consider also executemany
from DataFrame.to_numpy()
.
Do note: astype("string")
renders the StringDtype
and is different than astype(str)
or astype("str")
which render object
type.
sql = """INSERT INTO AllPlayerStats (TourneyID, Year, RoundNum, TourneyDate, TDate)
VALUES (%s, %s, %s, %s, %s)"""
# StringDtype
all_data["tourn_date"], all_data["date"] = (
all_data["tourn_date"].astype("string"),
all_data["date"].astype("string")
)
# datetime
all_data["tourn_date"], all_data["date"] = (
pd.to_datetime(all_data["tourn_date"]),
pd.to_datetime(all_data["date"])
)
vals = all_data[["tourn_id", "year", "round_num", "tourn_date", "date"]].to_numpy()
cursor.executemany(sql, vals)
db.commit()
cursor.close()