I have 20 million rows of data in maybe 40 sqlite3 databases. they are all the same: unix time stamp and several columns of numbers. the numbers are all type [float]. (the time stamp is [str]) The tables were made by me and defined as REAL. Sometimes I have trouble and I think it is related to the data type. I do not seem to be able to change it.
I ask for the type right out of the database and it gives me [float]
Sometimes I have no problems. But this little error code is pretty representative.
for row in c.fetchmany(50):
new_var = int(row[0])
print ("test", type(new_var))
error is
Traceback (most recent call last):
File "C:/Users/mrphy/Desktop/mu only/datatypes 01.py", line 45, in <module>
new_var = int(row[0])
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
I calculate averages etc at time intervals. I think it freezes when it can find no numbers. But a filter does not seem to work.
CodePudding user response:
Use pandas for reading database instead.
import pandas as pd
df = pd.read_sql_query("your qurey here", con) #con is your db connection
Then replace all null values(no numbers) and perform similar pandas operation and then takes avg of that column like that
df['column name'] = df['column name'].fillna(0)
avg = df["column name"].mean()
But loading 20 million records in pandas may takes some time to execute.
So next I can suggest is use try except which will not break your code where it gets no numbers.
for row in c.fetchmany(50):
try:
new_var = int(row[0])
print ("test", type(new_var))
except:
pass
CodePudding user response:
I moved forward with try; except; pass. I know or suspect this is sloppy programming. I think most of us are focused on programming as a tool, I am moving forward. https://github.com/mrphysh?tab=repositories