How to handle NaN values when writing to a database that expects something other than float?

I'm working on a scraping project with Postgres. I've set up all the tables by hand with the typical data types: datetime, integer, text, and float.

But I've run into an issue. Since I scrape many fields, some are always missing and end up as NaN. And since NaN is of type float, I get an error when I try to push that float value into a column that expects, for instance, a datetime or text value.

At least I believe so. The error message isn't very clear.

I'm basically using Pandas only (df.to_sql) after creating the tables with SQLAlchemy's declarative_base.

My process is like:

obj = Scrape() # Scrape() includes all relevant methods
obj.get_data()
obj.create_a_df()
obj.df.to_sql('table', con=engine, [...])

Whenever a value is missing from the current scrape (obj.get_data()), it becomes NaN. And since NaN has type float, I can't insert it into any column that expects something other than float.

Error:
[SQL: INSERT INTO company_statistics (my column names) VALUES (my df columns)]
[parameters: {my data as a dictionary}]
(Background on this error at: https://sqlalche.me/e/14/9h9h)

I can't diagnose it very well, but I suspect it's the reason described above. Is there a way around this? Does it make sense to use df.fillna(None), or is there a keyword argument I'm not aware of?
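For reference, df.fillna(None) doesn't work as hoped: pandas treats fillna(None) as "no fill value supplied" and raises an error. What the database needs is None (written as SQL NULL by to_sql), not NaN. A minimal sketch, with made-up column names, of converting NaN to None before writing:

```python
import numpy as np
import pandas as pd

# Hypothetical scraped data: the second row is missing both fields
df = pd.DataFrame({"name": ["Acme", np.nan], "founded": [1999.0, np.nan]})

# Cast to object first so None survives (in a float column pandas would
# silently turn None back into NaN), then keep values only where not NaN
clean = df.astype(object).where(df.notna(), None)

# clean.to_sql(...) now inserts NULL for the missing fields
```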

CodePudding user response:

You could also consider obj.df.dropna(inplace=True). That will simply drop every row that contains NaN in any column.
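A quick sketch of what that does (column names are made up). Note that it discards whole rows, including fields that were scraped successfully:

```python
import numpy as np
import pandas as pd

# Second row has a valid price but a missing name
df = pd.DataFrame({"name": ["Acme", np.nan], "price": [9.99, 12.50]})

df.dropna(inplace=True)  # the entire second row is removed
```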

CodePudding user response:

Wouldn't getting rid of the NaNs fix this? Using .fillna() would replace all of the NaN values with whatever value you pass in. You could put this within your scraping method.
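One caveat: a single fill value (e.g. fillna("")) would put strings into numeric columns and trip the same type mismatch. Passing a dict of per-column defaults keeps each column compatible with its table column (column names and defaults below are hypothetical):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"name": ["Acme", np.nan], "price": [9.99, np.nan]})

# Fill each column with a default of the type the table expects
df = df.fillna({"name": "unknown", "price": 0.0})
```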
