I have a pandas DataFrame that holds data extracted from a table. The table can be any table in the database, so it may have an unknown number of columns and data types. The database I am using is Redshift.
After loading the data into the pandas DataFrame, I need to check the numeric/decimal columns for NaN values. If any of those columns contain NaN, I need to change those values to 0.
col_list_nan = pdf.columns[pdf.isna().any()].tolist()
for i in col_list_nan:
    # this reassigns pdf to a single column (a Series), so the other columns are lost
    pdf = pdf[i].replace(np.nan, 0)
How can I replace the values and get back the complete DataFrame, with the other columns unchanged and the affected columns holding 0 instead of NaN?
CodePudding user response:
Suppose you had data:
df = pd.DataFrame({'a': [0.5, np.nan, 1], 'b': ['string', 'ok', np.nan]})
that looks like:
     a       b
0  0.5  string
1  NaN      ok
2  1.0     NaN
Then you can select the subset of float columns using select_dtypes and replace the NaN values in them with 0 using fillna:
s = df.select_dtypes(include=['float'])
df[s.columns] = s.fillna(0)
Output:
     a       b
0  0.5  string
1  0.0      ok
2  1.0     NaN
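Since the question mentions tables with an unknown mix of columns, the same idea can be wrapped in a small helper that returns the whole DataFrame. This is only a sketch: the function name fill_numeric_nan and the extra float32 column "c" are made up for illustration, and it assumes the numeric columns arrive with a NumPy numeric dtype. If the database driver hands back DECIMAL values as Python Decimal objects, pandas will likely store them as object dtype and this selection would skip them, so they would need converting first (for example with pd.to_numeric).

```python
import numpy as np
import pandas as pd


def fill_numeric_nan(frame: pd.DataFrame, value=0) -> pd.DataFrame:
    """Return a copy of `frame` with NaN replaced by `value` in numeric columns only."""
    out = frame.copy()
    # "number" matches integer and float columns of any width
    numeric_cols = out.select_dtypes(include="number").columns
    # assign back by column label so every other column stays untouched
    out[numeric_cols] = out[numeric_cols].fillna(value)
    return out


# hypothetical sample data: one float64, one string and one float32 column
df = pd.DataFrame({
    "a": [0.5, np.nan, 1],
    "b": ["string", "ok", np.nan],
    "c": np.array([1.5, np.nan, 2.5], dtype="float32"),
})

result = fill_numeric_nan(df)
print(result)
```

The helper works on a copy, so the original df keeps its NaN values; drop the copy() call if changing the DataFrame in place is acceptable.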