DataFrame column type changes after filling blank cell with user input value in python

Time:03-04

I have a large Excel file loaded into Spyder. As an example, here is a simplified version:

           Date      Name   Project    Age    Pin_code     Remarks    Gender
   0    2020-01-01      a     proj_a    34     123456      grade_a      M
   1    2019-12-04      b     proj_b    48     789012                 
   2                    c               54                              M

I need to fill the blank cells with a value that depends on the column type. String columns should get a user-supplied value (for example, if the user types 'no_entry', every blank cell in every string column becomes 'no_entry'), numeric columns should get 0, and datetime columns should get 0000-00-00. The problems I am facing after filling are: the dates, which originally appeared as 2020-01-01, now appear as 2020-01-01 00:00:00, and the blank date cells get just 0; and the Pin_code column becomes float, e.g. 789012.0. How can I avoid these problems? Please help.

My code:

import pandas as pd
import numpy as np
from pandas.api.types import is_string_dtype
from pandas.api.types import is_numeric_dtype
from pandas.api.types import is_integer_dtype  # used in the loop below
from pandas.api.types import is_datetime64_dtype

ip = input('Please enter a value for blank cells : ')
col = df.columns
for c in col:
    if is_string_dtype(df[c]) == True:
        df[c].fillna(ip, inplace = True)
    if is_integer_dtype(df[c]) == True :
        df[c].fillna(0, inplace = True)
    if is_datetime64_dtype(df[c]) == True:  
        df[c].fillna(0000-00-00, inplace = True)

CodePudding user response:

0000-00-00 is not a valid pandas datetime, so if you need that value in the column you have to convert the dates to strings first. To turn the numeric columns back into integers, call astype(int) after filling:
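To illustrate why the original fillna call put a plain 0 into the blank date cells, here is a minimal sketch. The variable names `unquoted` and `parsed` are made up for this example; the only assumption is that pandas is installed.

```python
import pandas as pd

# Without quotes, Python reads 0000-00-00 as integer subtraction: 0 - 0 - 0.
unquoted = 0000-00-00

# The quoted text is not a date pandas can parse (there is no month 00),
# so errors="coerce" turns the failure into NaT instead of raising.
parsed = pd.to_datetime("0000-00-00", errors="coerce")

print(unquoted)
print(pd.isna(parsed))
```

Since the placeholder cannot live in a datetime column, the loop that follows formats the dates as text before filling.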

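from pandas.api.types import is_string_dtype, is_numeric_dtype, is_datetime64_dtype

# df is the DataFrame already loaded from the Excel file
ip = input('Please enter a value for blank cells : ')

for c in df.columns:
    if is_string_dtype(df[c]):
        # assign the result back instead of using inplace=True on a column selection
        df[c] = df[c].fillna(ip)
    elif is_numeric_dtype(df[c]):
        # fill first, then cast, so 789012.0 becomes 789012
        df[c] = df[c].fillna(0).astype(int)
    elif is_datetime64_dtype(df[c]):
        # format as text so the placeholder date can be stored
        df[c] = df[c].dt.strftime('%Y-%m-%d').fillna('0000-00-00')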
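For anyone who wants to try this without the Excel file, here is a self-contained sketch that rebuilds the sample table from the question and applies the same idea. The helper name `fill_blanks` and the variables `sample` and `filled` are made up for this example. It checks numeric and datetime columns first and treats everything else as text, which avoids depending on how is_string_dtype classifies object columns that contain missing values (this may differ between pandas versions).

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_numeric_dtype


def fill_blanks(df, text_fill):
    """Return a copy of df with blanks filled according to each column's type."""
    out = df.copy()
    for c in out.columns:
        if is_numeric_dtype(out[c]):
            # A missing value forces an integer column to float64,
            # so fill first and then cast back to integers.
            out[c] = out[c].fillna(0).astype(int)
        elif is_datetime64_any_dtype(out[c]):
            # Format as text so the placeholder can be stored;
            # missing dates stay missing after strftime and are filled next.
            out[c] = out[c].dt.strftime("%Y-%m-%d").fillna("0000-00-00")
        else:
            # Everything else is treated as text.
            out[c] = out[c].fillna(text_fill)
    return out


# Sample data modelled on the table in the question.
sample = pd.DataFrame({
    "Date": pd.to_datetime(["2020-01-01", "2019-12-04", None]),
    "Name": ["a", "b", "c"],
    "Project": ["proj_a", "proj_b", None],
    "Age": [34, 48, 54],
    "Pin_code": [123456, 789012, np.nan],
    "Remarks": ["grade_a", None, None],
    "Gender": ["M", None, "M"],
})

filled = fill_blanks(sample, "no_entry")
print(filled)
```

Two caveats: astype(int) would truncate genuine decimals, so it is only safe for columns known to hold whole numbers, and once the dates are stored as text the Date column can no longer be used for date arithmetic.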