I have a dataframe where it was added date and datetime information to a column where it was expected a string. What would be the best way to filter all dates and date values from a pandas dataframe column and replace those values to blank?
Thank you!
CodePudding user response:
In general, if you provided a minimum working example of your problem, one could help more specifically, but assuming you have the following column:
df = pd.DataFrame(np.zeros(shape=(10,1)), columns = ["Mixed"])
df["Mixed"] = "foobar"
df.loc[2,"Mixed"] = pd.to_datetime("2022-08-22")
df.loc[7,"Mixed"] = pd.to_datetime("2022-08-21")
#print("Before Fix", df)
You can use apply(type)
on the column to obtain the data-types of each cell and then use list comprehension [x!=str for x in types]
to check for each cells datatype if its a string or not. After that, just replace those values that are not the desired datatype with a value of your choosing.
types = df["Mixed"].apply(type).values
mask = [x!=str for x in types]
df.loc[mask,"Mixed"] = "" #Or None, or whatever you want to overwrite it with
#print("After Fix", df)