I have a dataframe with 1,200,000 rows x 96 columns. The values are numbers, except for a few whose type is date and time.
The question is:
I'd like to remove any row that contains a value of type datetime.datetime
and convert the rest to float where the value is a number but its type is string.
CodePudding user response:
This should get you the results you requested:
import numpy as np
import pandas as pd
df = pd.DataFrame({
'Column1' : [123213123, '2022-01-01', '0111'],
'Column2' : ['2022-01-01', 111, '21398021']
})
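# Process each column in turn
for col in df.columns:
    # Work on string values so that .str.contains can be used
    df[col] = df[col].astype(str)
    # Replace anything containing '-' with NaN (date strings, but also negative numbers)
    df[col] = np.where(df[col].str.contains('-'), np.nan, df[col])
    # Convert the remaining numeric strings to float
    df[col] = df[col].astype(float)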
df
If you want to delete every row that had a date string in it, simply replace the last line of the previous code with df = df.dropna()
That will remove all rows that contain an np.nan introduced in the earlier step.
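If the date cells are actual datetime.datetime objects rather than strings, as the question suggests, a type check may be a better fit than looking for '-'. Here is a minimal sketch under that assumption. The sample data and the names is_datetime and cleaned are made up for illustration. It drops every row with a datetime cell first, then converts what is left with pd.to_numeric, so negative numbers such as '-5' are kept.

```python
import datetime

import pandas as pd

# Hypothetical sample data: numbers, numeric strings, and one real datetime object.
df = pd.DataFrame({
    "Column1": [123213123, datetime.datetime(2022, 1, 1), "0111", "-5"],
    "Column2": ["2.5", 111, "21398021", 7],
})

# True for every cell that holds a datetime.datetime (pd.Timestamp is a subclass of it).
is_datetime = df.apply(
    lambda col: col.map(lambda v: isinstance(v, datetime.datetime))
).astype(bool)

# Keep only the rows that have no datetime cell.
cleaned = df[~is_datetime.any(axis=1)]

# Convert every column to float; values that cannot be parsed become NaN.
cleaned = cleaned.apply(pd.to_numeric, errors="coerce").astype(float)

print(cleaned)
```

With errors="coerce", any leftover non-numeric string ends up as NaN instead of raising an error, so cleaned.dropna() can still be used afterwards if those rows should go as well.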