I'm trying to write a function that identifies whether a column with an object dtype should actually be reclassified as a datetime dtype: if everything in the column looks like a datetime value, convert the column to datetime. I'd like to extend what I currently have so that a column is converted when every value is either (a) empty or (b) a datetime-looking value.
I have this piece of code, which works correctly: it flags a column for conversion to datetime if all of its values have a datetime-looking format.
mask = df.astype(str).apply(lambda x : x.str.match('(\d{2,4}(-|\/|\\|\.| )\d{2}(-|\/|\\|\.| )\d{2,4}) ')).all()
I'd like to expand this so that a value passes if it is either null (which becomes the string "None" after astype(str)) OR a datetime-looking value.
I'm trying to get the line below to work, but I'm not sure how to tweak the lambda so that it checks whether all of a column's values are either empty OR datetime-looking:
mask = df.astype(str).apply(lambda x : (x.astype(str) != "" or x.str.match('(\d{2,4}(-|\/|\\|\.| )\d{2}(-|\/|\\|\.| )\d{2,4}) ')).all())
Any help would be appreciated!
CodePudding user response:
You can try:
mask = df.astype(str).apply(lambda x: pd.to_datetime(x, errors='coerce')).notna().all()
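One thing to watch for: after `astype(str)`, missing values become strings such as "None" or "nan", which `errors='coerce'` turns into `NaT`, so a column containing nulls would still fail the `notna().all()` check. Also, `or` between two Series raises a "truth value of a Series is ambiguous" error; the element-wise operator is `|`. Below is a minimal sketch of one way to accept "null OR parseable as datetime", assuming the nulls are real `None`/`NaN` values in object columns. The sample frame and the helper name `datetime_like_columns` are made up for illustration, and this swaps the regex for `pd.to_datetime` parsing, so it may accept formats the original pattern would not.

```python
import pandas as pd

# Hypothetical sample data (not from the question); dtype=object mirrors
# the "object datatype" columns the question describes.
df = pd.DataFrame(
    {
        "all_dates": ["2021-01-05 10:00:00", "2021-02-10 11:30:00", "2021-03-15 09:15:00"],
        "dates_with_gaps": ["2021-01-05 10:00:00", None, "2021-03-15 09:15:00"],
        "not_dates": ["2021-03-15 09:15:00", "apple", None],
    },
    dtype=object,
)


def datetime_like_columns(frame):
    """Return a boolean Series: True for object columns whose values are
    all either null or parseable as a datetime."""
    obj = frame.select_dtypes(include="object")
    # Null check happens BEFORE any string conversion, so None/NaN stay null.
    is_null = obj.isna()
    # Values that cannot be parsed become NaT instead of raising.
    parsed = obj.apply(lambda col: pd.to_datetime(col, errors="coerce"))
    # Element-wise OR (|), not the Python keyword `or`.
    return (is_null | parsed.notna()).all()


mask = datetime_like_columns(df)

# Convert only the columns that passed the check.
for col in mask[mask].index:
    df[col] = pd.to_datetime(df[col], errors="coerce")
```

Note that a column consisting only of nulls also passes this check; if that is not wanted, the result can be combined with something like `obj.notna().any()`.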