Basically, I want to convert the object columns that hold timestamps into timestamps (UTC timestamps specifically) and leave alone any object columns that are not convertible to timestamps.
time_tz | rpd | timestamp |
---|---|---|
36:00.0 | -1 | 18-06-2020 09:36 |
46:44.0 | 6 | 19-06-2020 09:36 |
20:59.0 | 10 | 20-06-2020 09:36 |
57:27.0 | 0 | 21-06-2020 09:36 |
51:18.0 | 0 | 22-06-2020 09:36 |
import pandas as pd

data = {'timestamp_1': ['36:00.0', '46:44.0', '20:59.0', '57:27.0', '51:18.0'],
        'r1': [-1, 6, 10, 0, 0],
        'timestamp': ['18-06-2020 09:36:00',
                      '19-06-2020 09:36:00',
                      '20-06-2020 09:36:00',
                      '21-06-2020 09:36:00',
                      '22-06-2020 09:36:00']
        }
data = pd.DataFrame(data=data)
data
CodePudding user response:
You can use a helper function inside .apply():
import pandas as pd

def try_timestamp(x):
    # Return a Timestamp if x can be parsed, otherwise return x unchanged
    try:
        out = pd.to_datetime(x)
    except (ValueError, TypeError):
        out = x
    return out

data = {'timestamp_1': ['36:00.0', '46:44.0', '20:59.0', '57:27.0', '51:18.0'],
        'r1': [-1, 6, 10, 0, 0],
        'timestamp': ['18-06-2020 09:36:00',
                      '19-06-2020 09:36:00',
                      '20-06-2020 09:36:00',
                      '21-06-2020 09:36:00',
                      '22-06-2020 09:36:00']
        }
data = pd.DataFrame(data=data)
data["timestamp"] = data["timestamp"].apply(try_timestamp)
data
CodePudding user response:
If you want to explicitly try to convert all columns as if they were UTC timestamps:
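for col in df:
    # errors='ignore' returns the column unchanged when parsing fails
    # (deprecated since pandas 2.2); add utc=True for timezone-aware UTC values
    df[col] = pd.to_datetime(df[col], format='%d-%m-%Y %H:%M', errors='ignore')
print(df)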
Output:
   time_tz  rpd           timestamp
0  36:00.0   -1 2020-06-18 09:36:00
1  46:44.0    6 2020-06-19 09:36:00
2  20:59.0   10 2020-06-20 09:36:00
3  57:27.0    0 2020-06-21 09:36:00
4  51:18.0    0 2020-06-22 09:36:00
It would probably be better to convert only the specific column:
df.timestamp = pd.to_datetime(df.timestamp)
# Or, if some values may be invalid: pd.to_datetime(df.timestamp, errors='coerce')
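Since the question asks for UTC timestamps specifically, here is a minimal sketch of the single-column conversion with utc=True and an explicit day-first format. The sample frame and the "not a date" value are made up for illustration, and the format string assumes the values look like the table in the question (no seconds).

```python
import pandas as pd

# Hypothetical sample modelled on the question's table; the last value is deliberately invalid
df = pd.DataFrame({
    "time_tz": ["36:00.0", "46:44.0", "20:59.0"],
    "rpd": [-1, 6, 10],
    "timestamp": ["18-06-2020 09:36", "19-06-2020 09:36", "not a date"],
})

# errors="coerce" turns unparseable values into NaT instead of raising;
# utc=True makes the result timezone-aware (naive inputs are treated as UTC)
df["timestamp"] = pd.to_datetime(
    df["timestamp"], format="%d-%m-%Y %H:%M", errors="coerce", utc=True
)
print(df.dtypes)
```

Spelling out the format avoids any day/month guessing for values such as 18-06-2020, and the other columns are never touched.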
CodePudding user response:
I would do as follows:
df_types = pd.DataFrame(data.dtypes).T
cols = list(data.columns)
for i in range(len(cols)):
    # Only try object (string) columns
    if df_types[cols[i]].iloc[0] == 'object':
        try:
            print(cols[i])
            # data[cols[i]] = pd.to_datetime(data[cols[i]])
            data[cols[i]] = data[cols[i]].apply(lambda x: pd.Timestamp(x))
        except (ValueError, TypeError):
            # Column is not convertible, leave it as it is
            pass

Add more conditions if needed.
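A variation on the same idea, as a rough sketch rather than a definitive solution: parse each string column with errors="coerce" and keep the result only if every value parsed. The function name convert_timestamp_columns and the fmt default are my own choices, and the format assumes the values carry seconds, as in the question's DataFrame.

```python
import pandas as pd
from pandas.api.types import is_string_dtype

def convert_timestamp_columns(df, fmt="%d-%m-%Y %H:%M:%S"):
    """Return a copy where string columns that fully match fmt become UTC datetimes."""
    out = df.copy()
    for col in out.columns:
        if not is_string_dtype(out[col]):
            continue  # skip numeric (and already converted) columns
        # Values that do not match fmt become NaT instead of raising
        parsed = pd.to_datetime(out[col], format=fmt, errors="coerce", utc=True)
        if parsed.notna().all():
            out[col] = parsed  # keep the conversion only if every value parsed
    return out

data = pd.DataFrame({
    "timestamp_1": ["36:00.0", "46:44.0", "20:59.0", "57:27.0", "51:18.0"],
    "r1": [-1, 6, 10, 0, 0],
    "timestamp": ["18-06-2020 09:36:00", "19-06-2020 09:36:00",
                  "20-06-2020 09:36:00", "21-06-2020 09:36:00",
                  "22-06-2020 09:36:00"],
})
result = convert_timestamp_columns(data)
print(result.dtypes)
```

Note that a column with missing values is left unconverted by the all-values check; relax that condition if partially valid columns should still be converted.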