convert all objects present in timestamps otherwise ignore those columns

Time:06-30

Basically, I want to convert the object columns that hold timestamps into timestamps (UTC timestamps specifically) and leave untouched any object columns that cannot be converted.

time_tz  rpd  timestamp
36:00.0  -1   18-06-2020 09:36
46:44.0  6    19-06-2020 09:36
20:59.0  10   20-06-2020 09:36
57:27.0  0    21-06-2020 09:36
51:18.0  0    22-06-2020 09:36

import pandas as pd

data = {'timestamp_1': ['36:00.0', '46:44.0', '20:59.0', '57:27.0', '51:18.0'],
        'r1': [-1, 6, 10, 0, 0],
        'timestamp': ['18-06-2020 09:36:00', 
                      '19-06-2020  09:36:00',
                      '20-06-2020 09:36:00',
                      '21-06-2020 09:36:00',
                      '22-06-2020 09:36:00']
       }

data = pd.DataFrame(data = data)
data

CodePudding user response:

You can use a small helper function inside .apply():

import pandas as pd

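def try_timestamp(x):
    """Return x parsed as a Timestamp, or x unchanged if it cannot be parsed."""
    try:
        # the sample dates are day-month-year, so parse them day first
        return pd.to_datetime(x, dayfirst=True)
    except (ValueError, TypeError):
        return x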

data = {'timestamp_1': ['36:00.0', '46:44.0', '20:59.0', '57:27.0', '51:18.0'],
        'r1': [-1, 6, 10, 0, 0],
        'timestamp': ['18-06-2020 09:36:00', 
                      '19-06-2020  09:36:00',
                      '20-06-2020 09:36:00',
                      '21-06-2020 09:36:00',
                      '22-06-2020 09:36:00']
       }

data = pd.DataFrame(data = data)
data["timestamp"] = data["timestamp"].apply(try_timestamp)
data
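The helper above works one value at a time on a single, hand-picked column. A column-level variant is sketched below: it tries every non-numeric column and keeps the original column whenever parsing fails. The function name to_utc_or_keep and the fixed format string are assumptions made for this illustration, and utc=True is used so that the naive strings are interpreted as UTC.

```python
import pandas as pd


def to_utc_or_keep(column: pd.Series, fmt: str = "%d-%m-%Y %H:%M:%S") -> pd.Series:
    """Return the column parsed as UTC timestamps, or unchanged if parsing fails."""
    if pd.api.types.is_numeric_dtype(column):
        return column  # numeric columns are never treated as timestamps here
    try:
        # utc=True localizes timezone-naive strings as UTC
        return pd.to_datetime(column, format=fmt, utc=True)
    except (ValueError, TypeError):
        return column  # at least one value did not match fmt: keep as is


data = pd.DataFrame({
    "timestamp_1": ["36:00.0", "46:44.0", "20:59.0", "57:27.0", "51:18.0"],
    "r1": [-1, 6, 10, 0, 0],
    "timestamp": ["18-06-2020 09:36:00", "19-06-2020 09:36:00",
                  "20-06-2020 09:36:00", "21-06-2020 09:36:00",
                  "22-06-2020 09:36:00"],
})

converted = data.copy()
for col in converted.columns:
    converted[col] = to_utc_or_keep(converted[col])

print(converted.dtypes)
```

Only the timestamp column should end up with a timezone-aware datetime dtype; the other two columns are returned unchanged.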

CodePudding user response:

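If you want to try converting every column and leave the ones that do not match the timestamp format unchanged:

for col in df:
    # errors='ignore' is deprecated in recent pandas, so catch the failure explicitly
    try:
        # add utc=True here to get timezone-aware UTC values instead of naive ones
        df[col] = pd.to_datetime(df[col], format='%d-%m-%Y %H:%M')
    except (ValueError, TypeError):
        pass  # not a timestamp column: leave it as it is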

print(df)

Output:

   time_tz rpd           timestamp
0  36:00.0  -1 2020-06-18 09:36:00
1  46:44.0   6 2020-06-19 09:36:00
2  20:59.0  10 2020-06-20 09:36:00
3  57:27.0   0 2020-06-21 09:36:00
4  51:18.0   0 2020-06-22 09:36:00

It is probably better to convert only the specific column:

df['timestamp'] = pd.to_datetime(df['timestamp'], dayfirst=True)
# Or, if some values may be invalid: pd.to_datetime(df['timestamp'], dayfirst=True, errors='coerce')
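To illustrate what errors='coerce' does, here is a minimal sketch on a made-up column that contains one invalid value (the "not a date" entry is invented for the example). Invalid entries become NaT rather than raising, and utc=True makes the result timezone-aware by treating the naive strings as UTC.

```python
import pandas as pd

# Hypothetical column with one value that is not a timestamp
df = pd.DataFrame({"timestamp": ["18-06-2020 09:36", "not a date", "20-06-2020 09:36"]})

parsed = pd.to_datetime(
    df["timestamp"],
    format="%d-%m-%Y %H:%M",  # day-month-year, as in the question's table
    errors="coerce",          # unparseable values become NaT
    utc=True,                 # naive strings are localized as UTC
)

print(parsed)
print("unparseable rows:", int(parsed.isna().sum()))
```

Counting the NaT values afterwards is a simple way to see how many rows were silently dropped by the coercion.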

CodePudding user response:

I would do it as follows:

for col in data.columns:
    # only object (string) columns are candidates for conversion
    if data[col].dtype == 'object':
        try:
            # alternative: data[col] = pd.to_datetime(data[col])
            data[col] = data[col].apply(pd.Timestamp)
        except (ValueError, TypeError):
            pass  # not convertible: leave the column as it is

Add more conditions if needed.
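One possible extra condition, sketched here as an assumption rather than as part of the answer above, is to convert a column only when a minimum share of its values matches a known format. The helper name looks_like_timestamps and the 0.9 threshold are hypothetical.

```python
import pandas as pd


def looks_like_timestamps(column: pd.Series, fmt: str, min_fraction: float = 0.9) -> bool:
    """Hypothetical check: True if at least min_fraction of non-null values match fmt."""
    values = column.dropna()
    if values.empty or pd.api.types.is_numeric_dtype(values):
        return False
    parsed = pd.to_datetime(values, format=fmt, errors="coerce")
    return bool(parsed.notna().mean() >= min_fraction)


data = pd.DataFrame({
    "timestamp_1": ["36:00.0", "46:44.0", "20:59.0", "57:27.0", "51:18.0"],
    "r1": [-1, 6, 10, 0, 0],
    "timestamp": ["18-06-2020 09:36:00", "19-06-2020 09:36:00",
                  "20-06-2020 09:36:00", "21-06-2020 09:36:00",
                  "22-06-2020 09:36:00"],
})

fmt = "%d-%m-%Y %H:%M:%S"
for col in data.columns:
    if looks_like_timestamps(data[col], fmt):
        # values that still fail to parse become NaT
        data[col] = pd.to_datetime(data[col], format=fmt, errors="coerce", utc=True)

print(data.dtypes)
```

A threshold below 1.0 tolerates a few dirty values in an otherwise valid timestamp column, at the cost of turning those values into NaT.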
