how to change datetime format column that contains a string

Time:07-23

I have a data frame with a column (Start Shift) that holds mixed data types (datetime and string). I need to convert the datetime values to a time-only format and leave the strings unchanged. I tried the code below, but I can't find a way to apply the change to the data frame: when I load the data frame afterwards, nothing has changed.

The code that I used:

df=pd.read_excel(r"C:\Users\Mahmoud.Bader\Desktop\FP Attendance V1.6 Apr 22.xlsx","Attendance").fillna("")

for i in df['Start Shift']:
    try:
        if i.isalpha():
            i
    except:
            i.strftime('%H:%M %p')

The data frame is:

Department  Start Shift
Accommodation   Annual
Accommodation   OFF Day
Accommodation   2022-04-01 12:00:00
Accommodation   2022-04-01 09:00:00
Accommodation   2022-04-01 10:00:00
Complaints  OFF Day
Complaints  2022-04-29 07:00:00
Complaints  2022-04-29 08:00:00
Complaints  2022-04-30 07:00:00
Complaints  2022-04-30 08:00:00


The data frame that I expected to get:

Department  Start Shift
Accommodation   Annual
Accommodation   OFF Day
Accommodation   12:00 PM
Accommodation   09:00 AM
Accommodation   10:00 AM
Complaints  OFF Day
Complaints  07:00 AM
Complaints  08:00 AM
Complaints  07:00 AM
Complaints  08:00 AM

CodePudding user response:

You can try to cast the dates into datetime first:

# cast column 'Start Shift' to datetime
df['Start Shift Formatted'] = pd.to_datetime(df['Start Shift'], format='%Y-%m-%d %H:%M:%S', errors='coerce')
# convert into requested time format
df['Start Shift Formatted'] = df['Start Shift Formatted'].dt.strftime('%I:%M %p')
# overwrite old time format
df['Start Shift'] = df['Start Shift'].mask(df['Start Shift Formatted'].notna(), df['Start Shift Formatted'])
# drop column 'Start Shift Formatted' as it is no longer needed
df.drop('Start Shift Formatted', axis=1, inplace=True)
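For anyone who wants to try this without the Excel file, here is a minimal self-contained sketch of the same idea. The sample rows are hypothetical (modelled on the table in the question), and it keeps the intermediate results in Series instead of a helper column, which is just a stylistic choice on my part. It assumes the datetimes are stored as strings in the `%Y-%m-%d %H:%M:%S` layout.

```python
import pandas as pd

# Hypothetical sample data modelled on the question's table
df = pd.DataFrame({
    "Department": ["Accommodation", "Accommodation", "Accommodation", "Complaints"],
    "Start Shift": ["Annual", "OFF Day", "2022-04-01 12:00:00", "2022-04-29 07:00:00"],
})

# Rows that do not match the format become NaT instead of raising
parsed = pd.to_datetime(df["Start Shift"], format="%Y-%m-%d %H:%M:%S", errors="coerce")

# NaT rows come out of strftime as missing values
formatted = parsed.dt.strftime("%I:%M %p")

# Replace only the rows that parsed; labels such as "Annual" are kept
df["Start Shift"] = df["Start Shift"].mask(formatted.notna(), formatted)

print(df.to_string(index=False))
```

The key point is the last assignment: the result has to be written back to `df['Start Shift']`, which is the step the loop in the question never does.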

Update: This can be applied to multiple columns using the following function:

def format_time(x):
    x_temp = pd.to_datetime(x, format='%Y-%m-%d %H:%M:%S', errors='coerce')
    x_temp = x_temp.dt.strftime('%I:%M %p')
    x = x.mask(x_temp.notna(), x_temp)
    
    return x

df.loc[:, ['Start Shift','End Shift','Login Time','Logout time']] = df[
    ['Start Shift','End Shift','Login Time','Logout time']
].apply(format_time)
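A small runnable sketch of the multi-column version, in case it helps to see it end to end. The `shifts` frame and its values are made up for illustration, and I use plain `shifts[cols] = ...` for the assignment here, which should behave the same as the `df.loc` form above for this purpose.

```python
import pandas as pd

def format_time(x):
    # x is one column (a Series); non-matching rows become NaT
    parsed = pd.to_datetime(x, format="%Y-%m-%d %H:%M:%S", errors="coerce")
    formatted = parsed.dt.strftime("%I:%M %p")
    # keep the original value wherever parsing failed
    return x.mask(formatted.notna(), formatted)

# Hypothetical two-column example
shifts = pd.DataFrame({
    "Start Shift": ["Annual", "2022-04-01 09:00:00"],
    "End Shift": ["Annual", "2022-04-01 17:30:00"],
})

cols = ["Start Shift", "End Shift"]
# DataFrame.apply calls format_time once per column
shifts[cols] = shifts[cols].apply(format_time)

print(shifts.to_string(index=False))
```

The afternoon value is there to show that `%I` gives a 12-hour clock, so 17:30 is rendered with hour 05.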

CodePudding user response:

You need to apply the changes to the column:

import re
from datetime import datetime
import pandas

# write back through df.loc so the change lands in the data frame itself
for i in df.index:
    value = df.loc[i, 'Start Shift']
    if re.match(r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}', value):
        # %I is the 12-hour clock; %H together with %p would give e.g. "13:00 PM"
        df.loc[i, 'Start Shift'] = datetime.strptime(value, '%Y-%m-%d %H:%M:%S').strftime('%I:%M %p')

You can also write this as a function and apply it to your column:

def format_time(string):
    if re.match(r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}', string):
        # %I (12-hour clock) pairs with %p; %H would print 13:00 as "13:00 PM"
        string = datetime.strptime(string, '%Y-%m-%d %H:%M:%S').strftime('%I:%M %p')
    return string
df['Start Shift'] = df['Start Shift'].apply(format_time)

This works for me:

df = pandas.DataFrame(columns=['Start Shift'])

df['Start Shift'] = ['Annual', 'OFF Day', '2022-04-01 12:00:00', '2022-04-01 09:00:00', '2022-04-01 10:00:00']

print(df.to_string(index = False))

df['Start Shift'] = df['Start Shift'].apply(format_time)

print(df.to_string(index = False))

Output:

        Start Shift
             Annual
            OFF Day
2022-04-01 12:00:00
2022-04-01 09:00:00
2022-04-01 10:00:00

Start Shift
     Annual
    OFF Day
   12:00 PM
   09:00 AM
   10:00 AM
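One caveat: the `format_time` function in this answer assumes every cell is a string, since `re.match` raises a `TypeError` on anything else. The question mentions mixed datetime/string values, so if `read_excel` has already turned the date cells into datetime objects, a type check may be a better fit. The sketch below is only an assumption about what such a column looks like; `to_time_label` is a hypothetical helper name and the sample values are made up.

```python
import datetime as dt
import pandas as pd

def to_time_label(value):
    # pd.Timestamp subclasses datetime.datetime, so one isinstance check covers both;
    # the isna guard skips NaT, which does not support strftime
    if isinstance(value, dt.datetime) and not pd.isna(value):
        return value.strftime("%I:%M %p")
    # strings such as "Annual" or "OFF Day" pass through untouched
    return value

# Hypothetical column mixing real datetime objects with text labels
df = pd.DataFrame({
    "Start Shift": [
        "Annual",
        dt.datetime(2022, 4, 1, 12, 0),
        "OFF Day",
        dt.datetime(2022, 4, 29, 13, 30),
    ]
})

df["Start Shift"] = df["Start Shift"].apply(to_time_label)

print(df.to_string(index=False))
```

Either way, the result of `apply` has to be assigned back to the column for the data frame to change.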