I've a data frame contains a column (Start Shift) and it has different data types (Datetime/string), What i need is to change the datetime format to be time format only and keep the string without any change, i used the below code to solve this issue but i can't find a way to apply this change in the data frame as when i trying to load the data frame after this change i found nothing has been changed.
The code that i used:-
df=pd.read_excel(r"C:\Users\Mahmoud.Bader\Desktop\FP Attendance V1.6 Apr 22.xlsx","Attendance").fillna("")
for i in df['Start Shift']:
try:
if i.isalpha():
i
except:
i.strftime('%H:%M %p')
The Data Frame is:-
Department Start Shift
Accommodation Annual
Accommodation OFF Day
Accommodation 2022-04-01 12:00:00
Accommodation 2022-04-01 09:00:00
Accommodation 2022-04-01 10:00:00
Complaints OFF Day
Complaints 2022-04-29 07:00:00
Complaints 2022-04-29 08:00:00
Complaints 2022-04-30 07:00:00
Complaints 2022-04-30 08:00:00
The Data Frame that i expected to found:-
Department Start Shift
Accommodation Annual
Accommodation OFF Day
Accommodation 12:00 PM
Accommodation 09:00 AM
Accommodation 10:00 AM
Complaints OFF Day
Complaints 07:00 AM
Complaints 08:00 AM
Complaints 07:00 AM
Complaints 08:00 AM
CodePudding user response:
You can try to cast the dates into datetime
first:
# cast column 'Start Shift' to datetime
df['Start Shift Formatted'] = pd.to_datetime(df['Start Shift'], format='%Y-%m-%d %H:%M:%S', errors='coerce')
# convert into requested time format
df['Start Shift Formatted'] = df['Start Shift Formatted'].dt.strftime('%I:%M %p')
# overwrite old time format
df['Start Shift'] = df['Start Shift'].mask(df['Start Shift Formatted'].notna(), df['Start Shift Formatted'])
# drop column 'Start Shift Formatted' as it is no longer needed
df.drop('Start Shift Formatted', axis=1, inplace=True)
Update: This can be applied to multiple columns using the following function:
def format_time(x):
x_temp = pd.to_datetime(x, format='%Y-%m-%d %H:%M:%S', errors='coerce')
x_temp = x_temp.dt.strftime('%I:%M %p')
x = x.mask(x_temp.notna(), x_temp)
return x
df.loc[:, ['Start Shift','End Shift','Login Time','Logout time']] = df[
['Start Shift','End Shift','Login Time','Logout time']
].apply(format_time)
CodePudding user response:
You need to apply the changes to the column:
import re
from datetime import datetime
import pandas
column = df['Start Shift']
for i in range(len(column)):
if re.match(r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}', column[i]):
column[i] = datetime.strptime(column[i], '%Y-%m-%d %H:%M:%S').strftime('%H:%M %p')
else:
column[i] = column[i]
You can also write this as a function and apply it to your column
def format_time(string):
if re.match(r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}', string):
string = datetime.strptime(string, '%Y-%m-%d %H:%M:%S').strftime('%H:%M %p')
return string
df['Start Shift'] = df['Start Shift'].apply(format_time)
This works for me:
df = pandas.DataFrame(columns=['Start Shift'])
df['Start Shift'] = ['Annual', 'OFF Day', '2022-04-01 12:00:00', '2022-04-01 09:00:00', '2022-04-01 10:00:00']
print(df.to_string(index = False))
df['Start Shift'] = df['Start Shift'].apply(format_time)
print(df.to_string(index = False))
Output:
Start Shift
Annual
OFF Day
2022-04-01 12:00:00
2022-04-01 09:00:00
2022-04-01 10:00:00
Start Shift
Annual
OFF Day
12:00 PM
09:00 AM
10:00 AM