Home > OS >  Python - Split same day (date) into multiple coulmns
Python - Split same day (date) into multiple coulmns

Time:10-01

I'm using pandas, I have one column attendance timestamp data (Date). I want to split the same day values into (InTime) and (OutTime) columns.

df = df[['Date']]

Date
Thu 1/09 9:10 AM
Thu 1/09 6:10 PM
Fri 2/09 9:04 AM
Fri 2/09 6:02 PM

I'm trying to achieve the following result.

In Time            OutTime
Thu 1/09 9:10 AM   Thu 1/09 6:10 PM 
Fri 2/09 9:04 AM   Fri 2/09 6:02 PM

Thanks.

CodePudding user response:

Here is a solution with pandas.DataFrame.join and pandas.DataFrame.shift :

new_df = (
            df.add_suffix('_In_time')
              .join(df.shift(-1).add_suffix('_Out_time'))
              .iloc[::2]
         )

# Output :

print(new_df)

       Date_In_time     Date_Out_time
0  Thu 1/09 9:10 AM  Thu 1/09 6:10 PM
1  Fri 2/09 9:04 AM  Fri 2/09 6:02 AM

CodePudding user response:

another approach, convert the column to datetime and work with it. try:

df
    Date
0   Thu 1/09 9:10 AM
1   Thu 1/09 6:10 PM
2   Fri 2/09 9:04 AM
3   Fri 2/09 6:02 PM

df['Date'] = pd.to_datetime(df['Date']   ' 2022', format='%a %d/%m %H:%M %p %Y')
df['day'] = df['Date'].dt.date
df['time'] = df['Date'].dt.time

df.groupby('day').agg(InTime=('time', 'first'), OutTime=('time', 'last')).reset_index()

    day         InTime      OutTime
0   2022-09-01  09:10:00    06:10:00
1   2022-09-02  09:04:00    06:02:00

CodePudding user response:

A possible solution, based on pandas.DataFrame.pivot:

# separate day from time
df[['Date1', 'Date2']] = df['Date'].str.split('(?<=\d)\s(?=\d)', expand=True)

# create column with colnames for the new columns to be created by pivot
df['names'] = ['inTime', 'OutTime'] * (len(df)//2)
(df.pivot(index='Date1', columns='names', values='Date')
 .reset_index(drop=True).iloc[:,::-1])

Output:

names            inTime           OutTime
0      Fri 2/09 9:04 AM  Fri 2/09 6:02 PM
1      Thu 1/09 9:10 AM  Thu 1/09 6:10 PM

In case the dataset has not times ordered from earlier to later, then a solution might be the following:

# separate day from time
df[['Date1', 'Date2', 'Date3']] = df['Date'].str.split(
    '(?<=\d)\s(?=\d)|\s(?=.M$)', expand=True)

# this is needed if the times are no sorted in the initial dataset
df.sort_values(['Date1', 'Date3', 'Date2'])

# create column with colnames for the new columns to be created by pivot
df['names'] = ['inTime', 'OutTime'] * (len(df)//2)
(df.pivot(index='Date1', columns='names', values='Date')
 .reset_index(drop=True).iloc[:,::-1])
  • Related