Home > Enterprise >  Pandas - stack time columns with time and date
Pandas - stack time columns with time and date

Time:03-07

I have date and time data now I want to reduce this dataframe to two columns with Timestamp (date time) in a column and value in another column

current df -

Date                    8 am   10 am   1 pm
-----------------------------------------------
23/02/2022              5        10    11
24/02/2022              2        17    15         
25/02/2022              7        90    175

desired df -

Timestamp                       value
----------------------------------------------
2022-02-23 00:08:00               5
2022-02-23 00:10:00               10
2022-02-23 00:13:00               11
2022-02-24 00:08:00               2
2022-02-24 00:10:00               17
2022-02-24 00:13:00               15
2022-02-25 00:08:00               7
2022-02-25 00:10:00               90
2022-02-25 00:13:00               175

CodePudding user response:

IIUC, use melt and to_datetime:

(df
   .reset_index() # only if not already index
   .melt(id_vars='Date', var_name='time')
   .assign(Timestamp=lambda d: pd.to_datetime(d['Date'] ' ' d['time']))
   [['Timestamp', 'value']]
   # below optional
   .sort_values(by='Timestamp').reset_index(drop=True)
 )

Output:

            Timestamp  value
0 2022-02-23 08:00:00      5
1 2022-02-23 10:00:00     10
2 2022-02-23 13:00:00     11
3 2022-02-24 08:00:00      2
4 2022-02-24 10:00:00     17
5 2022-02-24 13:00:00     15
6 2022-02-25 08:00:00      7
7 2022-02-25 10:00:00     90
8 2022-02-25 13:00:00    175

CodePudding user response:

Use melt:

combine_datetime = lambda x: pd.to_datetime(x['Date']   ' '   x['Time'], 
                                            format='%d/%m/%Y %I %p')

out = (
  df.reset_index().melt('Date', var_name='Time').assign(Timestamp=combine_datetime) \
    .sort_values('Timestamp', ignore_index=True)[['Timestamp', 'value']]
)
print(out)

# Output
            Timestamp  value
0 2022-02-23 08:00:00      5
1 2022-02-23 10:00:00     10
2 2022-02-23 13:00:00     11
3 2022-02-24 08:00:00      2
4 2022-02-24 10:00:00     17
5 2022-02-24 13:00:00     15
6 2022-02-25 08:00:00      7
7 2022-02-25 10:00:00     90
8 2022-02-25 13:00:00    175

Notes

  1. I used reset_index because we don't know if Date is a column or the index. If it's a column, you can remove reset_index. In all cases, there is no side effect here to use it with melt.

  2. For pd.to_datetime, I used an explicit format to avoid Pandas infer the datetime if there are any ambiguities with the day first.

CodePudding user response:

Set date as index, stack and coerce concatenated time component into datetime;

s =df.set_index('Date').stack().to_frame('value').reset_index()
s=s.assign(Timestamp=pd.to_datetime(s['Date'].str.cat(s['level_1'], sep =' ')))[['Timestamp', 'value']]


 

             Timestamp   value
0 2022-02-23 08:00:00      5
1 2022-02-23 10:00:00     10
2 2022-02-23 13:00:00     11
3 2022-02-24 08:00:00      2
4 2022-02-24 10:00:00     17
5 2022-02-24 13:00:00     15
6 2022-02-25 08:00:00      7
7 2022-02-25 10:00:00     90
8 2022-02-25 13:00:00    175
  • Related