How to plot multiple daily time series, aligned at specified trigger times?

Time:03-17

The Problem:

I have a dataframe df that looks like this:

                                  value  msg_type
date        
2022-03-15 08:15:10+00:00         122    None
2022-03-15 08:25:10+00:00         125    None
2022-03-15 08:30:10+00:00         126    None
2022-03-15 08:30:26.542134+00:00  127    ANNOUNCEMENT
2022-03-15 08:35:10+00:00         128    None
2022-03-15 08:40:10+00:00         122    None
2022-03-15 08:45:09+00:00         127    None
2022-03-15 08:50:09+00:00         133    None
2022-03-15 08:55:09+00:00         134    None
....
2022-03-16 09:30:09+00:00         132    None
2022-03-16 09:30:13.234425+00:00  135    ANNOUNCEMENT
2022-03-16 09:35:09+00:00         130    None
2022-03-16 09:40:09+00:00         134    None
2022-03-16 09:45:09+00:00         135    None
2022-03-16 09:50:09+00:00         134    None

The value data arrives at roughly 5-minute intervals, but messages can occur at any time. I am trying to plot one line of values per day, where the x-axis ranges from t=-2 hours to t=+8 hours, and the ANNOUNCEMENT occurs at t=0 (see image below).

So, for example, if an ANNOUNCEMENT occurs at 8:30AM on 3/15 and again at 9:30AM on 3/16, there should be two lines:

  • one line for 3/15 that plots data from 6:30AM to 4:30PM, and
  • one line for 3/16 that plots data from 7:30AM to 5:30PM,

both sharing the same x-axis ranging from -2 to 8, with ANNOUNCEMENT at t=0.


What I've Tried:

I am able to do this currently by finding the index position of an announcement (e.g. say it occurs at row 298 -> announcement_index = 298), generating an array of 120 numbers from -24 to 96 (representing 10 hours at 5 minutes per number -> x = np.arange(-24, 96, 1)), then plotting

sns.lineplot(x=x, y=df['value'].iloc[announcement_index-24:announcement_index+96])

While this mostly works (see image below), I suspect it's not the correct way to go about it. Specifically, adding more info to the plot (like a different set of 'value' markers) at specific times is difficult because I would need to convert each timestamp into this arbitrary -24 to 96 range.

How can I make this same plot but by utilizing the datetime index instead? Thank you so much!

Announcement Profile Plot

Answer:

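Assuming the index has already been converted to datetime, build a window from -1H to +2H around each timestamp, then slice out each announcement's window, re-base its index to seconds since the announcement, and plot it.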

Full code:

import io
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

s = '''
date,value,msg_type
2022-03-15 08:15:10+00:00,122,None
2022-03-15 08:25:10+00:00,125,None
2022-03-15 08:30:10+00:00,126,None
2022-03-15 08:30:26.542134+00:00,127,ANNOUNCEMENT
2022-03-15 08:35:10+00:00,128,None
2022-03-15 08:40:10+00:00,122,None
2022-03-15 08:45:09+00:00,127,None
2022-03-15 08:50:09+00:00,133,None
2022-03-15 08:55:09+00:00,134,None
2022-03-16 09:30:09+00:00,132,None
2022-03-16 09:30:13.234425+00:00,135,ANNOUNCEMENT
2022-03-16 09:35:09+00:00,130,None
2022-03-16 09:40:09+00:00,134,None
2022-03-16 09:45:09+00:00,135,None
2022-03-16 09:50:09+00:00,134,None
'''
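df = pd.read_csv(io.StringIO(s), index_col=0)
# the timestamps mix whole and fractional seconds, which strict format
# inference in newer pandas may reject, so parse each one individually
df.index = pd.DatetimeIndex([pd.Timestamp(t) for t in df.index], name='date')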

# create intervals from -1H to +2H around each index timestamp
dl, dr = -1, 2
left = df.index + pd.Timedelta(hours=dl)
right = df.index + pd.Timedelta(hours=dr)
df['interval'] = pd.arrays.IntervalArray.from_arrays(left, right)

# plot each announcement's interval.left to interval.right
fig, ax = plt.subplots()
for ann in df.loc[df['msg_type'] == 'ANNOUNCEMENT'].itertuples():
    # extract interval.left to interval.right (copy so df itself is not modified)
    window = df.loc[ann.interval.left:ann.interval.right].copy()
    window.index -= ann.Index                             # compute time since announcement
    window.index = window.index.total_seconds()           # convert to seconds since announcement

    window.plot(ax=ax, y='value', label=str(ann.Index.date()))

# the ticks are the same for every line, so set them once after the loop
deltas = np.arange(dl, dr + 1)
ax.set(xticks=deltas * 3600, xticklabels=deltas)          # set tick labels to hours since announcement

ax.grid()
ax.legend()
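
For the extra markers mentioned in the question, the same re-basing should work on any timestamp: subtract the announcement time and take the difference in seconds, which matches the x-axis used above. A minimal sketch using only the standard library; the helper name seconds_since and the marker timestamps are made up for illustration and are not part of the question's data:

```python
from datetime import datetime


def seconds_since(event, stamps):
    """Return the offset of each timestamp in `stamps` from `event`, in seconds."""
    return [(t - event).total_seconds() for t in stamps]


# Hypothetical values: one announcement and two marker times around it.
announcement = datetime.fromisoformat("2022-03-15 08:30:26.542134+00:00")
marker_times = [
    datetime.fromisoformat("2022-03-15 08:00:26.542134+00:00"),  # 30 min before
    datetime.fromisoformat("2022-03-15 09:30:26.542134+00:00"),  # 1 hour after
]

marker_x = seconds_since(announcement, marker_times)
print(marker_x)  # [-1800.0, 3600.0]
```

pandas Timestamp objects support the same subtraction, so passing ann.Index as the event should behave the same way. The resulting list could then be handed to something like ax.scatter(marker_x, marker_values), where marker_values stands for whatever series is being overlaid.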