Home > Software engineering >  How to add overlapping fill_between bounds on a line plot
How to add overlapping fill_between bounds on a line plot

Time:07-02

I have two data frames - one data frame consists of all metric values on a monthly basis, and another one consists of lower and upper bounds for a given period.

import pandas as pd

# sample data
data1 = {'Metric_Month': [pd.Timestamp('2022-05-01 00:00:00'), pd.Timestamp('2022-04-01 00:00:00'),
                          pd.Timestamp('2022-03-01 00:00:00'), pd.Timestamp('2022-02-01 00:00:00'),
                          pd.Timestamp('2022-01-01 00:00:00'), pd.Timestamp('2021-12-01 00:00:00'),
                          pd.Timestamp('2021-11-01 00:00:00'), pd.Timestamp('2021-10-01 00:00:00'),
                          pd.Timestamp('2021-09-01 00:00:00')],
         'Metric_Value': [0.2876, 0.3002, 0.3169, 0.3137, 0.3022, 0.3071, 0.3122, 0.307, 0.3091]}
df1 = pd.DataFrame(data1)

data2 = {'start_date': [pd.Timestamp('2021-11-01 00:00:00'), pd.Timestamp('2021-10-01 00:00:00')],
         'end_date': [pd.Timestamp('2022-05-01 00:00:00'), pd.Timestamp('2022-04-01 00:00:00')],
         'lower_bound': [0.2954, 0.2991], 'upper_bound': [0.3221, 0.3206]}
df2 = pd.DataFrame(data2)

Metric data frame:

Metric Month    Metric Value
2022-05-01  0.2876
2022-04-01  0.3002
2022-03-01  0.3169
2022-02-01  0.3137
2022-01-01  0.3022
2021-12-01  0.3071
2021-11-01  0.3122
2021-10-01  0.3070
2021-09-01  0.3091

Bounds data frame:

    start_date end_date    lower_bound upper_bound
   2021-11-01  2022-05-01  0.2954      0.3221
   2021-10-01  2022-04-01  0.2991      0.3206

My goal is to plot these (different but overlapping) lower and upper bounds on the same scatter plot of metric values.

My idea is to expand the data frame so that new dates are added in rows between the start and the end (included) dates as I want to do fill_between plots on this data set.

In our example a new data frame would look like this (end_date column can be removed):

start_date  end_date    lower_bound upper_bound
2021-11-01  2022-05-01  0.2954      0.3221
2021-12-01  2022-05-01  0.2954      0.3221
2022-01-01  2022-05-01  0.2954      0.3221
...
2022-05-01  2022-05-01  0.2954      0.3221
2021-10-01  2022-04-01  0.2991      0.3206
2021-11-01  2022-04-01  0.2991      0.3206
2021-12-01  2022-04-01  0.2991      0.3206
...
2022-04-01  2022-04-01  0.2991      0.3206

The final result should be a plot like this:

enter image description here

How to do this, and do I need to extend the bounds df in such a way?

CodePudding user response:

  • Given the data in the OP
  • All of the date columns in both dataframe must be datetime64[ns] Dtype, which can be done with enter image description here

  • Related