Monthly averages to daily granularity

I have a dataframe of monthly averages that looks like this:

    A  B  C  D  E
1   3  21 3 22 3
2   4  32 3 24 0
3   5  1  12 3 12
.  
.
11  5  4  9  85  85 3
12  43 4  48 84 4

I'd like to convert this to daily granularity, so that the dataframe becomes a daily time series covering 2010 to 2020 in which each day carries the value of its month. For example:

           A  B  C  D  E
01/01/2010 3  21 3 22 3
02/01/2010 3  21 3 22 3
.
.
31/01/2010 3  21 3 22 3
.
.
.
30/12/2020 43 4 48 84 4
31/12/2020 43 4 48 84 4

Any help much appreciated!

Thanks

CodePudding user response：

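You can reindex by month number, then replace the index with the daily dates:

import pandas as pd

# Daily dates; end at '2020-12-31' instead to include the last day of 2020
idx = pd.date_range('2010-01-01', '2020-12-30', freq='D')

# idx.month is 1..12 for each day, so reindex repeats the matching monthly row
out = df.reindex(idx.month).set_axis(idx)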

Output:

               A     B     C     D    E
2010-01-01   3.0  21.0   3.0  22.0  3.0
2010-01-02   3.0  21.0   3.0  22.0  3.0
2010-01-03   3.0  21.0   3.0  22.0  3.0
2010-01-04   3.0  21.0   3.0  22.0  3.0
2010-01-05   3.0  21.0   3.0  22.0  3.0
...          ...   ...   ...   ...  ...
2020-12-26  43.0   4.0  48.0  84.0  4.0
2020-12-27  43.0   4.0  48.0  84.0  4.0
2020-12-28  43.0   4.0  48.0  84.0  4.0
2020-12-29  43.0   4.0  48.0  84.0  4.0
2020-12-30  43.0   4.0  48.0  84.0  4.0

[4017 rows x 5 columns]
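
For anyone who wants to check the reindex approach end to end, here is a minimal self-contained sketch. The sample frame is made up (the question elides months 4 to 10): each row's values are multiples of its month number, so the lookup is easy to verify. The name `daily` and the end date of 2020-12-31 are choices made for this sketch, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly averages: the row for month m holds m, 2*m, ..., 5*m
months = np.arange(1, 13)
df = pd.DataFrame(
    {col: months * k for k, col in enumerate("ABCDE", start=1)},
    index=months,
)

# Every calendar day from 2010 through 2020
idx = pd.date_range("2010-01-01", "2020-12-31", freq="D")

# Look up each day's month number (1..12) in df, then label the rows with the dates
daily = df.reindex(idx.month).set_axis(idx)

print(daily.loc["2012-02-29"])  # the leap day takes February's row
print(len(daily))               # 4018
```

Because every month number from 1 to 12 exists in `df`, the reindex introduces no missing values here, and leap days need no special handling.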

CodePudding user response：

I would build two new indexes. First, repeat your data once for each year you want and give it a proper month-end datetime index. Then reindex onto a daily index spanning the same interval and backfill the NaNs this introduces:

import numpy as np
import pandas as pd

# Some example data
df = pd.DataFrame(np.random.randint(10, size=(12, 5)), index=range(1, 13), columns=list("ABCDE"))

idx1 = pd.date_range("2010", "2021", freq="M")  # Month ends; pandas 2.2+ spells this "ME"
idx2 = pd.date_range("2010-01-01", "2020-12-31") # Implicit day frequency

out = (
    pd.concat([df] * 11)    # Repeat for 11 years (2010-2020 inclusive)
    .set_index(idx1)        # Overwrite with actual dates
    .reindex(idx2)          # Add all missing days
    .bfill()                # Backfill nans introduced for missing days
)
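
A variant of the same idea, sketched here as an assumption rather than as part of the answer above: stamp each monthly row with the first day of its month and forward-fill instead of backfilling. This avoids the month-end alias altogether. The data is made up, and `starts`, `days`, `monthly` and `daily` are illustrative names.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly averages, one row per month number
df = pd.DataFrame(
    np.arange(60).reshape(12, 5), index=range(1, 13), columns=list("ABCDE")
)

starts = pd.date_range("2010-01-01", "2020-12-01", freq="MS")  # 132 month starts
days = pd.date_range("2010-01-01", "2020-12-31", freq="D")     # every day

monthly = pd.concat([df] * 11).set_axis(starts)  # 11 years, 2010-2020 inclusive
daily = monthly.reindex(days).ffill()            # carry each month's row forward
```

The reindex introduces NaNs before the fill, so the result comes back as floats; append `.astype(int)` if the monthly values are whole numbers and you want integers back.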