Pandas Aggregate Daily Data to Monthly Timeseries

Time:07-22

I have a time series that looks like this (see below).

I want to resample it monthly, so that 2019-10 is the average of all the PTS values for October, 2019-11 is the average of all the PTS values for November, and so on.

However, when I use df.resample('M').mean(), I get a NaN in my data frame whenever the final day of a month does not have a value. How do I solve this?

Date        PTS    
2019-10-23  14.0
2019-10-26  14.0
2019-10-27   8.0
2019-10-29  29.0
2019-10-31  17.0
2019-11-03  12.0
2019-11-05   2.0
2019-11-07  15.0
2019-11-08   7.0
2019-11-14  16.0
2019-11-16  12.0
2019-11-20  22.0
2019-11-22   9.0
2019-11-23  20.0
2019-11-25  18.0

CodePudding user response:

Would this work?

df.resample('M').mean().dropna()
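
A minimal sketch of what this suggestion does, assuming the NaN comes from a month that has no observations at all. The series below is a shortened, hypothetical one (not the asker's full data), and `pd.offsets.MonthEnd()` is used in place of the 'M' string only because newer pandas versions rename that alias to 'ME'; the offset object should behave the same either way.

```python
import pandas as pd

# Hypothetical daily series with a gap: nothing was recorded in November.
s = pd.Series(
    [14.0, 8.0, 17.0, 12.0],
    index=pd.to_datetime(["2019-10-23", "2019-10-27", "2019-10-31", "2019-12-03"]),
    name="PTS",
)

# resample creates one bin per calendar month between the first and last
# timestamp, so the empty November bin shows up as NaN.
monthly = s.resample(pd.offsets.MonthEnd()).mean()

# dropna removes the empty months and keeps the real monthly averages.
monthly_clean = monthly.dropna()

print(monthly)
print(monthly_clean)
```

Note that dropna only hides empty months. If a month that does contain data still comes out as NaN, the cause is more likely in the input itself.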

CodePudding user response:

Do you have a code sample? This works:

import pandas as pd
import numpy as np

rng = np.random.default_rng()
days = np.arange(31)

data = pd.DataFrame({"dates": np.datetime64("2019-03-01") + rng.choice(days, 60),
                     "values": rng.integers(0, 60, size=60)})

data.set_index("dates", inplace=True)

# Set the last day to null.
data.loc["2019-03-31"] = np.nan

# This works
data.resample("M").mean()

It also works with an incomplete month:

incomplete_days = np.arange(10)

data = pd.DataFrame({"dates": np.datetime64("2019-03-01") + rng.choice(incomplete_days, 10),
                     "values": rng.integers(0, 60, size=10)})

data.set_index("dates", inplace=True)

data.resample("M").mean()

You should check your data and types more thoroughly in case the NaN you're receiving indicates a more pressing issue.
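
One way to do that check, sketched with the values from the question. It assumes the dates arrive as plain strings (as they often do after reading a CSV); the column handling and the use of `pd.offsets.MonthEnd()` instead of the 'M' alias are illustrative choices, not something the asker posted.

```python
import pandas as pd

# Values copied from the question; dates are assumed to start out as text.
df = pd.DataFrame({
    "Date": ["2019-10-23", "2019-10-26", "2019-10-27", "2019-10-29", "2019-10-31",
             "2019-11-03", "2019-11-05", "2019-11-07", "2019-11-08", "2019-11-14",
             "2019-11-16", "2019-11-20", "2019-11-22", "2019-11-23", "2019-11-25"],
    "PTS": [14.0, 14.0, 8.0, 29.0, 17.0, 12.0, 2.0, 15.0, 7.0, 16.0,
            12.0, 22.0, 9.0, 20.0, 18.0],
})

# Inspect the types first: Date is not yet a datetime column here.
print(df.dtypes)

# Convert explicitly; errors="coerce" turns unparseable numbers into NaN
# so they can be counted instead of silently breaking the mean.
df["Date"] = pd.to_datetime(df["Date"])
df["PTS"] = pd.to_numeric(df["PTS"], errors="coerce")
print(df["PTS"].isna().sum())  # number of missing or invalid PTS values

# Monthly averages, labelled by month end.
monthly = df.set_index("Date")["PTS"].resample(pd.offsets.MonthEnd()).mean()
print(monthly)
```

With these values both months get an average even though 2019-11-30 has no row, which matches the behaviour shown in the answer above.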

CodePudding user response:

Why don't you just drop the NaN values?
