I have a time series that looks like the sample below.
I want to resample it monthly, so that 2019-10 is the average of all the PTS values for October, 2019-11 is the average of all the PTS values for November, and so on.
However, when I use .resample('M').mean(), I get a NaN in my DataFrame for any month whose final day has no value. How do I solve this?
Date PTS
2019-10-23 14.0
2019-10-26 14.0
2019-10-27 8.0
2019-10-29 29.0
2019-10-31 17.0
2019-11-03 12.0
2019-11-05 2.0
2019-11-07 15.0
2019-11-08 7.0
2019-11-14 16.0
2019-11-16 12.0
2019-11-20 22.0
2019-11-22 9.0
2019-11-23 20.0
2019-11-25 18.0
CodePudding user response:
Would this work? Here df stands for your DataFrame with the dates as its index:
df.resample('M').mean().dropna()
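To illustrate what dropna() does here, a minimal sketch on made-up data (the values and dates are hypothetical, not the asker's). It assumes the NaN comes from a month with no observations at all. It uses the "MS" (month start) alias rather than 'M', because recent pandas versions deprecate 'M' in favor of 'ME'.

```python
import pandas as pd

# Hypothetical data: observations in October and December, none in November.
pts = pd.Series(
    [14.0, 8.0, 20.0, 10.0],
    index=pd.to_datetime(["2019-10-23", "2019-10-27", "2019-12-02", "2019-12-09"]),
    name="PTS",
)

# November has no rows, so its monthly mean comes out as NaN.
monthly = pts.resample("MS").mean()

# dropna() keeps only the months that actually had data.
monthly_clean = monthly.dropna()

print(monthly)
print(monthly_clean)
```

Note that dropna() only hides empty months. It will not help if the NaN comes from dates that were never parsed as datetimes.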
CodePudding user response:
Do you have a code sample? This works:
import pandas as pd
import numpy as np
rng = np.random.default_rng()
days = np.arange(31)
data = pd.DataFrame({"dates": np.datetime64("2019-03-01") + rng.choice(days, 60),
                     "values": rng.integers(0, 60, size=60)})
data.set_index("dates", inplace=True)
# Set the last day to null.
data.loc["2019-03-31"] = np.nan
# This works
data.resample("M").mean()
It also works with an incomplete month:
incomplete_days = np.arange(10)
data = pd.DataFrame({"dates": np.datetime64("2019-03-01") + rng.choice(incomplete_days, 10),
                     "values": rng.integers(0, 60, size=10)})
data.set_index("dates", inplace=True)
data.resample("M").mean()
You should check your data and types more thoroughly in case the NaN you're receiving indicates a more pressing issue.
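One way to do that check, sketched on the sample from the question. It assumes the dates were loaded as strings in a column named Date, which is a guess about the asker's setup. "MS" is used instead of 'M' only because newer pandas versions deprecate 'M' in favor of 'ME'.

```python
import pandas as pd

# Sample from the question, with dates as plain strings (an assumption).
df = pd.DataFrame({
    "Date": ["2019-10-23", "2019-10-26", "2019-10-27", "2019-10-29", "2019-10-31",
             "2019-11-03", "2019-11-05", "2019-11-07", "2019-11-08", "2019-11-14",
             "2019-11-16", "2019-11-20", "2019-11-22", "2019-11-23", "2019-11-25"],
    "PTS": [14.0, 14.0, 8.0, 29.0, 17.0,
            12.0, 2.0, 15.0, 7.0, 16.0, 12.0, 22.0, 9.0, 20.0, 18.0],
})

# Inspect the column types before resampling.
print(df.dtypes)

# Make sure the dates are real datetimes and the values are numeric.
df["Date"] = pd.to_datetime(df["Date"])
df["PTS"] = pd.to_numeric(df["PTS"], errors="coerce")

# Resample on a proper DatetimeIndex; every month that has data gets a mean.
monthly = df.set_index("Date")["PTS"].resample("MS").mean()
print(monthly)
```

With this data both months have a value, so if you still see NaN with your own frame, the cause is likely elsewhere, for example non-numeric PTS entries or months with no rows.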
CodePudding user response:
Why don't you just drop the NaN values?