I have a dataframe p90results that contains daily counts of temperature exceedances from 12/01/1952-12/31/2021.
I want to create a plot that sums the daily exceedances in winter for each year. The problem is that the winter months, December, January, and February are spread over 2 years. So I would consider winter of 1951 to be December 1951, January 1952, and February 1952.
My first thought was to write an if statement that adds one to the year for the December rows. That way I can group by year and have the correct winter months together. This is what I tried:
for index, row in p90results.iterrows():
    if p90results.index.month==12:
        p90results.index.year=p90results.index.year + 1
But when I do this I get the following error: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Is there a way to change the year for the December rows so that it is easier for me to plot later on?
CodePudding user response:
This is how I would do it. This isn't the prettiest, but it works:
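import numpy as np
import pandas as pd
index = pd.date_range('2000-01-01', '2020-01-01', freq='1M')
df = pd.DataFrame({'high': np.random.randint(0,2, size=index.size), 'date':index})
# Numeric season trick: offsets above 900 fall between Dec 21 and Mar 19
date_offset = (df.date.dt.month*100 + df.date.dt.day - 320) % 1300
df['winter'] = date_offset > 900
# Jan-Mar winter rows belong to the winter that began the previous December
df['winter_yr'] = np.where((df.date.dt.month < 6) & (df.winter), df.date.dt.year - 1, df.date.dt.year)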
Credit to the numeric trick goes to this
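For the daily data in the question, where the dates sit in the index and winter means calendar December through February, the same idea might be sketched as below. Only `p90results` and its DatetimeIndex come from the question; the column name `exceedances` and the random counts are made up for illustration.

```python
import numpy as np
import pandas as pd

# Stand-in for the question's p90results: daily counts on a DatetimeIndex.
# The column name "exceedances" and the random data are assumptions.
idx = pd.date_range("1952-12-01", "2021-12-31", freq="D")
rng = np.random.default_rng(0)
p90results = pd.DataFrame(
    {"exceedances": rng.integers(0, 2, size=idx.size)}, index=idx
)

# Keep only December, January and February.
winter = p90results[p90results.index.month.isin([12, 1, 2])]

# Label each winter by the year of its December: Jan/Feb go back one year.
winter_yr = np.where(
    winter.index.month == 12, winter.index.year, winter.index.year - 1
)

# Sum the daily counts per winter.
winter_totals = winter.groupby(winter_yr)["exceedances"].sum()
```

This works on whole arrays at once, so there is no `if` on an array and the ambiguous truth value error does not come up. `winter_totals.plot()` should then give one point per winter.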
CodePudding user response:
You can start with just applying the 'winter' tag to the months in question. You can apply other logic to determine groupings after that.
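import numpy as np
import pandas as pd
df = pd.DataFrame({'Date': pd.date_range(start='1951-09-01 00:00:00', periods=180)})
df['season'] = np.where(df['Date'].dt.month.isin([12,1,2]), 'winter', None)  # None rather than np.nan, which can end up as the string 'nan' here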
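One possible way to do the grouping afterwards, as a rough sketch: shifting each date back two months puts December, January and February into the same calendar year, which can then serve as the winter label. The `count` column and the `winter_yr` name are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Date": pd.date_range(start="1951-09-01", periods=180)})
df["count"] = 1  # hypothetical daily value so there is something to sum

# Boolean mask for the winter months
is_winter = df["Date"].dt.month.isin([12, 1, 2])

# Dec 1951 -> Oct 1951, Jan 1952 -> Nov 1951, Feb 1952 -> Dec 1951,
# so all three months share the year of the December that starts the winter
df["winter_yr"] = (df["Date"] - pd.DateOffset(months=2)).dt.year

# Sum only the winter rows, one total per winter
winter_totals = df[is_winter].groupby("winter_yr")["count"].sum()
```

To label winters by the January year instead, adding one month (`+ pd.DateOffset(months=1)`) before taking `.dt.year` should do it.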