My data frame has daily values from 2005-01-01 to 2021-10-31.
           |   C1    |   C2
-----------------------------
2005-01-01 |  2.7859 | -7.790
2005-01-02 | -0.7756 | -0.97
2005-01-03 | -6.892  |  2.770
2005-01-04 |  2.785  | -0.97
. . .
. . .
2021-10-28 |  6.892  |  2.785
2021-10-29 |  2.785  | -6.892
2021-10-30 | -6.892  | -0.97
2021-10-31 | -0.7756 |  2.34
I want to downsample this data frame to get quarterly values, as follows.
           |   C1    |   C2
-----------------------------
2005-03-31 |  2.7859 | -7.790
2005-06-30 | -0.7756 | -0.97
2005-09-30 | -6.892  |  2.770
2005-12-31 |  2.785  | -0.97
I tried to do it with the pandas resample method, but it requires an aggregation function:
df = df.resample('Q').mean()
I don't want an aggregated value; I want the actual value on each quarter-end date, as it is.
CodePudding user response:
Your code works, except that you are not using the right function. Replace mean with last:
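import numpy as np
import pandas as pd
# Random daily data over the same date range as in the question
dti = pd.date_range('2005-01-01', '2021-10-31', freq='D')
df = pd.DataFrame(np.random.random((len(dti), 2)), columns=['C1', 'C2'], index=dti)
# last() keeps the final available row of each quarter (the 'Q' alias is spelled 'QE' from pandas 2.2 on)
dfQ = df.resample('Q').last()
print(dfQ)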
# Output:
                  C1        C2
2005-03-31  0.653733  0.334182
2005-06-30  0.425229  0.316189
2005-09-30  0.055675  0.746406
2005-12-31  0.394051  0.541684
2006-03-31  0.525208  0.413624
...              ...       ...
2020-12-31  0.662081  0.887147
2021-03-31  0.824541  0.363729
2021-06-30  0.064824  0.621555
2021-09-30  0.126891  0.549009
2021-12-31  0.126217  0.044822
[68 rows x 2 columns]
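One thing to watch in the output above: the last row is labelled 2021-12-31 even though the data stops at 2021-10-31. That row holds the last available value of the unfinished quarter, not a real quarter-end observation. Here is a minimal sketch of how you might check for this and drop the row. It uses a made-up single column of increasing numbers so the values are easy to trace, and it passes an offset object rather than the 'Q' string only because the string alias was renamed to 'QE' in pandas 2.2. The name `complete` is my own.

```python
import numpy as np
import pandas as pd

# Made-up daily data: C1 simply counts the days, so values are traceable.
dti = pd.date_range('2005-01-01', '2021-10-31', freq='D')
df = pd.DataFrame({'C1': np.arange(len(dti), dtype=float)}, index=dti)

# Quarter-end offset object, equivalent to the 'Q' / 'QE' string alias.
quarter_end = pd.offsets.QuarterEnd(startingMonth=12)
dfQ = df.resample(quarter_end).last()

# The final label is 2021-12-31, but its value comes from 2021-10-31,
# the last day that actually exists in the data.
print(dfQ.tail(2))
print(dfQ['C1'].iloc[-1] == df['C1'].iloc[-1])  # True

# Keep only the quarters whose end date is covered by the data.
complete = dfQ[dfQ.index <= df.index.max()]
print(len(dfQ), len(complete))  # 68 67
```

Whether you keep that last row depends on whether a partial quarter is meaningful for your use case.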
CodePudding user response:
You can also filter the rows directly with a boolean mask on the index:
df = df[df.index.is_quarter_end]
This keeps only the rows whose dates fall on the last day of a quarter.
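Note that this filter and resample(...).last() are not interchangeable when a quarter-end date is missing from the index. The sketch below assumes a hypothetical gap on 2005-06-30 in one year of made-up daily data: the filter drops that quarter entirely, while last() falls back to the previous available day. The names `filtered` and `resampled` are illustrative only.

```python
import numpy as np
import pandas as pd

# One year of made-up daily data; C1 counts the days starting from 0.
days = pd.date_range('2005-01-01', '2005-12-31', freq='D')
df = pd.DataFrame({'C1': np.arange(len(days), dtype=float)}, index=days)

# Hypothetical gap: the observation for 2005-06-30 is missing.
df = df.drop(pd.Timestamp('2005-06-30'))

# Filter: keeps only rows that sit exactly on a quarter-end date.
filtered = df[df.index.is_quarter_end]

# Resample: takes the last available row inside each quarter.
resampled = df.resample(pd.offsets.QuarterEnd(startingMonth=12)).last()

print(len(filtered))   # 3 (the June quarter is gone)
print(len(resampled))  # 4
# The row labelled 2005-06-30 carries the value from 2005-06-29.
print(resampled.loc[pd.Timestamp('2005-06-30'), 'C1'])  # 179.0
```

With a complete daily index like the one in the question, both approaches return the same values for every finished quarter, and the filter has the advantage of not creating a row for the unfinished one.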