I am currently looking at a variety of future curves. Each as of-date has a list of futures curves available. My end goal is to pull the maturity date value in relation to each individual as of the date. Below is an example of my data frame
asOfDate maturityDate value
0 2017-10-02 2017-10-01 406.8
1 2017-10-02 2017-11-01 406.8
2 2017-10-02 2017-12-01 398.3
3 2017-10-02 2018-01-01 398.3
4 2017-10-02 2018-02-01 390.2
5 2017-10-02 2018-03-01 390.2
6 2017-10-02 2018-04-01 380.5
7 2017-10-02 2018-05-01 380.5
8 2017-10-02 2018-06-01 385.0
9 2017-10-02 2018-07-01 385.0
10 2017-10-02 2018-08-01 385.0
11 2017-10-02 2018-09-01 385.0
12 2017-10-02 2018-10-01 383.2
13 2017-10-02 2018-11-01 383.2
14 2017-10-03 2017-10-01 410.4
15 2017-10-03 2017-11-01 410.4
16 2017-10-03 2017-12-01 400.8
17 2017-10-03 2018-01-01 400.8
18 2017-10-03 2018-02-01 392.5
19 2017-10-03 2018-03-01 392.5
The end goal would be a list of the closest value of each maturityDate relative to each as of date. So given the above data set the desired output would be
asOfDate maturityDate value
0 2017-10-02 2017-10-01 406.8
14 2017-10-03 2017-10-01 410.4
Given that the contract closes to the asOfDate on 2017-10-02 is 2017-10-01 and 2017-10-03 closest is that same contract.
My thinking is I could do some type of groupby for each as of date, this is what I have tested thus far
df_lumber[df_lumber['maturityDate']==df_lumber.groupby(['asOfDate'])['maturityDate'].apply(min)]
I am trying to match the current maturity dates to match the smallest maturity date for each as-of date, which in theory should be what I am after.
CodePudding user response:
OP is not far from the desired output.
First of all, make sure both columns are of datetime
with pandas.to_datetime
df['asOfDate'] = pd.to_datetime(df['asOfDate'])
df['maturityDate'] = pd.to_datetime(df['maturityDate'])
Then, this will do the work
df[df["maturityDate"] == df.groupby("asOfDate")["maturityDate"].transform(min)]
[Out]:
asOfDate maturityDate value
0 2017-10-02 2017-10-01 406.8
14 2017-10-03 2017-10-01 410.4
Notes:
One is grouping by
asOfDate
withpandas.DataFrame.groupby
.And for each group use
.transform('min')
to get the minimum for the columnmaturityDate
.