Having the following DF :
Index Date
1D 9/13/2022
1W 9/19/2022
2W 9/26/2022
3W 10/3/2022
1M 10/12/2022
2M 11/14/2022
3M 12/12/2022
4M 1/12/2023
5M 2/13/2023
6M 3/13/2023
7M 4/12/2023
8M 5/12/2023
9M 6/12/2023
10M 7/12/2023
11M 8/14/2023
12M 9/12/2023
18M 3/12/2024
2Y 9/12/2024
3Y 9/12/2025
4Y 9/14/2026
5Y 9/13/2027
6Y 9/12/2028
7Y 9/12/2029
8Y 9/12/2030
9Y 9/12/2031
10Y 9/13/2032
12Y 9/12/2034
15Y 9/14/2037
20Y 9/12/2042
The idea would be to do a loop, and to do :
if DF.index[i][-1] == 'Y':
if int(self.dfcurve.index[i 1][:-1])-int(self.dfcurve.index[i][:-1])!= 1:
###Add Missing Index:
index_val = int(self.dfcurve.index[i][:-1]) 1
index_val = str(index_val) 'Y'
### Example of missing index :
## 11Y
## 13Y
## 14Y
## 16Y
## 17Y
## 18Y
## 19Y
But I don't know how to add the index in list at the right place. The final DF would be :
Index Date
1D 9/13/2022
1W 9/19/2022
2W 9/26/2022
3W 10/3/2022
1M 10/12/2022
2M 11/14/2022
3M 12/12/2022
4M 1/12/2023
5M 2/13/2023
6M 3/13/2023
7M 4/12/2023
8M 5/12/2023
9M 6/12/2023
10M 7/12/2023
11M 8/14/2023
12M 9/12/2023
18M 3/12/2024
2Y 9/12/2024
3Y 9/12/2025
4Y 9/14/2026
5Y 9/13/2027
6Y 9/12/2028
7Y 9/12/2029
8Y 9/12/2030
9Y 9/12/2031
10Y 9/13/2032
11Y NA
12Y 9/12/2034
13Y NA
14Y NA
15Y 9/14/2037
16Y NA
17Y NA
18Y NA
19Y NA
20Y 9/12/2042
CodePudding user response:
Use:
#filter Y index values
m = df.index.str.endswith('Y')
#processing only years
df1 = df[m].copy()
#extract numbers to index
df1.index = df1.index.str.extract(r'(\d )', expand=False).astype(int)
#reindex by range for append missing rows
df1 = df1.reindex(range(df1.index.min(), df1.index.max() 1)).rename(index=str)
#added Y substring
df1.index = 'Y'
print (df1)
Date
Index
2Y 9/12/2024
3Y 9/12/2025
4Y 9/14/2026
5Y 9/13/2027
6Y 9/12/2028
7Y 9/12/2029
8Y 9/12/2030
9Y 9/12/2031
10Y 9/13/2032
11Y NaN
12Y 9/12/2034
13Y NaN
14Y NaN
15Y 9/14/2037
16Y NaN
17Y NaN
18Y NaN
19Y NaN
20Y 9/12/2042
#remove Y original rows from Dataframe and append new Y rows
df = pd.concat([df[~m], df1])
print (df)
First solution add to all categories missing values:
df.index = pd.MultiIndex.from_frame(df.index.str.extract(r'(\d )(\D )', expand=True))
f = lambda x: x.reindex(range(x.index.min(), x.index.max() 1))
df = df.reset_index(1).rename(index=int).groupby(1).apply(f).drop(1, axis=1)
df.index = df.index.map(lambda x: f'{x[0]}{x[1]}')
print (df)
Date
D1 9/13/2022
M1 10/12/2022
M2 11/14/2022
M3 12/12/2022