Home > Back-end >  Trying to ffill NaN values based on a condition
Trying to ffill NaN values based on a condition

Time:11-17

I am trying to fill NaN values only if a certain condition is met. I want to use an ffill of whatever the 'REVENUE_STATUS_FLAG' value is prior to listed NaN values. The time series dataset is split into 'SCU_KEY's representing companies and 'PRODUCT' representing type of product. So I don't want the values being filled for one 'SCU_KEY' or 'PRODUCT' to populate on NaN values for other 'SCU_KEY's or 'PRODUCT's. Here is the code I attempted and the result I am looking for:

My code:

cross_6 = df_SR5['REVENUE_STATUS_FLAG'].notna()
cross_7 = df_SR5['REVENUE_STATUS_FLAG'].shift(-1).isna()
cross_8 = df_SR5['SCU_KEY'] == df_SR5['SCU_KEY'].shift(-1)
cross_9 = df_SR5['PRODUCT'] == df_SR5['PRODUCT'].shift(-1)
df_SR5['REVENUE_STATUS_FLAG'] = np.where((cross_6) & (cross_7) & (cross_8) & (cross_9), df_SR5['REVENUE_STATUS_FLAG'].fillna(method='ffill'), df_SR5['REVENUE_STATUS_FLAG'])

Input:

df = pd.DataFrame({'year': ['2019-02', '2019-03', '2019-04', '2019-05',
       '2018-03', '2018-04', '2018-05', '2017-04', '2017-05', '2019-06', '2019-07', '2019-08'], 'SCU_KEY': [3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5], 'PRODUCT': ['HCM', 'HCM', 'HCM', 'HCM', 'HCM', 'HCM', 'HCM', 'EPM', 'EPM', 'SCM', 'SCM', 'SCM'], 'REVENUE_STATUS_FLAG': [1, 1, np.NaN, np.NaN, np.NaN, 0, np.NaN, np.NaN, np.NaN, np.NaN, 0, 0]})

Output:

df = pd.DataFrame({'year': ['2019-02', '2019-03', '2019-04', '2019-05',
       '2018-03', '2018-04', '2018-05', '2017-04', '2017-05', '2019-06', '2019-07', '2019-08'], 'SCU_KEY': [3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5], 'PRODUCT': ['HCM', 'HCM', 'HCM', 'HCM', 'HCM', 'HCM', 'HCM', 'EPM', 'EPM', 'SCM', 'SCM', 'SCM'], 'REVENUE_STATUS_FLAG': [1, 1, 1, 1, np.NaN, 0, 0, np.NaN, np.NaN, np.NaN, 0, 0]})

CodePudding user response:

df['filled'] = df.groupby(['PRODUCT', 'SCU_KEY'])['REVENUE_STATUS_FLAG'].fillna(method='ffill')

       year  SCU_KEY PRODUCT  REVENUE_STATUS_FLAG  filled
0   2019-02        3     HCM                  1.0     1.0
1   2019-03        3     HCM                  1.0     1.0
2   2019-04        3     HCM                  NaN     1.0
3   2019-05        3     HCM                  NaN     1.0
4   2018-03        4     HCM                  NaN     NaN
5   2018-04        4     HCM                  0.0     0.0
6   2018-05        4     HCM                  NaN     0.0
7   2017-04        4     EPM                  NaN     NaN
8   2017-05        4     EPM                  NaN     NaN
9   2019-06        5     SCM                  NaN     NaN
10  2019-07        5     SCM                  0.0     0.0
11  2019-08        5     SCM                  0.0     0.0
  • Related