I want to create a new dataframe with the same shape based on two existing dataframes. I have one dataframe that represents the flags and another one with the values I want to replace.
The flag
dataframe has only 1, -1 and NaNs, and always after a 1 I'll have a -1. So basically its a "changing state" kind of dataframe. What I want to do is: in between the interval of 1 and -1, I need to fill in the average of the same interval in the second dataframe, PR
.
flag = pd.DataFrame({'col1': [np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan],
'col2': [np.nan,1,-1,1,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,-1],
'col3': [np.nan,np.nan,np.nan,1,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,-1],
'col4': [np.nan,np.nan,np.nan,np.nan,np.nan,1,np.nan,-1,np.nan,np.nan,np.nan]
})
PR = pd.DataFrame({'col1': [81,81.3,80.7,81.5,81,80.4,80.3,81,79.5,80.7],
'col2': [80.9,81.6,81.2,81.7,80.9,79.7,79.3,79.1,79,77.5],
'col3': [81.1,81.3,81,81.6,80.8,79.5,79.2,78.8,78.8,77.4],
'col4': [80.1,80.6,79.9,80.4,80.4,79.3,79,78.8,78.4,77]
})
This would have to give me:
result = pd.DataFrame({'col1': [np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan],
'col2': [np.nan,81.4,81.4,79.6,79.6,79.6,79.6,79.6,79.6,79.6,79.6],
'col3': [np.nan,np.nan,np.nan,79.44,79.44,79.44,79.44,79.44,79.44,79.44,79.44],
'col4': [np.nan,np.nan,np.nan,np.nan,np.nan,79.03,79.03,79.03,np.nan,np.nan,np.nan]
})
Any help is much appreciated!
CodePudding user response:
I would use a custom function:
def process(s, ref=flag):
f = ref[s.name] # get matching flag
# create group and mask data outside of 1 -> -1
m = (s.map({1: True, -1: False}).ffill()
| s.eq(-1)
)
group = f.eq(1).cumsum().where(m)
# transform to mean
return s.groupby(group).transform('mean')
out = PR.apply(process, ref=flag).round(2)
Output:
col1 col2 col3 col4
0 NaN NaN NaN NaN
1 NaN 81.4 NaN NaN
2 NaN 81.4 NaN NaN
3 NaN 79.6 79.44 NaN
4 NaN 79.6 79.44 NaN
5 NaN 79.6 79.44 79.03
6 NaN 79.6 79.44 79.03
7 NaN 79.6 79.44 79.03
8 NaN 79.6 79.44 NaN
9 NaN 79.6 79.44 NaN