I want to identify incline/decile if a column. So, in order to achieve this, I create a data frame with some values and conditions. The goal is to get the 2 res_e1,res_e2 following this logic:
if condition1 is met, the row is relevant, otherwise the value is NaN if relevant - take the first row value and the last value and decrease the last from the first. the results is either <0 or >0 for the entire interval. In the example below, e1 relevant values are idx 2 and idx 8, the corresponding values are 4,3 and therefore in res_e1 all True relevant rows are marked with 3-4 < 0 => "-" The same for e2, the values are 3,8 for the same indexes, the entire interval is " " Of course this should apply to all "True" intervals in the df
import pandas as pd
df = pd.DataFrame({
'e1': [1,2,4,4,5,6,5,4,3,2,0],
'e2': [1,2,3,4,3,2,-5,-10,8,10,11],
'condition1': ['A','A','B','B','B','B','B','B','B','A','A']
})
mask_c = df.condition1 == 'B'
df['relevant'] = mask_c
df['res_e1'] = ['NaN','NaN','-','-','-','-','-','-','-','NaN','NaN']
df['res_e2'] = ['NaN','NaN',' ',' ',' ',' ',' ',' ',' ','NaN','NaN']
print(df)
After print:
e1 e2 condition1 relevant res_e1 res_e2
0 1 1 A False NaN NaN
1 2 2 A False NaN NaN
2 4 3 B True -
3 4 4 B True -
4 5 3 B True -
5 6 2 B True -
6 5 -5 B True -
7 4 -10 B True -
8 3 8 B True -
9 2 10 A False NaN NaN
10 0 11 A False NaN NaN
I have tried to add a mask in order to detect the relevant frames, and some aggregation methods but I have no pretty way to get the first and last relevant values only and "color" the entire interval with their values. I added 2 examples e1,e2 and results res_e1, res_e2 just to simplify the question.
CodePudding user response:
You can use boolean indexing and groupby.transform
:
df = df.join(
df[df['relevant']]
.groupby(df['condition1'])[['e1', 'e2']]
.transform(lambda s: ' ' if s.iloc[-1]-s.iloc[0]>0 else '-')
.add_prefix('res_')
)
Output:
e1 e2 condition1 relevant res_e1 res_e2
0 1 1 A False NaN NaN
1 2 2 A False NaN NaN
2 4 3 B True -
3 4 4 B True -
4 5 3 B True -
5 6 2 B True -
6 5 -5 B True -
7 4 -10 B True -
8 3 8 B True -
9 2 10 A False NaN NaN
10 0 11 A False NaN NaN