increasing row based on condition and retaining recent values based on groups

Time:12-10

Please help out. I have the table below. I want to add 1 to episode whenever condition is False, and carry that new value forward to the following rows where condition is True. When condition is False again, add 1 to the carried value, and so on. The first row for each member number is always set to 1.

member_no condition episode
0001/1 True 1
0001/1 False 1
0001/1 True 1
0001/1 False 1
0001/2 False 1
0001/2 True 1
0001/2 False 1
0001/2 False 1
0001/2 True 1
0001/3 True 1
0001/3 False 1
0001/3 True 1

This is what I'm expecting. I've tried using the shift function, but I haven't been able to arrive at my desired answer.

member_no condition episode value
0001/1 True 1 1
0001/1 False 1 2
0001/1 True 1 2
0001/1 False 1 3
0001/2 False 1 1
0001/2 True 1 1
0001/2 False 1 2
0001/2 False 1 3
0001/2 True 1 3
0001/3 True 1 1
0001/3 False 1 2
0001/3 True 1 2

CodePudding user response:

I hope I've understood your question right:

df["value"] = (
    df.groupby("member_no")
    .apply(
        # running count of False rows, plus 1 if the group starts with True
        lambda x: x["condition"].eq(False).cumsum()
        + (x["condition"].iat[0] == True)
    )
    .values
)
print(df)

Prints:

   member_no  condition  episode  value
0     0001/1       True        1      1
1     0001/1      False        1      2
2     0001/1       True        1      2
3     0001/1      False        1      3
4     0001/2      False        1      1
5     0001/2       True        1      1
6     0001/2      False        1      2
7     0001/2      False        1      3
8     0001/2       True        1      3
9     0001/3       True        1      1
10    0001/3      False        1      2
11    0001/3       True        1      2
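As a possible variation on the same idea, here is a minimal sketch that avoids `apply` and `.values`, so the result stays aligned on the original index even if the rows are not sorted by `member_no`. The helper names `false_count` and `first_is_true` are only illustrative and do not come from the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    "member_no": ["0001/1"] * 4 + ["0001/2"] * 5 + ["0001/3"] * 3,
    "condition": [True, False, True, False, False, True,
                  False, False, True, True, False, True],
    "episode": [1] * 12,
})

# Running count of False rows within each member_no (hypothetical helper name).
false_count = (~df["condition"]).astype(int).groupby(df["member_no"]).cumsum()

# 1 where the group's first row is True, 0 otherwise, so that every group
# starts at 1 regardless of its first condition (hypothetical helper name).
first_is_true = (
    df.groupby("member_no")["condition"].transform("first").astype(int)
)

df["value"] = false_count + first_is_true
print(df)
```

Both intermediate Series share the index of `df`, so the final addition lines up row by row without relying on group order.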

CodePudding user response:

Create original data

import pandas as pd

data = {
    'member_no': ['0001/1', '0001/1', '0001/1', '0001/1', '0001/2', '0001/2', '0001/2', '0001/2', '0001/2', '0001/3', '0001/3', '0001/3'],
    'condition': [True, False, True, False, False, True, False, False, True, True, False, True],
    'episode': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
}

df = pd.DataFrame(data)

Iterate through dataset and implement logic

df['value'] = 0

for index, row in df.iterrows():
    # if member_no is first, set value to 1
    if index == 0 or df.loc[index-1]['member_no'] != row['member_no']:
        df.loc[index, 'value'] = 1
    # if condition is True, set value equal to previous value
    elif row['condition']:
        df.loc[index, 'value'] = df.loc[index-1]['value']
    # if condition is False, set value equal to previous value + 1
    else:
        df.loc[index, 'value'] = df.loc[index-1]['value'] + 1

View new dataset

df
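The loop above looks up the previous row with `index-1`, which assumes the default 0..n-1 integer index. A rough sketch of the same logic that does not depend on index labels is shown below; it keeps a running value in plain Python variables (`current`, `prev_member` and `values` are illustrative names, not part of the answer). Like the original loop, it assumes rows for the same `member_no` are contiguous.

```python
import pandas as pd

df = pd.DataFrame({
    "member_no": ["0001/1"] * 4 + ["0001/2"] * 5 + ["0001/3"] * 3,
    "condition": [True, False, True, False, False, True,
                  False, False, True, True, False, True],
    "episode": [1] * 12,
})

values = []
prev_member = None
current = 0
for member, cond in zip(df["member_no"], df["condition"]):
    if member != prev_member:
        current = 1       # first row of a new member_no always starts at 1
    elif not cond:
        current += 1      # condition is False: increase the running value
    # condition is True: keep the running value unchanged
    values.append(current)
    prev_member = member

df["value"] = values
print(df)
```

Iterating over the two columns with `zip` is usually much faster than `iterrows`, and assigning the whole list at the end avoids writing to the DataFrame one cell at a time.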