Home > database >  Form subgroups and increase/decrease according to value
Form subgroups and increase/decrease according to value

Time:03-01

My intial Dataframe looks as follows (except column Counter):

Index User Status Counter
1 John A 1
2 Ellen A 1
3 John B 0
4 Ellen A 2
5 John A 1
6 John A 2
7 John A 3
8 John A 4
9 Ellen A 3
10 John B 3
11 Ellen B 2
12 Ellen C 1
13 Ellen A 2
14 Ellen A 3

In this case I have two users (John/Ellen). In fact, there are way more users.

The Counter column is my goal to achieve. If I had only one user, the code would look like this:

count = 0
CounterList = []
for i, row in df.iterrows():
  if row["Status"] == "A":
    count  = 1
  elif row["Status"] == "B" or row["Status"] == "C":
    count -= 1
  CounterList.append(count)
df["Counter"] = CounterList
df

With status A it counts up by 1, with status B or C the counter is reduced by one.

But how to handle two or more users? How to build subgroups and counting each user-subgroup separately?

CodePudding user response:

You can check if the Status is equal to A and map 1, -1 otherwise. Then perform a cumsum per group:

df['Counter'] = (df['Status']
                 .eq('A').map({True: 1, False: -1}) # make A: 1, other: -1
                 .groupby(df['User']).cumsum()      # cumsum per group
                 )

output:

    Index   User Status  Counter
0       1   John      A        1
1       2  Ellen      A        1
2       3   John      B        0
3       4  Ellen      A        2
4       5   John      A        1
5       6   John      A        2
6       7   John      A        3
7       8   John      A        4
8       9  Ellen      A        3
9      10   John      B        3
10     11  Ellen      B        2
11     12  Ellen      C        1
12     13  Ellen      A        2
13     14  Ellen      A        3

CodePudding user response:

Create a mapping from Status to its corresponding score; then groupby cumsum:

mapping = {'A':1, 'B':-1,'C':-1}        
df['Counter'] = df.assign(Counter=df['Status'].map(mapping)).groupby('User')['Counter'].cumsum()

Output:

    Index   User Status  Counter
0       1   John      A        1
1       2  Ellen      A        1
2       3   John      B        0
3       4  Ellen      A        2
4       5   John      A        1
5       6   John      A        2
6       7   John      A        3
7       8   John      A        4
8       9  Ellen      A        3
9      10   John      B        3
10     11  Ellen      B        2
11     12  Ellen      C        1
12     13  Ellen      A        2
13     14  Ellen      A        3
  • Related