Home > Net >  How to sum up the value from previous row to subsequent rows pandas
How to sum up the value from previous row to subsequent rows pandas

Time:01-13

I have a dataframe with the below specs

  | ID |  Name| count  |
  | -- |----  | ----   | 
  | 1  |  A   |  75    |
  | 2  |  B   |  10    |
  | 3  |  A   |  15    |
  | 4  |  A   |  10    |
  | 5  |  A   |  5     |
  | 6  |  A   |  3     |

If I set the threshold for the count to be 15, I want the below rows to get added up uniformly. So the output should be

  | ID |  Name |  count |
  | -- |----   | ----   | 
  | 1  |  A    |    15  |
  | 2  |  B    |    10  |
  | 3  |  A    |    30  |
  | 4  |  A    |    25  |
  | 5  |  A    |    20  |
  | 6  |  A    |    18  |

75 from ID 1 gets added up based on group "Name" and it is always based on threshold value. Please advice

CodePudding user response:

IIUC you can do something like:

def distribute(sr, thres=15):
    idxmax = sr.idxmax()
    remain = max((sr[idxmax] - thres) / max(len(sr) - 1, 1), 0)
    return np.where(sr.index == idxmax, min(sr[idxmax], thres), sr   remain)

df['newcount'] = df.groupby('Name')['count'].transform(distribute, thres=15)

Output:

ID Name count newcount
1 A 75 15
2 B 10 10
3 A 15 30
4 A 10 25
5 A 5 20
6 A 3 18
  • Related