add cumulative value based on numeric range in pandas

Time:11-13

I could do with some advice. I have a dataframe and I want to recategorize the data in column 'a' into groups of 5, so that each value is assigned to the range it falls within (0-4, 5-9, 10-14, etc.). I then want to create a new column 'b' with a cumulative value, starting at 0, that represents the range.

If the dataframe is this:

df = pd.DataFrame(data={'a': [9,5,4,2,7,5,6,19,2,0,8,21,14]})

then column 'b' should look like this:

df['b'] = [1,1,0,0,1,1,1,3,0,0,1,4,2]

I can't figure it out, so any pointers would be awesome. Thank you.

CodePudding user response:

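Your expected output is not that clear, but since each range covers five consecutive integers (0-4, 5-9, ...), I believe you're looking for floor division, which truncates rather than rounds:

df['b'] = df['a'] // 5

Rounding is the other option, but it does not match your expected column (for example, 9 / 5 = 1.8 rounds to 2, not 1):

df['b'] = (df['a'] / 5).round()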
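As a quick check, here is a minimal sketch that compares both options against the expected column from the question. The names `expected`, `floored`, and `rounded` are only illustrative and do not appear in the original post.

```python
import pandas as pd

df = pd.DataFrame(data={'a': [9, 5, 4, 2, 7, 5, 6, 19, 2, 0, 8, 21, 14]})
# Expected column 'b' as given in the question
expected = [1, 1, 0, 0, 1, 1, 1, 3, 0, 0, 1, 4, 2]

# Floor division: 0-4 -> 0, 5-9 -> 1, 10-14 -> 2, ...
floored = (df['a'] // 5).tolist()
# Rounding to the nearest integer after dividing by 5
rounded = (df['a'] / 5).round().astype(int).tolist()

print(floored == expected)  # floor division reproduces the expected column
print(rounded == expected)  # rounding does not (e.g. 9 maps to 2)
```

On this data only the floor-division result equals the expected column.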

CodePudding user response:

import pandas as pd

df = pd.DataFrame(data={'a': [9,5,4,2,7,5,6,19,2,0,8,21,14]})
# Bins are right-inclusive: (-1, 4], (4, 9], (9, 14], (14, 19], (19, 23]
df['b'] = pd.cut(df['a'], bins=[-1,4,9,14,19,23], labels=[0,1,2,3,4])
print(df)

output:

     a  b
0    9  1
1    5  1
2    4  0
3    2  0
4    7  1
5    5  1
6    6  1
7   19  3
8    2  0
9    0  0
10   8  1
11  21  4
12  14  2
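The bin edges above are hardcoded up to 23, so larger values would become NaN. Here is a rough sketch of the same `pd.cut` approach with the edges built from the data, assuming the values in 'a' are non-negative integers. The names `width` and `edges` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data={'a': [9, 5, 4, 2, 7, 5, 6, 19, 2, 0, 8, 21, 14]})

width = 5  # size of each range, as in the question (0-4, 5-9, ...)
# Edges 0, 5, 10, ... extending past the largest value in 'a'
edges = np.arange(0, df['a'].max() + width + 1, width)

# right=False makes bins left-closed: [0, 5), [5, 10), ...
# labels=False returns the integer bin index instead of an interval label
df['b'] = pd.cut(df['a'], bins=edges, right=False, labels=False)
print(df)
```

For equal-width ranges like these, `df['a'] // 5` is simpler. `pd.cut` is mainly useful when the ranges are uneven or need custom labels.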