I could do with some advice. I have a dataframe and I want to recategorize the data in column 'a' into groups of 5 (0-4, 5-9, 10-14, etc.), then create a new column 'b' holding a group number, starting at 0, that identifies which range each value falls in.
If the dataframe is this:
df = pd.DataFrame(data={'a': [9,5,4,2,7,5,6,19,2,0,8,21,14]})
then column 'b' should look like this:
df['b'] = [1,1,0,0,1,1,1,3,0,0,1,4,2]
I can't figure it out, so any pointers would be appreciated. Thank you.
CodePudding user response:
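Your expected output is not entirely clear, but I believe you're looking for one of these. Rounding to the nearest group:
df['b'] = round(df['a'] / 5)
Or truncating instead of rounding, which is what reproduces your expected column 'b' (9 // 5 is 1, whereas round(9 / 5) is 2):
df['b'] = df['a'] // 5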
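To check which of the two suggestions matches the column asked for in the question, here is a small sketch that compares both against the expected values. The names `expected`, `floored` and `rounded` are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame(data={'a': [9, 5, 4, 2, 7, 5, 6, 19, 2, 0, 8, 21, 14]})

# Column 'b' exactly as given in the question.
expected = [1, 1, 0, 0, 1, 1, 1, 3, 0, 0, 1, 4, 2]

# Floor division: 0-4 -> 0, 5-9 -> 1, 10-14 -> 2, ...
floored = (df['a'] // 5).tolist()

# Rounding to the nearest integer after dividing by 5.
rounded = (df['a'] / 5).round().astype(int).tolist()

print(floored == expected)
print(rounded == expected)
```

Floor division agrees with the expected column for this data, while rounding does not, because values such as 9 and 8 are rounded up into the next group.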
CodePudding user response:
import pandas as pd
df = pd.DataFrame(data={'a': [9,5,4,2,7,5,6,19,2,0,8,21,14]})
# Bin edges are right-inclusive: (-1, 4] -> 0, (4, 9] -> 1, ..., (19, 24] -> 4
df['b'] = pd.cut(df['a'], bins=[-1,4,9,14,19,24], labels=[0,1,2,3,4])
print(df)
output:
     a  b
0    9  1
1    5  1
2    4  0
3    2  0
4    7  1
5    5  1
6    6  1
7   19  3
8    2  0
9    0  0
10   8  1
11  21  4
12  14  2
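The hard-coded bin edges in the pd.cut answer only cover values up to the low twenties. As a rough sketch, assuming all values in 'a' are non-negative integers, the edges could instead be generated from the data. The names `width`, `upper` and `edges` are made up for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data={'a': [9, 5, 4, 2, 7, 5, 6, 19, 2, 0, 8, 21, 14]})

width = 5  # size of each group (assumed from the question)

# First multiple of `width` strictly above the largest value (25 for this data).
upper = (df['a'].max() // width + 1) * width

# Left-closed edges: [0, 5), [5, 10), ..., [20, 25)
edges = np.arange(0, upper + width, width)

# labels=False returns the integer position of the bin instead of a category label.
df['b'] = pd.cut(df['a'], bins=edges, right=False, labels=False)
print(df)
```

For plain equal-width groups starting at 0 this gives the same result as df['a'] // 5; pd.cut is mainly worth the extra code when the groups are uneven. Note that values below 0 would fall outside the edges and come out as NaN.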