I have a column (x) with values from 0 to 100. I need to split it into 4 groups with these conditions:
- Group 1: x <= 17
- Group 2: x > 17 and x <= 50
- Group 3: x > 50 and x <= 83
- Group 4: x > 83
Then I need to calculate the percentage of rows in each group.
CodePudding user response:
import pandas as pd
import numpy as np
First, create the bin edges to use as cut-off points and the names to assign to those bins:
bins = [0, 17, 50, 83, np.inf]
names = ['Group 1', 'Group 2', 'Group 3', 'Group 4']
Then create a new column, say 'Groups', that holds the group label for each row:
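# include_lowest=True puts x == 0 in Group 1; by default the first bin is (0, 17]
df['Groups'] = pd.cut(df['x'], bins, labels=names, include_lowest=True)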
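To check how the boundary values are assigned, here is a small self-contained sketch. The sample values are made up for illustration and are not from the question. Note that pd.cut uses right-closed intervals by default, so the first bin is (0, 17] and a value of exactly 0 would be left as NaN unless include_lowest=True is passed.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data covering every boundary in the question
df = pd.DataFrame({"x": [0, 17, 18, 50, 51, 83, 84, 100]})

bins = [0, 17, 50, 83, np.inf]
names = ["Group 1", "Group 2", "Group 3", "Group 4"]

# Right-closed bins: [0, 17], (17, 50], (50, 83], (83, inf)
df["Groups"] = pd.cut(df["x"], bins, labels=names, include_lowest=True)
print(df)
```

Each boundary value (17, 50, 83) lands in the lower group, which matches the "<=" conditions in the question.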
The percentage for a group can be calculated by dividing the number of rows in that group by the total number of rows. For example:
len(df[df['Groups'] == 'Group 1']) / len(df)
This gives a fraction between 0 and 1. Multiply it by 100 to get the percentage.
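If you want the percentage for every group at once, one option is value_counts with normalize=True, which returns each group's share of the rows. This is a minimal sketch with made-up sample data; the column and group names follow the ones used above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({"x": [5, 10, 30, 45, 60, 70, 80, 90, 95, 100]})

bins = [0, 17, 50, 83, np.inf]
names = ["Group 1", "Group 2", "Group 3", "Group 4"]
df["Groups"] = pd.cut(df["x"], bins, labels=names, include_lowest=True)

# normalize=True returns fractions; sort=False keeps the group order
percentages = df["Groups"].value_counts(normalize=True, sort=False) * 100
print(percentages)
```

Because 'Groups' is a categorical column, groups with no rows still appear in the result with a value of 0.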