Python Dataframe categorize values

Time:01-06

I have data coming from the field and I want to categorize it into ranges of a fixed width. I want ranges of 100, that is, 0-100, 100-200, 200-300, and so on. My data:

import pandas as pd
df = pd.DataFrame([112, 341, 234, 78, 154], columns=['value'])

    value
0   112
1   341
2   234
3   78
4   154

Expected answer:

    value  value_range
0   112    100-200
1   341    300-400
2   234    200-300
3   78     0-100
4   154    100-200

My code:

df['value_range'] = df['value'].apply(lambda x:[a,b] if x>a and x<b for a,b in zip([0,100,200,300,400],[100,200,300,400,500]))

Current result:

SyntaxError: invalid syntax

CodePudding user response:

You can use pd.cut:

df["value_range"] = pd.cut(df["value"], [0, 100, 200, 300, 400], labels=['0-100', '100-200', '200-300', '300-400'])
print(df)

Prints:

   value value_range
0    112     100-200
1    341     300-400
2    234     200-300
3     78       0-100
4    154     100-200
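
If the upper bound of the data is not known in advance, the bin edges and labels can be generated instead of typed out. The following is only a sketch under some assumptions: the values are non-negative, the width is the 100 from the question, and `step`, `upper`, `edges` and `labels` are names made up for this example. It also passes `right=False` so that each range includes its lower edge (100 falls into 100-200), which differs from the default used in the call above.

```python
import pandas as pd

df = pd.DataFrame([112, 341, 234, 78, 154], columns=["value"])

step = 100  # width of each range (assumed from the question)

# Smallest multiple of `step` that is strictly greater than the maximum value.
upper = (int(df["value"].max()) // step + 1) * step

# Bin edges 0, 100, ..., upper and one "lo-hi" label per adjacent pair of edges.
edges = list(range(0, upper + step, step))
labels = [f"{lo}-{hi}" for lo, hi in zip(edges[:-1], edges[1:])]

# right=False makes the bins left-closed: [0, 100), [100, 200), ...
df["value_range"] = pd.cut(df["value"], bins=edges, labels=labels, right=False)
print(df)
```

With the default `right=True`, the bins are right-closed instead, so a value of exactly 100 would be labelled 0-100 and a value of 0 would fall outside every bin.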

CodePudding user response:

You can also use pd.IntervalIndex.from_tuples. Set the tuples to intervals that cover the values in your data and pass the resulting index to pd.cut as the bins.

df = pd.DataFrame([112,341,234,78,154],columns=['value'])

bins = pd.IntervalIndex.from_tuples([(0, 100), (100, 200), (200, 300), (300, 400)])
df['value_range'] = pd.cut(df['value'], bins)

In the final df, value_range holds interval values such as (100, 200] rather than string labels.
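
If string labels like the ones in the question are preferred over interval objects, one possible approach is to rename the categories after cutting. This is a sketch, not part of the original answer: it builds the same intervals with `pd.IntervalIndex.from_breaks` (an alternative to `from_tuples` when the bins are contiguous), and `intervals` and `labels` are names chosen for this example.

```python
import pandas as pd

df = pd.DataFrame([112, 341, 234, 78, 154], columns=["value"])

# Contiguous right-closed intervals: (0, 100], (100, 200], (200, 300], (300, 400]
bins = pd.IntervalIndex.from_breaks([0, 100, 200, 300, 400])
intervals = pd.cut(df["value"], bins)

# Build one "lo-hi" string per interval, in the same order as the bins.
labels = [f"{int(iv.left)}-{int(iv.right)}" for iv in bins]

# Swap the interval categories for the string labels.
df["value_range"] = intervals.cat.rename_categories(labels)
print(df)
```

Values that fall outside every interval come back as NaN, so the breaks should span the full range of the data.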
