I have a dataframe like the one shown below:
count
(1.386, 3.045]
(1.386, 3.045]
(0.692, 1.386]
(1.386, 3.045]
(1.386, 3.045]
(1.386, 3.045]
(1.386, 3.045]
(0.692, 1.386]
I would like to create a label for each interval.
The dataframe above is the result of the pd.cut function. I first called it with labels, like below:
pd.cut(t['count'],bins=p_breaks,labels=[1,2,3,4,5],include_lowest=True,duplicates='drop')
but it resulted in an error.
So, I removed the labels argument and got an output like below:
(1.386, 3.045]
(1.386, 3.045]
(0.692, 1.386]
(1.386, 3.045]
(1.386, 3.045]
(1.386, 3.045]
(1.386, 3.045]
(0.692, 1.386]
Now, I would like to replace these intervals with numeric labels. So, I tried the below:
t['count'].replace((0.692, 1.386),1)
t['count'].replace((1.386, 3.045),2)
I expect my output to be as shown below:
count
2
2
1
2
2
2
2
1
CodePudding user response:
There is no need to use replace; you can use .cat.codes to get the ordinal values assigned to the corresponding intervals:
t['count'] = pd.cut(t['count'], bins=p_breaks, duplicates='drop', include_lowest=True).cat.codes + 1
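Here is a minimal, self-contained sketch of the same idea. The question does not show p_breaks or the raw data, so the values below are made up for illustration. The error with labels is also not shown. A likely cause (an assumption) is that duplicates='drop' removed a repeated bin edge, leaving fewer intervals than the five labels supplied.

```python
import numpy as np
import pandas as pd

# Hypothetical data and bin edges (not from the question, illustration only).
t = pd.DataFrame({"count": [2.0, 3.0, 1.0, 2.5, 1.5, 3.0, 2.0, 1.2]})
p_breaks = [0.692, 0.692, 1.386, 3.045]  # note the duplicated edge

# Without labels, pd.cut returns a categorical of intervals;
# duplicates="drop" silently removes the repeated edge.
binned = pd.cut(t["count"], bins=p_breaks, include_lowest=True, duplicates="drop")

# .cat.codes is zero-based, so add 1 to get labels starting at 1.
t["label"] = binned.cat.codes + 1

# Alternative: de-duplicate the edges yourself, then size the labels to match
# (one label per interval, i.e. len(edges) - 1 labels).
edges = np.unique(p_breaks)
t["label_alt"] = pd.cut(
    t["count"],
    bins=edges,
    labels=list(range(1, len(edges))),
    include_lowest=True,
).astype(int)

print(t)
```

Note that values falling outside all bins (or NaN) get code -1, so after adding 1 they appear as 0.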