Python pandas.cut()

Time:05-28

I have the following array:

array.unique()
array(['10', '8', '15', '20', '21', '22', '27', '28', nan, '30', '32', '33', 'Values'])

I am trying to assign the following category labels and put each value in its respective bin: 'not_number', '10 and below', '11'- '32', '33 and up', using pd.cut(), where the result should be:

'not_number'        2   <= this includes ('Values', nan) 
'10 and below'      2 
'11'- '32'          9 
'33 and up'         1

CodePudding user response:

pd.cut raises a TypeError if you pass it an array of non-numeric values such as strings. One workaround is to convert the array to a list of integers first: a list comprehension can map nan and 'Values' to a negative number, which then falls into a bin of its own. With that in place, you can call pd.cut on the list and label the bins.

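import numpy as np
import pandas as pd

a = np.array(['10', '8', '15', '20', '21', '22', '27', '28', 'nan', '30', '32', '33', 'Value'])
# Map every non-digit string ('nan', 'Value') to -1 so it lands in the first bin
a_list = [int(i) if i.isdigit() else -1 for i in a]
# Right-closed bin edges: (-inf, 0], (0, 10], (10, 32], (32, inf)
bins = [-np.inf, 0, 10, 32, np.inf]
lab = ['Not a Value', '10 and below', '11 - 32', '33 and above']
# Pass the labels to pd.cut directly instead of overwriting the categories afterwards
a_cut = pd.cut(a_list, bins=bins, labels=lab)
print(a_cut.value_counts())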
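
The list-comprehension approach above assumes every element is a string, so it would not cope with a real float nan like the one in the question's array. As a different technique, not the one used in the answer, here is a minimal sketch that lets pd.to_numeric turn anything unparseable into NaN and then gives those rows their own category. The names values, nums, binned and counts are made up for illustration, and the sample data is the 13 unique values from the question, so the counts will not match the question's full dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: the unique values from the question, with a real NaN
values = pd.Series(['10', '8', '15', '20', '21', '22', '27', '28',
                    np.nan, '30', '32', '33', 'Values'])

# Anything that cannot be parsed as a number ('Values', NaN) becomes NaN
nums = pd.to_numeric(values, errors='coerce')

# Right-closed bins: (-inf, 10], (10, 32], (32, inf)
bins = [-np.inf, 10, 32, np.inf]
labels = ['10 and below', '11 - 32', '33 and up']
binned = pd.cut(nums, bins=bins, labels=labels)

# pd.cut leaves NaN rows unbinned; add a category for them and fill it in
binned = binned.cat.add_categories('not_number').fillna('not_number')

counts = binned.value_counts()
print(counts)
```

This keeps the bin edges free of a sentinel such as -1, which matters if negative numbers could ever be legitimate values in the data.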