I have the following array:
array.unique()
array(['10', '8', '15', '20', '21', '22', '27', '28', nan, '30', '32', '33', 'Values'])
I am trying to assign the following category labels and put the values in the respective bins: 'not_number', '10 and below', '11 - 32', '33 and up', using pd.cut(), where the result should be:
'not_number' 2 <= this includes ('Values', nan)
'10 and below' 2
'11 - 32' 9
'33 and up' 1
CodePudding user response:
pd.cut raises a TypeError if you pass it an array of non-numeric values such as strings. The solution I suggest is to convert the array to a list of integers so that the mixed values can be handled. In this case you can replace nan and 'Values' with a negative number using a list comprehension. With the values converted, you can call pd.cut on the list and label the resulting bins.
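import numpy as np
import pandas as pd
a = np.array(['10','8', '15', '20','21','22', '27', '28', 'nan', '30', '32', '33', 'Value'])
# Anything that is not a digit string (here 'nan' and 'Value') becomes -1
a_list = [int(i) if i.isdigit() else -1 for i in a]
# Right-closed intervals: (-inf, 0], (0, 10], (10, 32], (32, inf]
bins = pd.IntervalIndex.from_tuples([(-np.inf, 0), (0, 10), (10, 32), (32, np.inf)])
lab = ['Not a Value', '10 and below', '11 - 32', '33 and above']
# Labels are ignored when bins is an IntervalIndex, so rename the categories afterwards
a_cut = pd.cut(a_list, bins).rename_categories(lab)
print(a_cut.value_counts())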
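As an alternative sketch (not part of the answer above), and assuming the values live in a pandas Series, pd.to_numeric with errors='coerce' turns every non-numeric entry into NaN. pd.cut leaves those entries as NaN, so they can be filled with their own label afterwards. The names s, nums, binned and counts, as well as the sample data, are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the values in the question
s = pd.Series(['10', '8', '15', '20', '21', '22', '27', '28',
               np.nan, '30', '32', '33', 'Values'])

# Non-numeric strings and missing values both become NaN
nums = pd.to_numeric(s, errors='coerce')

# Right-closed bins: (-inf, 10], (10, 32], (32, inf]
binned = pd.cut(nums, bins=[-np.inf, 10, 32, np.inf],
                labels=['10 and below', '11 - 32', '33 and up'])

# pd.cut leaves NaN inputs unbinned; give them a dedicated category
binned = binned.cat.add_categories('not_number').fillna('not_number')

counts = binned.value_counts()
print(counts)
```

This avoids a sentinel such as -1, so a genuine negative number in the data would not be mistaken for a non-numeric entry.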