I have a dataframe column whose numerical values I split into 10 groups. I am trying to assign a label to each group and build a list of those labels, one per value. To do that, I need to check which interval each value in the column falls into. However, I get this error:
AttributeError: 'float' object has no attribute 'between'
so it seems a float has no 'between' method that I could use for this.
l2=[29.69911764705882, 32.5, 32.5, 54.0, 12.0, 29.69911764705882, 24.0, 29.69911764705882, 45.0, 33.0, 20.0, 47.0, 29.0,
25.0, 23.0, 19.0, 37.0, 16.0, 24.0, 29.69911764705882, 22.0, 24.0, 19.0, 18.0, 19.0, 27.0, 9.0, 36.5, 42.0, 51.0, 22.0,
55.5, 40.5, 29.69911764705882, 51.0, 16.0, 30.0, 29.69911764705882, 29.69911764705882, 44.0, 40.0, 26.0, 17.0, 1.0, 9.0,
29.69911764705882, 45.0, 29.69911764705882, 28.0, 61.0, 4.0, 1.0, 21.0, 56.0, 18.0, 29.69911764705882, 50.0, 30.0, 36.0,
29.69911764705882, 29.69911764705882, 9.0, 1.0, 4.0, 29.69911764705882, 29.69911764705882, 45.0, 40.0, 36.0, 32.0, 19.0,
19.0, 3.0, 44.0, 58.0, 29.69911764705882, 42.0, 29.69911764705882, 24.0, 28.0, 29.69911764705882, 34.0, 45.5, 18.0, 2.0,
32.0, 26.0, 16.0, 40.0, 24.0, 35.0, 22.0, 30.0, 29.69911764705882, 31.0, 27.0, 42.0, 32.0, 30.0, 16.0, 27.0, 51.0,
29.69911764705882, 38.0, 22.0, 19.0, 20.5, 18.0, 29.69911764705882, 35.0, 29.0, 59.0, 5.0, 24.0, 29.69911764705882,
44.0, 8.0, 19.0, 33.0, 29.69911764705882, 29.69911764705882, 29.0, 22.0, 30.0, 44.0, 25.0, 24.0, 37.0, 54.0,
29.69911764705882, 29.0, 62.0, 30.0, 41.0, 29.0, 29.69911764705882, 30.0, 35.0, 50.0, 29.69911764705882, 3.0]
import pandas as pd
d = {'col1': []}
df = pd.DataFrame(data=d)
df['col1']=l2
print(df['col1'])
df['col2'] = pd.cut(df.col1,10)
print(df['col2'].value_counts())
new_list=[]
labels=['25-31','19-25','13-19','31-37','0-7','37-43','43-49','49-55','7-13','55-62']
for i in df['col1']:
    for j in df['col2'].value_counts():
        if i.between(j):
            new_list.append(inter_list.index(j))
print(new_list)
CodePudding user response:
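According to the pandas.cut documentation, you can pass the labels directly in the function call. The return value is then a pandas Series holding the label of the bin that each value in df.col1 falls into. Note that pd.cut assigns the labels to the bins in ascending order, so the list must run from the lowest interval to the highest rather than in the order printed by value_counts(). The following code does the trick for you:
labels = ['0-7', '7-13', '13-19', '19-25', '25-31',
          '31-37', '37-43', '43-49', '49-55', '55-62']
df['labels'] = pd.cut(df.col1, 10, labels=labels)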
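For reference, here is a minimal self-contained sketch of the same idea. The short values list is a made-up sample standing in for l2 (it keeps the same minimum of 1.0 and maximum of 62.0, so the 10 equal-width bins come out the same), and the column names interval and label, as well as code_list, are just illustrative. It also shows how to get the plain Python list the question asks for, either as label strings or as bin positions.

```python
import pandas as pd

# Hypothetical sample standing in for the question's l2 list
# (same min 1.0 and max 62.0, so pd.cut builds the same 10 bins).
values = [1.0, 9.0, 16.0, 24.0, 29.7, 36.5, 42.0, 47.0, 54.0, 62.0]
df = pd.DataFrame({'col1': values})

# Labels in ascending bin order: pd.cut gives labels[0] to the
# lowest interval, labels[1] to the next one, and so on.
labels = ['0-7', '7-13', '13-19', '19-25', '25-31',
          '31-37', '37-43', '43-49', '49-55', '55-62']

# Raw intervals, kept only to compare them against the labels.
df['interval'] = pd.cut(df['col1'], 10)
# Same bins, but each value now carries its label instead.
df['label'] = pd.cut(df['col1'], 10, labels=labels)

# One label string per row, in the original row order.
new_list = df['label'].tolist()
# Alternatively, the position of each row's bin (0 for the lowest).
code_list = df['label'].cat.codes.tolist()

print(df)
print(new_list)
print(code_list)
```

No loop over the rows is needed here. The original loop fails because each i is a plain Python float, while between is a method of a pandas Series, not of a single number.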