I have a dataframe 'df'. I want to display all rows whose 'Percentage of 1/Yes' value is greater than the average of that column. My code only displays True/False for each row, but I want to see the actual values for the True rows and hide the False rows.
My code:
import pandas as pd
from sklearn.model_selection import train_test_split
trainData, validData = train_test_split(df, test_size=0.4, random_state=1)
# Response rate for RFM categories
# RFM: Combine R, F, M categories into one category
trainData['RFM'] = trainData['Mcode'].astype(str) + trainData['Rcode'].astype(str) + trainData['Fcode'].astype(str)
rfm_crosstab = pd.crosstab(index = [trainData['RFM']], columns = trainData['Florence'], margins = True)
rfm_crosstab['Percentage of 1/Yes'] = 100 * (rfm_crosstab[1] / rfm_crosstab['All'])
# Display rows with percentage greater than mean
rfm_crosstab['Percentage of 1/Yes'] > rfm_crosstab['Percentage of 1/Yes'].mean()
Output:
RFM
111 False
121 True
131 False
141 False
211 False
212 True
221 True
222 True
231 False
232 False
241 False
242 False
311 False
312 False
313 True
321 True
322 True
323 True
331 False
332 False
333 False
341 True
342 False
343 False
411 True
412 False
413 False
421 False
422 True
423 True
431 False
432 False
433 False
441 False
442 False
443 False
511 True
512 False
513 True
521 True
522 False
523 True
531 False
532 False
533 True
541 False
542 False
543 False
All False
Name: Percentage of 1/Yes, dtype: bool
Data: 'df'
Seq# ID# Gender M R F FirstPurch ChildBks YouthBks CookBks ... ItalCook ItalAtlas ItalArt Florence Related Purchase Mcode Rcode Fcode Yes_Florence No_Florence
0 1 25 1 297 14 2 22 0 1 1 ... 0 0 0 0 0 5 4 2 0 1
1 2 29 0 128 8 2 10 0 0 0 ... 0 0 0 0 0 4 3 2 0 1
2 3 46 1 138 22 7 56 2 1 2 ... 1 0 0 0 2 4 4 3 0 1
3 4 47 1 228 2 1 2 0 0 0 ... 0 0 0 0 0 5 1 1 0 1
4 5 51 1 257 10 1 10 0 0 0 ... 0 0 0 0 0 5 3 1 0 1
Answer:
You're almost there. The comparison you wrote returns a boolean (True/False) Series; use it as a mask to keep only the rows where it is True:
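# Boolean mask: True where the percentage is above the column mean
output = rfm_crosstab['Percentage of 1/Yes'] > rfm_crosstab['Percentage of 1/Yes'].mean()
# Select only the True rows of that column
rfm_crosstab.loc[output, 'Percentage of 1/Yes']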
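To see the masking step in isolation, here is a minimal sketch on a small made-up table. The `toy` dataframe and its values are hypothetical and only stand in for `trainData`; the column names follow the question. It also shows one optional variation: because `margins=True` adds an 'All' row, the mean in the question's code includes that row, and you may prefer to drop it before comparing.

```python
import pandas as pd

# Hypothetical toy data standing in for trainData (values are made up)
toy = pd.DataFrame({
    "RFM": ["111", "111", "121", "121", "131", "131", "131", "131"],
    "Florence": [0, 0, 1, 1, 0, 1, 0, 0],
})

# Same construction as in the question: counts per RFM code, plus 'All' margins
crosstab = pd.crosstab(index=toy["RFM"], columns=toy["Florence"], margins=True)
crosstab["Percentage of 1/Yes"] = 100 * crosstab[1] / crosstab["All"]

pct = crosstab["Percentage of 1/Yes"]

# Boolean indexing: keep only the entries where the comparison is True
above_mean = pct[pct > pct.mean()]
print(above_mean)

# Optional variation (an assumption about what you want): leave the 'All'
# margin row out of both the mean and the result
pct_no_margin = pct.drop("All")
above_mean_no_margin = pct_no_margin[pct_no_margin > pct_no_margin.mean()]
print(above_mean_no_margin)
```

If you want the full rows of the crosstab rather than the single column, the same mask works on the dataframe itself, for example `crosstab[pct > pct.mean()]`.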