I have created a data frame of 50 rows. I have labelled values > 0.5 as fraud and the rest as not fraud. I then put the rows labelled as not fraud into a separate object called iffraud.
num = runif(50)
class_df = data.frame(num)
print(class_df)
class_df$type = ifelse(class_df$num > 0.5, 'fraud',"not fraud")
print(class_df)
iffraud = class_df[class_df["type"] == "not fraud"]
How should I count how many values are stored in iffraud?
CodePudding user response:
This can be done with table(), like this:
set.seed(1)
num = runif(50)
class_df = data.frame(num)
print(class_df)
class_df$type = ifelse(class_df$num > 0.5, 'fraud',"not fraud")
print(class_df)
#> num type
#> 1 0.26550866 not fraud
#> 2 0.37212390 not fraud
#> 3 0.57285336 fraud
#> 4 0.90820779 fraud
#> 5 0.20168193 not fraud
#> 6 0.89838968 fraud
#> 7 0.94467527 fraud
#> 8 0.66079779 fraud
#> 9 0.62911404 fraud
#> 10 0.06178627 not fraud
#> 11 0.20597457 not fraud
#> 12 0.17655675 not fraud
#> 13 0.68702285 fraud
#> 14 0.38410372 not fraud
#> 15 0.76984142 fraud
#> 16 0.49769924 not fraud
#> 17 0.71761851 fraud
#> 18 0.99190609 fraud
#> 19 0.38003518 not fraud
#> 20 0.77744522 fraud
#> 21 0.93470523 fraud
#> 22 0.21214252 not fraud
#> 23 0.65167377 fraud
#> 24 0.12555510 not fraud
#> 25 0.26722067 not fraud
#> 26 0.38611409 not fraud
#> 27 0.01339033 not fraud
#> 28 0.38238796 not fraud
#> 29 0.86969085 fraud
#> 30 0.34034900 not fraud
#> 31 0.48208012 not fraud
#> 32 0.59956583 fraud
#> 33 0.49354131 not fraud
#> 34 0.18621760 not fraud
#> 35 0.82737332 fraud
#> 36 0.66846674 fraud
#> 37 0.79423986 fraud
#> 38 0.10794363 not fraud
#> 39 0.72371095 fraud
#> 40 0.41127443 not fraud
#> 41 0.82094629 fraud
#> 42 0.64706019 fraud
#> 43 0.78293276 fraud
#> 44 0.55303631 fraud
#> 45 0.52971958 fraud
#> 46 0.78935623 fraud
#> 47 0.02333120 not fraud
#> 48 0.47723007 not fraud
#> 49 0.73231374 fraud
#> 50 0.69273156 fraud
iffraud = class_df[class_df["type"] == "not fraud"]
a <- table(iffraud)
a
#> iffraud
#> 0.01339033 0.02333120 0.06178627 0.10794363 0.12555510 0.17655675 0.18621760
#> 1 1 1 1 1 1 1
#> 0.20168193 0.20597457 0.21214252 0.26550866 0.26722067 0.34034900 0.37212390
#> 1 1 1 1 1 1 1
#> 0.38003518 0.38238796 0.38410372 0.38611409 0.41127443 0.47723007 0.48208012
#> 1 1 1 1 1 1 1
#> 0.49354131 0.49769924 not fraud
#> 1 1 23
a[names(a)=="not fraud"]
#> not fraud
#> 23
Created on 2022-07-12 by the reprex package (v2.0.1)
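As a possible simplification (a sketch, not part of the original answer): since iffraud as built above holds both the numbers and the labels, it may be cleaner to tabulate the type column directly. The name counts is introduced here only for illustration.

```r
set.seed(1)
num <- runif(50)
class_df <- data.frame(num)
class_df$type <- ifelse(class_df$num > 0.5, "fraud", "not fraud")

# Tabulate the labels themselves: one count per distinct value of type.
counts <- table(class_df$type)
counts

# Pull out a single count by name; [[ ]] drops the name and returns a plain integer.
counts[["not fraud"]]
```

This gives the counts for both groups at once, so no intermediate iffraud object is needed just to count.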
CodePudding user response:
Using boolean operations only:
sum(class_df["type"] == "not fraud")
#> [1] 23
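A side note on the subsetting in the question: class_df[class_df["type"] == "not fraud"] has no comma, so it indexes the data frame with a logical matrix, and the table output above shows the result holding both the numbers and the labels. A minimal sketch, assuming the goal is to keep the matching rows as a data frame and count them:

```r
set.seed(1)
num <- runif(50)
class_df <- data.frame(num)
class_df$type <- ifelse(class_df$num > 0.5, "fraud", "not fraud")

# The comma after the condition selects rows and keeps every column.
iffraud <- class_df[class_df$type == "not fraud", ]

# Number of rows labelled "not fraud".
nrow(iffraud)
```

With this form, nrow(iffraud) and the sum() above should agree, because both count the rows where the condition is TRUE.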