I have the following table:
totalAdjustments | accuracy |
---|---|
1 | 1 |
2 | 2 |
4 | 5 |
1 | 3 |
And would like to group it by totalAdjustments into two groups:
group1: totalAdjustments == 1 (named: oneAdjustment)
group2: totalAdjustments >= 2 (named: twoOrMoreAdjustments)
To get the following table:
numberOfAdjustments | accuracy |
---|---|
oneAdjustment | 1 |
twoOrMoreAdjustments | 2 |
twoOrMoreAdjustments | 5 |
oneAdjustment | 3 |
I currently import my CSV with fread:
result <- fread("data.csv")
CodePudding user response:
base R
You can use ifelse for that:
ifelse(dat$totalAdjustments > 1, "twoOrMore", "one")
# [1] "one" "twoOrMore" "twoOrMore" "one"
dat$totalAdjustments <- ifelse(dat$totalAdjustments > 1, "twoOrMore", "one")
dat
# totalAdjustments accuracy
# 1 one 1
# 2 twoOrMore 2
# 3 twoOrMore 5
# 4 one 3
dplyr
library(dplyr)
dat %>%
  mutate(totalAdjustments = if_else(totalAdjustments > 1, "twoOrMore", "one"))
# totalAdjustments accuracy
# 1 one 1
# 2 twoOrMore 2
# 3 twoOrMore 5
# 4 one 3
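Since the question reads the file with fread, the result is already a data.table, so the same idea can be sketched there as well. This is only a sketch under a few assumptions: the four rows are typed in as a stand-in for data.csv, the column and group names are taken from the desired table in the question, and fifelse needs data.table 1.12.4 or later (plain ifelse works the same way on older versions).

```r
library(data.table)

# Stand-in for fread("data.csv"): the four rows from the question, as inline text.
result <- fread(text = c("totalAdjustments,accuracy",
                         "1,1",
                         "2,2",
                         "4,5",
                         "1,3"))

# Add the grouped column by reference, using the names from the question.
result[, numberOfAdjustments := fifelse(totalAdjustments >= 2,
                                        "twoOrMoreAdjustments",
                                        "oneAdjustment")]

# Keep only the two columns shown in the desired table.
result <- result[, .(numberOfAdjustments, accuracy)]
result
```

Dropping totalAdjustments is optional; keep it if you still need the raw counts.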
If this is expanded to include another group, perhaps greater than 3 --> "tooMany", then I would shift from a simple ifelse flow to cut:
dat %>%
  mutate(totalAdjustments = cut(totalAdjustments, c(0, 1, 3, Inf), c("one", "twoOrMore", "tooMany")))
# totalAdjustments accuracy
# 1 one 1
# 2 twoOrMore 2
# 3 tooMany 5
# 4 one 3
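To see where the bin edges fall, it can help to run cut without labels first. A small sketch, assuming a made-up input vector x that covers every value from 1 to 5 (not part of the original data):

```r
# Hypothetical input covering every value from 1 to 5.
x <- 1:5

# Without labels, cut() names each bin by its interval.
# Intervals are right-closed by default (right = TRUE).
bins <- cut(x, breaks = c(0, 1, 3, Inf))
levels(bins)
# [1] "(0,1]"   "(1,3]"   "(3,Inf]"

# The same breaks with the labels used above.
labelled <- cut(x, breaks = c(0, 1, 3, Inf),
                labels = c("one", "twoOrMore", "tooMany"))
data.frame(x, labelled)
```

With these breaks, 3 still lands in "twoOrMore" and "tooMany" starts at 4, so "twoOrMore" really covers only 2 and 3; adjust the breaks if you want the boundary elsewhere.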
Note that totalAdjustments is now of class factor instead of character. Often this makes no difference, but it can lead to unexpected results if you did not intend the column to be a factor; in that case, wrap the call in as.character, as in totalAdjustments = as.character(cut(...)).
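A short sketch of the kind of surprise a factor can cause, using the same four totalAdjustments values as the question (the variable name f is just for illustration):

```r
f <- cut(c(1, 2, 4, 1), c(0, 1, 3, Inf), c("one", "twoOrMore", "tooMany"))

class(f)
# [1] "factor"

# A factor stores integer codes; as.integer() returns the codes, not the labels.
as.integer(f)
# [1] 1 2 3 1

# Levels survive filtering: "tooMany" is still listed, with a count of 0.
table(f[f != "tooMany"])

# Converting to character leaves plain strings with no levels attached.
s <- as.character(f)
class(s)
# [1] "character"
```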