I have a data frame (BRAT2) where a column (oCC_HPE) with 58110 entries contains values ranging from 0 to 40, to 2 decimal places.
When I try to replace ranges of values with strings, all the values are replaced properly except for those from 5.02 to 9.99.
I'm not sure what is causing this. I've tried altering the order of the replacements and changing how many decimal places are in my replacement criteria, but to no avail.
BRAT2$oCC_HPE[BRAT2$oCC_HPE == 0.00] <- 'None'
BRAT2$oCC_HPE[BRAT2$oCC_HPE > 0.00 & BRAT2$oCC_HPE <= 1.00] <- 'Rare'
BRAT2$oCC_HPE[BRAT2$oCC_HPE > 1.00 & BRAT2$oCC_HPE <= 5.00] <- 'Occasional'
BRAT2$oCC_HPE[BRAT2$oCC_HPE > 5.00 & BRAT2$oCC_HPE <= 15.00] <- 'Frequent'
BRAT2$oCC_HPE[BRAT2$oCC_HPE > 15.00 & BRAT2$oCC_HPE <= 40.00] <- 'Pervasive'
CodePudding user response:
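Instead of making multiple assignments, use either cut or case_when. After the first assignment to the column, its type changes from numeric to character, so the later comparisons are string comparisons and return incorrect results (for example, "7.5" <= 15 is FALSE because "7.5" is compared with "15" character by character). With cut: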
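# Intervals are closed on the right by default: (-Inf, 0], (0, 1], (1, 5], (5, 15], (15, 40]
# Starting at -Inf puts 0 in the first interval; the result is a factor.
BRAT2$oCC_HPE <- cut(BRAT2$oCC_HPE, breaks = c(-Inf, 0, 1, 5, 15, 40),
                     labels = c("None", "Rare", "Occasional", "Frequent", "Pervasive"))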
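As a quick check of how cut treats the boundaries, here is a minimal sketch on a small made-up vector (x and res are hypothetical names, not part of the original data). It assumes breaks starting at -Inf so that 0 falls in the first interval, with the default right-closed intervals.

```r
# Hypothetical sample values covering each boundary of the ranges
x <- c(0, 0.5, 1, 1.01, 5, 5.02, 9.99, 15, 15.01, 40)

# Right-closed intervals: (-Inf, 0], (0, 1], (1, 5], (5, 15], (15, 40]
res <- cut(x, breaks = c(-Inf, 0, 1, 5, 15, 40),
           labels = c("None", "Rare", "Occasional", "Frequent", "Pervasive"))

# Show each value next to its label
print(data.frame(x, res))
```

Here 5 is labelled "Occasional" while 5.02 and 9.99 are labelled "Frequent", matching the ranges in the question.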
Or use case_when from dplyr:
library(dplyr)
BRAT2 <- BRAT2 %>%
  mutate(oCC_HPE = case_when(
    oCC_HPE == 0.00 ~ "None",
    oCC_HPE > 0.00 & oCC_HPE <= 1.00 ~ "Rare",
    oCC_HPE > 1.00 & oCC_HPE <= 5.00 ~ "Occasional",
    oCC_HPE > 5.00 & oCC_HPE <= 15.00 ~ "Frequent",
    oCC_HPE > 15.00 & oCC_HPE <= 40.00 ~ "Pervasive"
  ))
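To see why the step-by-step assignments in the question go wrong, here is a small sketch on a made-up vector (x, cmp_a and cmp_b are hypothetical names). It only illustrates the coercion, it is not a fix: once one element is replaced by a string, the whole vector becomes character, and comparing it with a number is done on strings.

```r
x <- c(0, 0.5, 3, 7.5, 12, 20)  # hypothetical sample values
x[x == 0] <- "None"             # assigning a string coerces x to character
print(class(x))                 # "character"

# The number is converted to a string and the comparison is character by character
cmp_a <- "7.5" <= 15            # compared as "7.5" <= "15"
cmp_b <- "12" <= 5              # compared as "12" <= "5"
print(cmp_a)                    # FALSE, so 7.5 is never labelled 'Frequent'
print(cmp_b)                    # TRUE, so 12 can match the 'Occasional' range
```

So values from 5.02 to 9.99 fail the <= 15.00 test, and it is worth checking whether values of 10 and above were silently given the wrong label as well.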