I'm struggling with the code below. My aim is to rewrite a price with a Mode only if only one price differs from the rest. I know that with n()==1 we can choose unique prices without duplicates, but how to add that only if there is one such price. Tried with count(), but got an error.
library("dplyr")
Mode <- function(x) {
ux <- unique(x)
ux[which.max(tabulate(match(x, ux)))]
}
pp <- pp %>%
group_by(PPCODE) %>%
mutate(PRICE = ifelse(count(n() == 1)==1, Mode(PRICE), PRICE), .keep="unused")
Output of dput:
structure(list(OUTLETID = c("11N", "12B", "17C",
"44Oo", "1NN", "5CC", "AA1", "11A"), PPCODE = c(4623,
4623, 4623, 4111, 4111, 4623, 4111, 4111),
PRICE = c(1.45, 1.45, 1.45, 5.11, 5.11, 1.42, 5.13, 4.5))),
row.names = c(NA, -4L), class = c("tbl_df", "tbl", "data.frame"))
So, in the first ppcode 4623 price should be changed as only one price differs (1.42 should be changed to 1.45), but with the second ppcode they should stay as they are (because 2 prices differ 4.5 and 5.13 from mode).
CodePudding user response:
You may count how many values differ from mode in each group using sum(PRICE != Mode(PRICE))
and apply the Mode
function if it is less than equal to 1.
library(dplyr)
pp %>%
group_by(PPCODE) %>%
mutate(PRICE = if(sum(PRICE != Mode(PRICE)) <= 1) Mode(PRICE) else PRICE) %>%
ungroup
# OUTLETID PPCODE PRICE
# <chr> <dbl> <dbl>
#1 11N 4623 1.45
#2 12B 4623 1.45
#3 17C 4623 1.45
#4 44Oo 4111 5.11
#5 1NN 4111 5.11
#6 5CC 4623 1.45
#7 AA1 4111 5.13
#8 11A 4111 4.5