Data transformation when only one digit differs with R

Time:10-20

I am new to R and need help with the following task. The table below shows a dummy example of the data. I am struggling to write a script that changes a price to match the others when, for a particular PPCODE, only one price differs and it differs by only one digit. In this example, 1.42 should be changed to 1.45. Likewise, if the value were 1.55 instead of 1.42, it should also be changed to 1.45. Thanks in advance for any suggestions.

[Image: table of example data]

CodePudding user response:

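Here is a base R way with ave. Because which.max(table(x)) returns a position in the table of sorted unique values rather than a position in x, the most frequent price is read from the table's names.

with(df1, ave(PRICE, PPCODE, FUN = \(x) as.numeric(names(which.max(table(x))))))
#[1] 1.45 1.45 1.45 1.45

Then assign the result back to PRICE.

df1$PRICE <- with(df1, ave(PRICE, PPCODE, FUN = \(x) as.numeric(names(which.max(table(x))))))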
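
To try the "most frequent price per PPCODE" idea without the original table, here is a minimal sketch. The data frame is rebuilt from the output printed later in this thread; which row held the odd price is an assumption, and `mode_value` is a hypothetical helper name rather than anything from the answer above.

```r
# Example data: IDs, CAT and PPCODE come from the output printed in this thread.
# Putting the odd price (1.42) in the third row is an assumption.
df1 <- data.frame(
  OUTLETID = c("8900NS2871", "8900NX2201", "8900NK2202", "8900NV1594"),
  CAT      = "AIR",
  PPCODE   = 46239679L,
  PRICE    = c(1.45, 1.45, 1.42, 1.45)
)

# Hypothetical helper: most frequent value of x (the first one wins on ties).
mode_value <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Replace every price by the most frequent price within its PPCODE.
fixed <- ave(df1$PRICE, df1$PPCODE, FUN = mode_value)
print(fixed)

# Same idea with the odd price in the first row and equal to 1.55.
fixed2 <- ave(c(1.55, 1.45, 1.45, 1.45), rep(46239679L, 4), FUN = mode_value)
print(fixed2)
```

Using unique() with tabulate(match()) keeps the values numeric throughout, so the result does not depend on where the odd price sits in the group.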

CodePudding user response:

If we need the Mode value, an option with dplyr is:

# Most frequent value of x (the first one wins on ties)
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

library(dplyr)
df1 <- df1 %>%
        # grp buckets prices by their value formatted to one decimal place
        group_by(PPCODE, grp = sprintf('%.1f', PRICE)) %>%
        mutate(PRICE = Mode(PRICE)) %>%
        ungroup() %>%
        select(-grp)

CodePudding user response:

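Here is a dplyr way:

library(dplyr)
df %>% 
  # count how often each price occurs within its PPCODE
  group_by(PPCODE, PRICE) %>% 
  mutate(helper = n()) %>% 
  group_by(PPCODE) %>% 
  # a price seen only once is replaced by the most frequent price of its PPCODE
  mutate(PRICE = ifelse(helper == 1, PRICE[which.max(helper)], PRICE), .keep = "unused") %>% 
  ungroup()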

output:

  OUTLETID   CAT     PPCODE PRICE
  <chr>      <chr>    <int> <dbl>
1 8900NS2871 AIR   46239679  1.45
2 8900NX2201 AIR   46239679  1.45
3 8900NK2202 AIR   46239679  1.45
4 8900NV1594 AIR   46239679  1.45
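
The question asks for a change only when a single price differs within a PPCODE. A rough base R sketch of that stricter rule follows. The helper name `fix_single_outlier`, the group codes and the second group's prices are all made up for illustration, and the sketch assumes there are no missing prices. It only counts how many prices differ from the most frequent one and does not check how many digits differ.

```r
# Hypothetical helper: harmonise x only when exactly one element differs
# from the most frequent value; otherwise return x unchanged.
fix_single_outlier <- function(x) {
  ux <- unique(x)
  m  <- ux[which.max(tabulate(match(x, ux)))]
  if (sum(x != m) == 1) rep(m, length(x)) else x
}

# Made-up data: group 1 has one odd price, group 2 has two distinct prices twice each.
prices <- c(1.45, 1.45, 1.42, 1.45, 2.10, 2.30, 2.10, 2.30)
codes  <- c(1, 1, 1, 1, 2, 2, 2, 2)

result <- ave(prices, codes, FUN = fix_single_outlier)
print(result)
```

In this sketch the first group is harmonised to 1.45, while the second group is left alone because two of its prices differ from the most frequent value.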