compare observation to group average-CodePudding

I have a dataframe with a column for region and a column for an economic indicator.

region	indicator
A	100
B	50
C	10
D	20
B	102
C	10

Say that A and B belong to one group, and C and D belong to another. I want to create a new column for the comparative performance that will quantify how well a region is doing compared to the group average.

so for example the average for group 1 is 84, the average for group 2 is 13.33. so the output should be

region	comparative_indicator
A	16
B	-34
C	-3.33
D	6.67
B	18
C	-3.33

I have no idea what to do here

test dataframe

region<- c('A','B','C','D','B','C')
indicator <- c(100,50,10,20,102,10)
df <- data.frame(region,indicator)

CodePudding user response：

You may divide the regions in different groups. For each group subtract the average indicator value of the group with it's value to get comparative_indicator.

library(dplyr)

df %>%
  mutate(groups = case_when(region %in% c('A', 'B') ~ 'group1', 
                            region %in% c('C', 'D') ~ 'group2')) %>%
  group_by(groups) %>%
  mutate(comparative_indicator = indicator - mean(indicator)) %>%
  ungroup

#  region indicator groups comparative_indicator
#  <chr>      <dbl> <chr>                  <dbl>
#1 A            100 group1                 16   
#2 B             50 group1                -34   
#3 C             10 group2                 -3.33
#4 D             20 group2                  6.67
#5 B            102 group1                 18   
#6 C             10 group2                 -3.33

You may remove/rename columns according to your preference in the output.