Home > database >  R groupby and divide column by a value in column
R groupby and divide column by a value in column

Time:10-22

I have a dataframe like so:

id  year  month  val
1   2020  1      50
1   2020  7      80
1   2021  1      40
1   2021  7      70
.
.

Now, I want to index all the values using Jan 2020 as index year for each id. Essentially group by id, then divide val with val at Jan 2020 * 100. So the final dataframe would look something like this:

id  year  month  val
1   2020  1      100
1   2020  7      160
1   2021  1      80
1   2021  7      140
.
.

This is what I tried till now:

df %>% group_by(id) %>% mutate(val = 100*val/[val at Jan 2020])

I can separately get val at Jan 2020 like so:

df %>% filter(year==2020, month==1) %>% select(val)

But it doesn't work together:

df %>% group_by(id) %>% mutate(val = 100*val/(df %>% filter(year==2020, month==1) %>% select(val)))

The above throws error

CodePudding user response:

A dplyr approach

library(dplyr)

df %>% 
  group_by(id) %>%
  mutate(val = val / val[year == 2020 & month == 1] * 100) %>%
  ungroup()
# A tibble: 4 × 4
     id  year month   val
  <int> <int> <int> <dbl>
1     1  2020     1   100
2     1  2020     7   160
3     1  2021     1    80
4     1  2021     7   140

CodePudding user response:

Base R

do.call(
  rbind,
  lapply(
    split(df,df$id),
    function(x){
      cbind(
        subset(x,select=-c(val)),
        "val"=x$val/x$val[x$year==2020 & x$month==1]*100
      )
    }
  )
)

    id year month val
1.1  1 2020     1 100
1.2  1 2020     7 160
1.3  1 2021     1  80
1.4  1 2021     7 140
  •  Tags:  
  • r
  • Related