I have a dataframe:
df <- data.frame(category = rep(c('Cat','Dog','Chicken'),2),
                 value = c(3,4,5,4,6,7),
                 time = c(rep(1,3),rep(2,3)))
category value time
Cat 3 1
Dog 4 1
Chicken 5 1
Cat 4 2
Dog 6 2
Chicken 7 2
How can I calculate the difference between a reference category (in this case 'Cat') and the other categories at each time point? The output I am looking for is:
category value time
Cat 0 1
Dog 1 1
Chicken 2 1
Cat 0 2
Dog 2 2
Chicken 3 2
CodePudding user response:
Assuming the reference value occurs exactly once within each time point, you can dplyr::group_by(time), then use logical indexing to get the value where category == "Cat" and subtract it from every value in that group:
library(dplyr)
df %>%
  group_by(time) %>%
  mutate(value = value - value[category == "Cat"]) %>%
  ungroup()
# A tibble: 6 × 3
category value time
<chr> <dbl> <dbl>
1 Cat 0 1
2 Dog 1 1
3 Chicken 2 1
4 Cat 0 2
5 Dog 2 2
6 Chicken 3 2
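If you would rather not depend on dplyr, the same idea can be sketched in base R. This is only one possible approach, not the method from the answer above: it builds a small lookup table of the 'Cat' rows and uses match() to align each row with the reference value for its time point. The names ref and result are made up for this illustration.

```r
df <- data.frame(category = rep(c("Cat", "Dog", "Chicken"), 2),
                 value = c(3, 4, 5, 4, 6, 7),
                 time = c(rep(1, 3), rep(2, 3)))

# Lookup table: one row per time point, holding the reference ('Cat') value
ref <- df[df$category == "Cat", c("time", "value")]

# For each row, find the reference row with the same time and subtract its value
result <- df
result$value <- df$value - ref$value[match(df$time, ref$time)]
result
```

With this version, a time point that has no 'Cat' row gets NA rather than an error, and if 'Cat' appears more than once in a time point, match() silently uses the first occurrence, so it is worth checking the data for duplicates first.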