Let's say we have a team variable, but we also have a time period 1 and a time period 2 variable, and a numeric grade 1-10. I want to mutate and add a variable that calculates the difference from time period 1 to time period 2.
How do I do this?
Visually the table looks like this: img
CodePudding user response:
There is a neat function in the data.table package called dcast( ) that allows you to transform your data from long to wide. In this case, you can use the Period variable to create 2 new columns, Period 1 and Period 2, where the values are the Grades.
library(data.table)
> data <- data.table(
Team = c("Team 1","Team 1","Team 2","Team 2","Team 3","Team 3"),
Period = c("Period 1","Period 2","Period 1","Period 2","Period 1","Period 2"),
Grade = c(75,87,42,35,10,95))
> data
Team Period Grade
1: Team 1 Period 1 75
2: Team 1 Period 2 87
3: Team 2 Period 1 42
4: Team 2 Period 2 35
5: Team 3 Period 1 10
6: Team 3 Period 2 95
> data2 <- dcast(
data = data,
Team ~ Period,
value.var = "Grade")
> data2
Team Period 1 Period 2
1: Team 1 75 87
2: Team 2 42 35
3: Team 3 10 95
> data2 <- data2[,Difference := `Period 2` - `Period 1`]
> data2
Team Period 1 Period 2 Difference
1: Team 1 75 87 12
2: Team 2 42 35 -7
3: Team 3 10 95 85
CodePudding user response:
In tidyverse syntax, we would use pivot_wider
and mutate
:
library(tidyverse)
df %>%
pivot_wider(names_from = `Time Period`, values_from = Grade) %>%
mutate(difference = P2 - P1)
#> # A tibble: 3 x 4
#> Team P1 P2 difference
#> <chr> <dbl> <dbl> <dbl>
#> 1 Team 1 75 87 12
#> 2 Team 2 42 35 -7
#> 3 Team 3 10 95 85
Created on 2022-08-29 with reprex v2.0.2
Data used
df <- data.frame(Team = paste("Team", rep(1:3, each = 2)),
`Time Period` = rep(c("P1", "P2"), 3),
Grade = c(75, 87, 42, 35, 10, 95),
check.names = FALSE)
df
#> Team Time Period Grade
#> 1 Team 1 P1 75
#> 2 Team 1 P2 87
#> 3 Team 2 P1 42
#> 4 Team 2 P2 35
#> 5 Team 3 P1 10
#> 6 Team 3 P2 95