I have a dataframe with three identifiers and multiple variables. As an example, say I have a variable that identifies a country, another that identifies a person, a third that identifies time, and two more variables var1 and var2. Like the data structure below:
set.seed(123)
df1 <- data.frame(country = LETTERS[1:2])
df2 <- data.frame(person = letters[1:3])
df3 <- data.frame(time = 1:3)
df <- merge(df1, df2)
df <- merge(df, df3)
df <- df %>% mutate(var1 = runif(18),
var2 = runif(18))
df <- df %>% arrange(country, person, time)
which is going to end up with a data frame like:
country person time var1 var2
1 A a 1 0.28757752 0.32792072
2 A a 2 0.52810549 0.65570580
3 A a 3 0.67757064 0.96302423
4 A b 1 0.40897692 0.88953932
5 A b 2 0.55143501 0.54406602
6 A b 3 0.10292468 0.69070528
7 A c 1 0.94046728 0.64050681
8 A c 2 0.95683335 0.28915974
9 A c 3 0.24608773 0.02461368
10 B a 1 0.78830514 0.95450365
11 B a 2 0.89241904 0.70853047
12 B a 3 0.57263340 0.90229905
13 B b 1 0.88301740 0.69280341
14 B b 2 0.45661474 0.59414202
15 B b 3 0.89982497 0.79546742
16 B c 1 0.04555650 0.99426978
17 B c 2 0.45333416 0.14711365
18 B c 3 0.04205953 0.47779597
I would like to transform this dataset so I have a column identifying each person for each var1 and var2. In the dataset, I have 3 persons, a,b, and c. So each row would correspond to a country x year, and I would end up with 6 columns: var1_a, var1_b, var1_c, var2_a, var2_b, var2_c. What is the best way to accomplish this?
CodePudding user response:
Here a solution:
library(tidyverse)
pivot_wider(df, id_cols = c(country, time), names_from=person, values_from=c(var1,var2))