I have a database in which I have several different emotion variables for two task conditions, for example:
ID Cond1_sad Cond1_anxious Cond1_happy Cond2_sad Cond2_anxious Cond2_happy
1 C_01 1 7 4 1 3 2
2 C_02 3 6 4 4 6 5
3 C_03 8 6 5 3 4 4
4 C_04 4 5 5 6 8 6
5 C_05 6 4 3 2 6 3
6 C_06 1 4 2 8 1 4
I need to stack according to condition, but keep emotion variables in separate columns. For example:
ID Condition Sad Anxious Happy
1 C_01 1 1 7 4
2 C_02 1 3 6 4
3 C_03 1 8 6 5
4 C_04 1 4 5 5
5 C_05 1 6 4 3
6 C_06 1 1 4 2
7 C_01 2 1 3 2
8 C_02 2 4 6 5
9 C_03 2 3 4 4
10 C_04 2 6 8 6
11 C_05 2 2 6 3
12 C_06 2 8 1 4
I have tried various options for stacking the data (e.g. melting them), but end up with all the variables stacked on top of one another. Is there a better solution?
Many thanks in advance.
CodePudding user response:
Here is a way to do it in the tidyverse. This should be a bit more flexible than specifying column positions by index.
library(dplyr)
library(tidyr)
library(janitor)
df %>%
pivot_longer(-ID, names_to = c("Condition", "Emotion"),
values_to = "Value", names_sep = "_", names_prefix = "Cond",
names_transform = list(Condition = as.integer)) %>%
pivot_wider(names_from = "Emotion", values_from = "Value") %>%
arrange(Condition, ID) %>%
clean_names("title")
# # A tibble: 12 x 5
# ID Condition Sad Anxious Happy
# <chr> <int> <int> <int> <int>
# 1 C_01 1 1 7 4
# 2 C_02 1 3 6 4
# 3 C_03 1 8 6 5
# 4 C_04 1 4 5 5
# 5 C_05 1 6 4 3
# 6 C_06 1 1 4 2
# 7 C_01 2 1 3 2
# 8 C_02 2 4 6 5
# 9 C_03 2 3 4 4
# 10 C_04 2 6 8 6
# 11 C_05 2 2 6 3
# 12 C_06 2 8 1 4
CodePudding user response:
Not sure if there are more conditions, in your given case this will work
library(tibble)
df <- data.frame(ID = c("C_01","C_02","C_03","C_04","C_05","C_06"),
Cond1_sad = c(1,3,8,4,6,1),
Cond1_anxious = c(7,6,6,5,4,4),
Cond1_happy = c(4,4,5,5,3,2),
Cond2_sad = c(1,4,3,6,2,8),
Cond2_anxious = c(3,6,4,8,6,1),
Cond2_happy = c(2,5,4,6,3,4))
df1 <- df[,c(1,2,3,4)]
df2 <- df[,c(1,5,6,7)]
df1 <- add_column(df1, Condition = 1, .after = 1)
df2 <- add_column(df2, Condition = 2, .after = 1)
colnames(df1) <- c("ID", "Condition", "Sad", "Anxious", "Happy")
colnames(df2) <- c("ID", "Condition", "Sad", "Anxious", "Happy")
df3 <- rbind(df1,df2)
df3
> df3
ID Condition Sad Anxious Happy
1 C_01 1 1 7 4
2 C_02 1 3 6 4
3 C_03 1 8 6 5
4 C_04 1 4 5 5
5 C_05 1 6 4 3
6 C_06 1 1 4 2
7 C_01 2 1 3 2
8 C_02 2 4 6 5
9 C_03 2 3 4 4
10 C_04 2 6 8 6
11 C_05 2 2 6 3
12 C_06 2 8 1 4
CodePudding user response:
This is an easy way to do it in base
. If you really want I am sure you can find a way to remove the indexes and the explicit v.names
.
reshape(df, direction="long", varying=list(c(2, 5), c(3, 6), c(4, 7)),
v.names=c("Sad", "Anxious", "Happy"), timevar = "Condition",
times = c(1, 2), idvar = F)