I am working with a within-subjects design looking at how participants rated 20 videos on a variety of variables (valence, arousal, etc.), and I am trying to convert my wide-format data frame into long format so that it looks like this:
ID | Video_Type | Valence | Arousal |
---|---|---|---|
123 | Comedy | 1 | 100 |
123 | Drama | 4 | 82 |
Currently the wide-format looks something like this:
ID | Comedy.valence | Comedy.arousal | Comedy.rating | Drama.valence | Drama.arousal |
---|---|---|---|---|---|
111 | 1 | 1 | 100 | 5 | 7 |
999 | 6 | 4 | 82 | 3 | 8 |
When I use the code below, all of the column names in the long dataset are correct, but the values aren't mapped to the right columns (e.g., the values for valence are placed under arousal, the values for arousal are placed under rating, etc.):
reshape(videoratings, direction = "long",
        varying = 1:23,
        timevar = "video",
        times = c("Comedy", "Drama", "Action"),
        v.names = c("valence", "arousal", "rating"),
        idvar = "ResponseId")
Does anyone know how to fix this?
CodePudding user response:
I created a sample dataset based on your example:
df <- data.frame(ID = c(111, 999),
                 Comedy.valence = c(1, 6), Comedy.arousal = c(1, 4), Comedy.rating = c(100, 82),
                 Drama.valence = c(5, 3), Drama.arousal = c(7, 8), Drama.rating = c(20, 80))
Then, I applied reshape like the following:
df2 <- reshape(df,
               varying = 2:7,
               direction = "long",
               idvar = "ID",
               timevar = "Video_Type",
               v.names = c("valence", "arousal", "rating"),
               times = c("Comedy", "Drama"))
row.names(df2) <- NULL
df2
And the output looks like this:
   ID Video_Type valence arousal rating
1 111     Comedy       1     100      1
2 999     Comedy       4      82      6
3 111      Drama       7      20      5
4 999      Drama       8      80      3
I then modified my code a little so that v.names and times are derived from the column names instead of being typed by hand:
df2 <- reshape(df,
               varying = 2:7,
               direction = "long",
               idvar = "ID",
               timevar = "Video_Type",
               v.names = unlist(unique(lapply(strsplit(names(df), split = "\\.")[2:7], '[[', 2))),
               times = unlist(unique(lapply(strsplit(names(df), split = "\\.")[2:7], '[[', 1))))
row.names(df2) <- NULL
df2
Depending on how many columns you have, you can adjust the column indices in varying and in the lapply calls.
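One caveat, as far as I can tell: deriving v.names and times from the column names gives exactly the same vectors as typing them by hand, so the values stay misaligned (in the output above, the valence column holds 1 and 4, which are the Comedy.arousal values). When varying is a plain vector and v.names is supplied, reshape appears to group the columns with split(), which orders the groups alphabetically (arousal, rating, valence) while v.names keeps its original order. A minimal sketch of one way around this is to pass varying as a named list so that each measure is matched to its own columns. It assumes every non-ID column is named Video.measure and that the videos appear in the same order for each measure; varying_list and the other helper names are only illustrative.

```r
# Sample data copied from the answer above.
df <- data.frame(ID = c(111, 999),
                 Comedy.valence = c(1, 6), Comedy.arousal = c(1, 4),
                 Comedy.rating = c(100, 82),
                 Drama.valence = c(5, 3), Drama.arousal = c(7, 8),
                 Drama.rating = c(20, 80))

wide_cols <- names(df)[-1]                           # every column except ID
parts     <- strsplit(wide_cols, ".", fixed = TRUE)  # split "Comedy.valence" at the dot
video     <- vapply(parts, `[[`, character(1), 1)    # "Comedy", "Comedy", ...
measure   <- vapply(parts, `[[`, character(1), 2)    # "valence", "arousal", ...

# One list element per measure, each holding that measure's wide columns.
# Assumption: within each element the columns follow the order of unique(video).
varying_list <- split(wide_cols, measure)

df_long <- reshape(df,
                   direction = "long",
                   idvar     = "ID",
                   timevar   = "Video_Type",
                   varying   = varying_list,
                   v.names   = names(varying_list),  # same order as the list
                   times     = unique(video))
row.names(df_long) <- NULL
df_long
```

Because v.names is taken from names(varying_list), the names and the column groups cannot drift apart; the measure columns simply come out in alphabetical order.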
CodePudding user response:
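You just need to switch the two parts of each column name so that the variable comes first (valence.Comedy instead of Comedy.valence). The order cannot be reversed using the split = parameter, so use the sub function to rename the columns first: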
reshape(setNames(df, sub("(\\w+)\\.(\\w+)", "\\2.\\1", names(df))), varying = -1, direction = "long", idvar = "ID")
            ID   time valence arousal rating
111.Comedy 111 Comedy       1       1    100
999.Comedy 999 Comedy       6       4     82
111.Drama  111  Drama       5       7     20
999.Drama  999  Drama       3       8     80
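To see why the renaming helps, it can be useful to look at what sub does to the column names on its own. When v.names is not supplied, reshape guesses the variable name from the part before the separator (sep = "." by default) and the time from the part after it, so the names need to read valence.Comedy rather than Comedy.valence. A minimal sketch, with the column names copied from the sample data above (wide_names and swapped are just illustrative names):

```r
wide_names <- c("ID", "Comedy.valence", "Comedy.arousal", "Comedy.rating",
                "Drama.valence", "Drama.arousal", "Drama.rating")

# Swap the parts on either side of the dot.
# "ID" contains no dot, so the pattern does not match and it is left unchanged.
swapped <- sub("(\\w+)\\.(\\w+)", "\\2.\\1", wide_names)
swapped
# "ID" "valence.Comedy" "arousal.Comedy" "rating.Comedy"
# "valence.Drama" "arousal.Drama" "rating.Drama"
```

With the names in this form, varying = -1 (every column except the first) is enough, and no v.names or times need to be spelled out.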