reshape function not mapping values correctly

Time:02-01

I am working with a within-subjects design looking at how participants rated 20 videos on several variables (valence, arousal, etc.). I am trying to turn my wide-format df into long format so it looks like this:

ID Video_Type Valence Arousal
123 Comedy 1 100
123 Drama 4 82

Currently the wide-format looks something like this:

ID Comedy.valence Comedy.arousal Comedy.rating Drama.valence Drama.arousal
111 1 1 100 5 7
999 6 4 82 3 8

When I use the code below, all of the column names in the long dataset are correct, but the values aren't mapped correctly (e.g., the values for valence are placed under arousal, the values for arousal are placed under rating, etc.):

reshape(videoratings, direction = "long", 
        varying=c(1:23), 
        timevar = "video",
        times = c("Comedy", "Drama", "Action"),
        v.names = c("valence", "arousal", "rating"),
        idvar = "ResponseId")

Does anyone know how to fix this?

CodePudding user response:

I created a sample dataset based on your example:

df <- data.frame(ID=c(111,999), Comedy.valence=c(1, 6), Comedy.arousal=c(1,4), Comedy.rating=c(100,82), Drama.valence=c(5,3), Drama.arousal=c(7,8), Drama.rating=c(20,80))

Then, I applied reshape like the following:

df2 <- reshape(df, varying = 2:7, direction = "long", idvar = "ID", timevar = "Video_Type", v.names = c("valence", "arousal", "rating"), times = c("Comedy", "Drama"))

row.names(df2) <- NULL

df2

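And the output looks like this. Note that the values are still misaligned relative to df (for ID 111, Comedy.rating = 100 appears under arousal), so this call reproduces the problem from the question: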

   ID Video_Type valence arousal rating
1 111     Comedy       1     100      1
2 999     Comedy       4      82      6
3 111      Drama       7      20      5
4 999      Drama       8      80      3

To avoid typing the names by hand, v.names and times can also be derived from the column names:

df2 <- reshape(df, 
        varying = 2:7, 
        direction = "long", 
        idvar = "ID", 
        timevar = "Video_Type", 
        v.names = unlist(unique(lapply(strsplit(names(df), split="\\.")[2:7], '[[', 2))), 
        times = unlist(unique(lapply(strsplit(names(df), split="\\.")[2:7], '[[', 1))))

row.names(df2) <- NULL

df2

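This builds the same v.names and times as the first call, so it gives the same result and only saves typing them out. Depending on how many columns you have, adjust the column index in both varying and the [2:7] subset inside the lapply calls.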
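
If the values end up under the wrong names, one way around it is to pass varying as a list, so each output column is paired explicitly with its source columns instead of being inferred from column order. This is only a sketch on the sample df from this answer; the names measures, videos, varying_list and long are mine, not from the question.

```r
# Sample data, same as the df defined earlier in this answer
df <- data.frame(ID = c(111, 999),
                 Comedy.valence = c(1, 6), Comedy.arousal = c(1, 4), Comedy.rating = c(100, 82),
                 Drama.valence = c(5, 3), Drama.arousal = c(7, 8), Drama.rating = c(20, 80))

measures <- c("valence", "arousal", "rating")  # output columns (illustrative name)
videos   <- c("Comedy", "Drama")               # values for the time variable (illustrative name)

# One list element per measure, holding that measure's columns in the order of `videos`,
# e.g. c("Comedy.valence", "Drama.valence") for "valence"
varying_list <- lapply(measures, function(m) paste(videos, m, sep = "."))

long <- reshape(df,
                direction = "long",
                varying   = varying_list,
                v.names   = measures,
                timevar   = "Video_Type",
                times     = videos,
                idvar     = "ID")
row.names(long) <- NULL
long
```

For the real data you would list all of your measures and video types and set idvar to your own ID column (ResponseId in the question).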

CodePudding user response:

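You just need to swap the two parts of each column name so that the variable comes first (valence.Comedy instead of Comedy.valence). The order cannot be reversed with reshape's split = parameter, so rename the columns with sub first: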

reshape(setNames(df, sub("(\\w+)\\.(\\w+)", "\\2.\\1", names(df))), -1, dir = 'long', idv = 'ID')

            ID   time valence arousal rating
111.Comedy 111 Comedy       1       1    100
999.Comedy 999 Comedy       6       4     82
111.Drama  111  Drama       5       7     20
999.Drama  999  Drama       3       8     80
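
To see what the renaming step does on its own, here is the same idea split into two steps with full argument names. It is a sketch on the sample df from the first answer; swapped and long are illustrative names, and timevar = "Video_Type" is my addition so the column matches the layout asked for in the question.

```r
# Sample data, same as the df from the first answer
df <- data.frame(ID = c(111, 999),
                 Comedy.valence = c(1, 6), Comedy.arousal = c(1, 4), Comedy.rating = c(100, 82),
                 Drama.valence = c(5, 3), Drama.arousal = c(7, 8), Drama.rating = c(20, 80))

# Swap "Video.measure" to "measure.Video"; "ID" contains no dot, so it is left unchanged
swapped <- sub("(\\w+)\\.(\\w+)", "\\2.\\1", names(df))
swapped

# With the measure first, reshape() can split each name at the "." (the default sep)
# and work out v.names and times by itself
long <- reshape(setNames(df, swapped),
                direction = "long",
                varying   = swapped[-1],   # every column except ID
                idvar     = "ID",
                timevar   = "Video_Type")
long
```

Because the names are parsed rather than matched by position, this does not depend on the order of the columns in the wide data.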