Home > Back-end >  Selectively Applying pivot_wider() Function
Selectively Applying pivot_wider() Function

Time:11-13

BACKGROUND: We asked each participant to identify multiple emotions and then collected data on each of those emotions, such that there is a column for the first emotion they identified, second, and so on, and then individual columns for each of the follow up questions for each of the emotions. In wide format, it looks something like this:

rows <- 1:4
cols <- c("PID", "Stage", "Emo1_", "Emo2_", 
          "Emo1_Intense", "Emo2_Intense", 
          "Emo1_Desc", "Emo2_Desc", "Keyword")
df  <- data.frame(matrix(NA, 
                         nrow = length(rows), 
                         ncol = length(cols), 
                         dimnames = list(rows, cols)))
df$PID <- c("A-001", "A-002", "A-003", "A-004")
df$Stage <- c("Beginning", "End", "Middle", "Middle")
df$Emo1_ <- c("Fear", "Sadness", "Happy", "Anger")
df$Emo2_ <- c("Content", "Depressed", "Lost", "Sad")
df$Emo1_Intense <- 5:8
df$Emo2_Intense <- 1:4
df$Emo1_Desc <- c("E", "F", "G", "H")
df$Emo2_Desc <- c("A", "B", "C", "D")
df$Keyword <- c("Bus", "Ceiling", "Chainsaw", "Floor")
#    PID     Stage   Emo1_     Emo2_ Emo1_Intense Emo2_Intense Emo1_Desc Emo2_Desc  Keyword
#1 A-001 Beginning    Fear   Content            5            1         E         A      Bus
#2 A-002       End Sadness Depressed            6            2         F         B  Ceiling
#3 A-003    Middle   Happy      Lost            7            3         G         C Chainsaw
#4 A-004    Middle   Anger       Sad            8            4         H         D    Floor

PROBLEM: I'm having a brain fart and cannot figure out how to convert this dataframe to the format below, wherein we have single columns that each capture: 1.) Which position an emotion was named, 2.) Which emotion was named, 3.) Each of the follow up questions for each emotion:

rows <- 1:8
cols <- c("PID", "Stage", "Number", "Emo", "Intense", "Desc", "Keyword")
df  <- data.frame(matrix(NA, 
                         nrow = length(rows), 
                         ncol = length(cols), 
                         dimnames = list(rows, cols)))
df$PID <- sort(rep(c("A-001", "A-002", "A-003", "A-004"), 2))
df$Stage <- sort(rep(c("Beginning", "Middle", "Middle", "End"), 2))
df$Number <- rep(1:2, 4)
df$Emo <- c("Fear", "Content", "Sadness", "Depressed", "Happy", "Lost", "Anger", "Sad")
df$Intense <- c(5,1,6,2,7,3,4,8)
df$Desc <- c("E", "A", "F", "B", "G", "C", "H", "D")
df$Keyword <- rep(c("Bus", "Ceiling", "Chainsaw", "Floor"),2)
#    PID     Stage Number       Emo Intense Desc  Keyword
#1 A-001 Beginning      1      Fear       5    E      Bus
#2 A-001 Beginning      2   Content       1    A  Ceiling
#3 A-002       End      1   Sadness       6    F Chainsaw
#4 A-002       End      2 Depressed       2    B    Floor
#5 A-003    Middle      1     Happy       7    G      Bus
#6 A-003    Middle      2      Lost       3    C  Ceiling
#7 A-004    Middle      1     Anger       4    H Chainsaw
#8 A-004    Middle      2       Sad       8    D    Floor

I could do this manually, but it's a much larger dataset than this and I know I've used pivot_longer a thousand times to do this simply; I'm just struggling hard to make it work right now. It's either too conservative or too liberal in its column consolidation and I'm having trouble finding the balance.

Naming conventions are arbitrary. If reformatting in another way might work better, please be my guest!

CodePudding user response:

How about this solution?

Needed to update the names of the columns that ended with "_" and some polishing of the number's column. I'm sure this could be accomplished in a single line.

#rename columns that end with _
torename<-grep("(Emo._)$", names(df))
names(df)[torename] <- paste0(names(df)[torename], "Emo")

answer<- pivot_longer(df, cols= starts_with("Emo"), names_to=c( "Number", ".value"),  
                      names_sep = "_", names_repair="unique") 

#clean-up the Number column
answer$Number <- gsub("Emo", "", answer$Number)

answer
# A tibble: 8 × 7
  PID   Stage     Keyword  Number Emo       Intense Desc 
  <chr> <chr>     <chr>    <chr>  <chr>       <int> <chr>
1 A-001 Beginning Bus      1      Fear            5 E    
2 A-001 Beginning Bus      2      Content         1 A    
3 A-002 End       Ceiling  1      Sadness         6 F    
4 A-002 End       Ceiling  2      Depressed       2 B    
5 A-003 Middle    Chainsaw 1      Happy           7 G    
6 A-003 Middle    Chainsaw 2      Lost            3 C    
7 A-004 Middle    Floor    1      Anger           8 H    
8 A-004 Middle    Floor    2      Sad             4 D 
  • Related