Home > other >  How to pivot multiple columns at the same time?
How to pivot multiple columns at the same time?

Time:04-02

I have a data frame with exam questions organized as you see below in input. I'm trying to organize it in a tidy way as displayed in output.

In input you can see student's ID, their answer to the specific item suffix = ".text", their score for that specific item suffix = ".score", and their total score.

Input

library(tibble)

input <- tribble(
     ~ID, ~i1.text, ~i1.score, ~i2.text, ~i2.score, ~total,
  "Mark",      "A",        0L,      "B",        1L,     1L,
  "Mary",      "C",        1L,      "D",        0L,     1L,
  "John",      "A",        0L,      "B",        1L,     1L,
  "Abby",      "C",        1L,      "B",        1L,     2L
  )

I need to pivot the data to look like the one below.

I'm quite certain that I can do that with pivot_longer() but I'm quite lost.

Expected Output

output <- tribble(
     ~ID, ~item, ~text, ~score, ~total,
  "Mark",  "i1",   "A",     0L,     1L,
  "Mark",  "i2",   "B",     1L,     1L,
  "Mary",  "i1",   "C",     1L,     1L,
  "Mary",  "i2",   "D",     0L,     1L,
  "John",  "i1",   "A",     0L,     1L,
  "John",  "i2",   "B",     1L,     1L,
  "Abby",  "i1",   "C",     1L,     2L,
  "Abby",  "i2",   "B",     2L,     2L
  )

output

# A tibble: 8 × 5
  ID    item  text  score total
  <chr> <chr> <chr> <int> <int>
1 Mark  i1    A         0     1
2 Mark  i2    B         1     1
3 Mary  i1    C         1     1
4 Mary  i2    D         0     1
5 John  i1    A         0     1
6 John  i2    B         1     1
7 Abby  i1    C         1     2
8 Abby  i2    B         2     2

CodePudding user response:

We can use pivot_longer with names_sep as . - the column 'item' return the prefix part of the column names before the . and the .value will return the values of the column with the suffix part of the column name after the .

library(tidyr)
pivot_longer(input, cols = contains("."), 
    names_to = c("item", ".value"), names_sep = "\\.")

-output

# A tibble: 8 × 5
  ID    total item  text  score
  <chr> <int> <chr> <chr> <int>
1 Mark      1 i1    A         0
2 Mark      1 i2    B         1
3 Mary      1 i1    C         1
4 Mary      1 i2    D         0
5 John      1 i1    A         0
6 John      1 i2    B         1
7 Abby      2 i1    C         1
8 Abby      2 i2    B         1

CodePudding user response:

Here is an similar but alternative approach using names_pattern:

Explanation: (.*)\\.(.*) is regex:

() captures a group

. captures any character

* is a quantifier means 0 or more

\\ Escape character

The regular expression means: any amount of character followed by a dot followed by any amount of character. This regex matches your column names:

library(dplyr)
library(tidyr)

input %>% 
  pivot_longer(
    cols = -c(ID, total), 
    names_to = c('item', '.value'),
    names_pattern = '(.*)\\.(.*)'
    )
  ID    total item  text  score
  <chr> <int> <chr> <chr> <int>
1 Mark      1 i1    A         0
2 Mark      1 i2    B         1
3 Mary      1 i1    C         1
4 Mary      1 i2    D         0
5 John      1 i1    A         0
6 John      1 i2    B         1
7 Abby      2 i1    C         1
8 Abby      2 i2    B         1
  • Related