Home > Blockchain >  pivot longer twelve columns at once
pivot longer twelve columns at once

Time:11-23

I have 12 columns with info for 12 tests, each test has three different scores (bl, ind, dup) and I have five test categories/types matches('fluen|acur|dlp|outc|nota'). I'm trying to pivot longer them into five columns 'fluen', dlp, 'acur' , 'ouct', 'nota' which will contain the test score/value for each test and a categorical column called type containing the tests names as factors (bl, ind, dup) avoiding hard-coding.

  • data:
head(my_data)
  ID blfluen indfluen dupfluen blacur indacur dupacur bldlp inddlp dupdlp blnota indnota dupnota
1 p1   77.58    97.37   102.10   5.33    5.05    5.15 69.33  52.42  60.20      7     6.0     6.0
2 p2   93.63    81.56    97.84   5.82    2.29    7.54 50.09  57.93  56.79      9     7.5     8.0
  • desired output:
ID      fluen dlp acur outc nota type

part1                            bl
part1                            ind
part1                            dup
part2                            bl
part2                            ind
part2                            dup

so on...
  • Question: how can I achieve the desired output with pivot_longer() ? I'm sure there's a way to do this at once avoiding hard-coding. I've seen similar questions, but the thing that caught me was the patterns names, I didn't know how to group then into a single regex, for ex... Thanks in advance!

  • reproducible data:

> dput(head(my_data))
structure(list(ID = c("p1", "p2", "p3", "p4", "p5", "p6"), blfluen = c(77.58, 
93.63, 96, 59.74, 107.83, 90), indfluen = c(97.37, 81.56, 103.94, 
82.1, 100.31, 90.47), dupfluen = c(102.1, 97.84, 113.04, 93.87, 
96.36, 94.13), blacur = c(5.33, 5.82, 4.46, 8.97, 6.76, 3.33), 
    indacur = c(5.05, 2.29, 2.43, 7.69, 8.41, 11.57), dupacur = c(5.15, 
    7.54, 1.53, 7.21, 3.77, 7.69), bldlp = c(69.33, 50.09, 55.71, 
    59.74, 60.6, 65.55), inddlp = c(52.42, 57.93, 55.77, 61.02, 
    54.57, 66.31), dupdlp = c(60.2, 56.79, 46.15, 63.29, 65.28, 
    56.26), blnota = c(7, 9, 8, 4, 9, 7), indnota = c(6, 7.5, 
    8, 5, 9, 7), dupnota = c(6, 8, 9, 6, 8.5, 7.5)), row.names = c(NA, 
6L), class = "data.frame")

CodePudding user response:

Using the names_pattern argument and the special .value you could do:

library(tidyr)

my_data |> 
  pivot_longer(-ID, names_to = c("type", ".value"), names_pattern = "^(bl|ind|dup)(.*)$")
#> # A tibble: 18 × 6
#>    ID    type fluen  acur   dlp  nota
#>    <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#>  1 p1    bl     77.6  5.33  69.3   7  
#>  2 p1    ind    97.4  5.05  52.4   6  
#>  3 p1    dup   102.   5.15  60.2   6  
#>  4 p2    bl     93.6  5.82  50.1   9  
#>  5 p2    ind    81.6  2.29  57.9   7.5
#>  6 p2    dup    97.8  7.54  56.8   8  
#>  7 p3    bl     96    4.46  55.7   8  
#>  8 p3    ind   104.   2.43  55.8   8  
#>  9 p3    dup   113.   1.53  46.2   9  
#> 10 p4    bl     59.7  8.97  59.7   4  
#> 11 p4    ind    82.1  7.69  61.0   5  
#> 12 p4    dup    93.9  7.21  63.3   6  
#> 13 p5    bl    108.   6.76  60.6   9  
#> 14 p5    ind   100.   8.41  54.6   9  
#> 15 p5    dup    96.4  3.77  65.3   8.5
#> 16 p6    bl     90    3.33  65.6   7  
#> 17 p6    ind    90.5 11.6   66.3   7  
#> 18 p6    dup    94.1  7.69  56.3   7.5
  • Related