How do I use pivot_longer with names_pattern

Time:07-28

I want to use pivot_longer on the tibble below.

I want to end up with a tibble that has 3 columns:

  1. Column 1 - name - alpha - contains a and b
  2. Column 2 - name - beta - contains X and Y
  3. Column 3 - name - P - contains values
library(tidyverse)

tbl <- tibble(b_X_P = runif(10),
              b_Y_P = runif(10),
              a_X_P = runif(10),
              a_Y_P = runif(10))

CodePudding user response:

Here, we can just use names_sep, since _ is the delimiter at which the column names should be split. Also, specify names_to as a vector of 3 elements: alpha and beta (the new column names for the two prefix parts of each original column name) and .value (which tells pivot_longer to use the last part, P, as the name of the column holding the values).

library(tidyr)
pivot_longer(tbl, cols = everything(),
    names_to = c("alpha", "beta", ".value"), names_sep = "_")

-output

# A tibble: 40 × 3
   alpha beta        P
   <chr> <chr>   <dbl>
 1 b     X     0.271  
 2 b     Y     0.461  
 3 a     X     0.546  
 4 a     Y     0.344  
 5 b     X     0.234  
 6 b     Y     0.00462
 7 a     X     0.0157 
 8 a     Y     0.384  
 9 b     X     0.309  
10 b     Y     0.628  
# … with 30 more rows

If we need names_pattern, it should be a regex in which each part to extract is wrapped in () as a capture group, one group per element of names_to.

pivot_longer(tbl, cols = everything(),
    names_to = c("alpha", "beta", ".value"),
    names_pattern = "^([^_]+)_([^_]+)_(.*)")
# A tibble: 40 × 3
   alpha beta        P
   <chr> <chr>   <dbl>
 1 b     X     0.271  
 2 b     Y     0.461  
 3 a     X     0.546  
 4 a     Y     0.344  
 5 b     X     0.234  
 6 b     Y     0.00462
 7 a     X     0.0157 
 8 a     Y     0.384  
 9 b     X     0.309  
10 b     Y     0.628  
# … with 30 more rows
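
To see what the three capture groups pick up, here is a minimal base R sketch that applies the same pattern to the column names from the question. It only illustrates the regex and is not how pivot_longer is implemented internally; the object names (cols, pattern, parts) are made up for this example.

```r
# Column names from the question's tibble
cols <- c("b_X_P", "b_Y_P", "a_X_P", "a_Y_P")

# Same pattern as above: three capture groups separated by underscores
pattern <- "^([^_]+)_([^_]+)_(.*)"

# regexec() returns the full match plus one entry per capture group;
# rbind the per-name results into a character matrix
parts <- do.call(rbind, regmatches(cols, regexec(pattern, cols)))
colnames(parts) <- c("full", "alpha", "beta", ".value")
parts
```

Each row should show the first group landing in alpha, the second in beta, and the third (P) in the position that names_to marks as .value.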

CodePudding user response:

We could use the regex (\\w)_(\\w)_(\\w), where \\w stands for a "word character", usually [A-Za-z0-9_]. Each \\w matches exactly one character, which is enough here because every part of the column names is a single character. The main thing we then have to do (and I just checked after seeing @akrun's answer) is to define 3 places in names_to, e.g. a, b, and .value:

library(dplyr)
library(tidyr)

tbl %>%
  pivot_longer(
    cols = everything(),
    names_pattern = "(\\w)_(\\w)_(\\w)",
    names_to = c("a", "b", ".value")
  )
   a     b          P
   <chr> <chr>  <dbl>
 1 b     X     0.914 
 2 b     Y     0.603 
 3 a     X     0.0331
 4 a     Y     0.740 
 5 b     X     0.257 
 6 b     Y     0.819 
 7 a     X     0.963 
 8 a     Y     0.0964
 9 b     X     0.393 
10 b     Y     0.0656
# ... with 30 more rows
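
One thing to watch with this pattern: because each \\w matches a single character and the pattern is not anchored, a longer prefix would only be partially captured. A small base R sketch of that, assuming a hypothetical column name "ab_X_P" that is not in the question's data:

```r
# "b_X_P" is from the question; "ab_X_P" is a hypothetical longer name
cols <- c("b_X_P", "ab_X_P")

# Same single-character pattern as in the answer above
one_char <- "(\\w)_(\\w)_(\\w)"

# Column 1 is the full match, columns 2-4 are the capture groups
m <- do.call(rbind, regmatches(cols, regexec(one_char, cols, perl = TRUE)))
m[, 2]  # first capture group for each name
```

For both names the first group comes out as "b", so if the real column names had multi-character parts, a pattern like the anchored [^_]+ one from the first answer would probably be the safer choice.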