How do I learn the grouping function with pivot_longer and names_pattern

Time:10-14

I am trying to use the names_pattern argument in pivot_longer, and I am not sure I understand how its capture groups work. I need to pivot the following data frame to match the desired output below.

df<-structure(list(Weighted_Ideology =0.514, Weighted_Ideology_se = 0.00, Unweighted_Ideology = 0.51, Unweighted_Ideology_se = 0.004), row.names = c(NA, -1L), class = "data.frame")

library(tidyr)
df%>%
pivot_longer(., cols=everything(), names_to=c('Variable', ".value"), names_pattern="([a-z]+_[a-z]+)_(.*)")

Desired Output

df2<-data.frame(
  Variable=c('Weighted', "Unweighted"),
  Ideology=c(0.54, 0.51),
  se=c(0.005, 0.004)
)

Answer:

Change names_pattern so that the first group captures one or more characters that are not a _ from the start (^) of the column name, followed by a literal _, and the second group captures the rest of the characters ((.*)).

library(dplyr)
library(tidyr)
df %>%
    pivot_longer(cols = everything(), names_to = c("Variable", ".value"), 
       names_pattern = "^([^_]+)_(.*)") %>%
    rename(se = Ideology_se)

-output

# A tibble: 2 × 3
  Variable   Ideology    se
  <chr>         <dbl> <dbl>
1 Weighted      0.514 0    
2 Unweighted    0.51  0.004

The [a-z]+ in the original pattern matches only lower-case characters, whereas the column names start with an upper-case character (for example, Weighted_Ideology), so the pattern fails to match. When a name contains several _ and we want to split at one particular _, it is usually better to match characters other than the _ ([^_]+), as in the solution above.
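
To see what each group captures without running pivot_longer, the two patterns can be tried directly on the column names. This is only a sketch in base R: the variable names (cols, old_pattern, new_pattern, groups) are made up for illustration, and regexec() stands in for whatever tidyr does internally with names_pattern.

```r
# Column names from the question's data frame
cols <- c("Weighted_Ideology", "Weighted_Ideology_se",
          "Unweighted_Ideology", "Unweighted_Ideology_se")

old_pattern <- "([a-z]+_[a-z]+)_(.*)"  # pattern from the question
new_pattern <- "^([^_]+)_(.*)"         # pattern from the answer

# regexec() + regmatches() return, for each name, the full match followed by
# the capture groups, or character(0) when the pattern does not match
old_hits <- regmatches(cols, regexec(old_pattern, cols))
new_hits <- regmatches(cols, regexec(new_pattern, cols))

# The old pattern matches none of the names: the upper-case "W", "U" and "I"
# are not covered by [a-z]
print(lengths(old_hits))

# Drop the full match and keep the two groups, one row per column name
groups <- t(sapply(new_hits, function(m) m[-1]))
colnames(groups) <- c("Variable", ".value")
print(groups)
```

The first group yields Weighted or Unweighted, and the second yields Ideology or Ideology_se, which is why the answer needs the extra rename(se = Ideology_se) step.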
