I am trying to work with the names_pattern argument in pivot_longer, and I am not sure I understand how the grouping works. I need to pivot the following data frame to match the desired output below.
df<-structure(list(Weighted_Ideology =0.514, Weighted_Ideology_se = 0.00, Unweighted_Ideology = 0.51, Unweighted_Ideology_se = 0.004), row.names = c(NA, -1L), class = "data.frame")
library(tidyr)
df%>%
pivot_longer(., cols=everything(), names_to=c('Variable', ".value"), names_pattern="([a-z]+_[a-z]+)_(.*)")
Desired Output
df2<-data.frame(
Variable=c('Weighted', "Unweighted"),
Ideology=c(0.54, 0.51),
se=c(0.005, 0.004)
)
CodePudding user response:
Change the names_pattern to capture the characters that are not a _ from the start (^) of the column name as the first group, followed by the _, and then capture the rest of the characters ((.*)) as the second group.
library(dplyr)
library(tidyr)
df %>%
pivot_longer(cols = everything(), names_to = c("Variable", ".value"),
names_pattern = "^([^_]+)_(.*)") %>%
rename(se = Ideology_se)
-output
# A tibble: 2 × 3
  Variable   Ideology    se
  <chr>         <dbl> <dbl>
1 Weighted      0.514 0
2 Unweighted    0.51  0.004
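To see what each capture group contributes to names_to, the pattern can be tried on the column names directly. This is only a rough sketch: base R's regexec() and regmatches() stand in for whatever matching pivot_longer does internally, and the names cols, pattern, parts and groups are made up for illustration.

```r
# Column names from the question's data frame
cols <- c("Weighted_Ideology", "Weighted_Ideology_se",
          "Unweighted_Ideology", "Unweighted_Ideology_se")

# Group 1: everything before the first "_"; group 2: everything after it
pattern <- "^([^_]+)_(.*)"

# regexec() returns the full match followed by one entry per capture group
parts <- regmatches(cols, regexec(pattern, cols))

# Drop the full match and keep the two groups, one row per column name
groups <- t(sapply(parts, function(p) p[-1]))
colnames(groups) <- c("Variable", ".value")
print(groups)
```

The second group comes out as Ideology for two of the columns and Ideology_se for the other two, which is why the rename() step is still needed to get a column called se.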
The [a-z] class matches only lowercase characters, whereas the parts of the column names start with an uppercase character (the W and the I in Weighted_Ideology). When there are multiple _ and we want to split at a particular _, it may be better to match characters other than the _ ([^_]), as in the solution above.
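The difference between the two character classes can be checked quickly with grepl(). This is a small sketch, assuming the original attempt used + quantifiers inside its groups; the variable names cols, lower_only and negated are hypothetical.

```r
cols <- c("Weighted_Ideology", "Weighted_Ideology_se")

# Lowercase-only classes: the capital "I" right after the first "_"
# prevents [a-z]+ from matching there, so neither name matches
lower_only <- grepl("([a-z]+_[a-z]+)_(.*)", cols)

# Negated class: any run of characters up to the first "_" is accepted,
# regardless of case
negated <- grepl("^([^_]+)_(.*)", cols)

print(lower_only)
print(negated)
```

Widening the class to [A-Za-z] would also handle the capital letters, but [^_] has the advantage of not depending on which characters the names happen to contain.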