Home > Blockchain >  multiple long-formating with pivot_longer() in R
multiple long-formating with pivot_longer() in R

Time:11-15

I was wondering how to achieve the output I'm currently getting from gather() by instead using the pivot_longer()?

Reproducible data and code are below.

m="
id count2020 count2021 ratio2020 ratio2021
A  1         2         .1        .2
A  3         4         .3        .4
B  5         6         .5        .6
B  7         8         .7        .8
"
d1  <- read.table(text = m, h=T)

# My old school solution:
d1 %>% 
gather(id, ratio2020:ratio2021,key = "year") %>%
  gather(id, count2020:count2021, key = "year", value = "ratio")

Desired output:

  id    count year   ratio
1 A     1     2020   0.1
2 A     2     2021   0.2
3 A     3     2020   0.3
4 A     4     2021   0.4
5 B     5     2020   0.5
6 B     6     2021   0.6
7 B     7     2020   0.7
8 B     8     2021   0.8

CodePudding user response:

We could do it this way using names_sep with a regex "(?<=[A-Za-z])(?=[0-9])" that does nothing else but separate the string by the first occurence of digit after a letter:

library(dplyr)
library(tidyr)

df %>% 
  pivot_longer(
    cols = -id,
    names_to = c(".value", "year"), 
    names_sep = "(?<=[A-Za-z])(?=[0-9])") 

And one solution without regex:

library(dplyr)
library(tidyr)
library(stringr)

df %>% 
  rename_with(., ~str_replace(., '2', '_2')) %>%  
  pivot_longer(
    cols = -id,
    names_to = c(".value", "year"), 
    names_sep = "_")
  id    year  count ratio
  <chr> <chr> <int> <dbl>
1 A     2020      1   0.1
2 A     2021      2   0.2
3 A     2020      3   0.3
4 A     2021      4   0.4
5 B     2020      5   0.5
6 B     2021      6   0.6
7 B     2020      7   0.7
8 B     2021      8   0.8
  • Related