How to pivot wider a Tibble when some rows are a subset of each other?

Time:11-05

I have a tibble like df1 and I want to obtain df2. I've tried mutate() and pivot_wider(), but I couldn't work out how to express the transformation.

library(tidyverse)

df1 <- tibble(
  v1 = c("name1","name1.1","name1.2","name1.3","name2","name2.1","name2.2"),
  v2 = c(9,2,3,4,13,6,7)
)

df2 <- tibble(
  v1 = c(rep("name1",3),rep("name2",2)),
  v2 = c("name1.1","name1.2","name1.3","name2.1","name2.2"),
  v3 = c(2,3,4,6,7)
)

Notice that the value for name1 is the sum of the values for name1.i, for i = 1, 2, 3. The same holds for name2.
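That parent/child relationship can be checked directly. The following is a minimal sketch, assuming the df1 defined above; the helper columns `parent`, `is_total`, `total` and `children_sum` are names made up for illustration.

```r
library(dplyr)
library(tibble)

df1 <- tibble(
  v1 = c("name1","name1.1","name1.2","name1.3","name2","name2.1","name2.2"),
  v2 = c(9,2,3,4,13,6,7)
)

check <- df1 %>%
  mutate(
    parent   = sub("\\..*", "", v1),  # drop everything from the first dot onward
    is_total = v1 == parent           # TRUE for the rows without a dotted suffix
  ) %>%
  group_by(parent) %>%
  summarise(
    total        = sum(v2[is_total]),   # value stored on the parent row
    children_sum = sum(v2[!is_total])   # sum over the name.i rows
  )

check
```

For this data, `total` and `children_sum` agree in every group, which is why the parent rows can be dropped without losing information.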

CodePudding user response:

No pivoting required:

df1 %>%
  mutate(parent = sub("\\..*", "", v1)) %>%
  filter(v1 != parent)
# # A tibble: 5 × 3
#   v1         v2 parent
#   <chr>   <dbl> <chr> 
# 1 name1.1     2 name1 
# 2 name1.2     3 name1 
# 3 name1.3     4 name1 
# 4 name2.1     6 name2 
# 5 name2.2     7 name2 

(The column names and order differ from df2; rename as needed.)
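If the exact layout of df2 matters, the columns can be renamed and reordered at the end. This is only a sketch of one way to do it, assuming the df1 from the question; the intermediate names `child`, `value` and `parent` are illustrative and not part of the original answer.

```r
library(dplyr)
library(tibble)

df1 <- tibble(
  v1 = c("name1","name1.1","name1.2","name1.3","name2","name2.1","name2.2"),
  v2 = c(9,2,3,4,13,6,7)
)

result <- df1 %>%
  filter(grepl(".", v1, fixed = TRUE)) %>%       # keep only rows with a dotted suffix
  rename(child = v1, value = v2) %>%             # free up the names v1 and v2
  mutate(parent = sub("\\..*", "", child)) %>%   # text before the first dot
  select(v1 = parent, v2 = child, v3 = value)    # df2's column names and order

result
```

Renaming to temporary names first avoids any clash between the old and new meanings of v1 and v2.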

CodePudding user response:

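You can also use filter() with str_detect() and a regular expression. Here \\w matches a single word character (a letter, digit, or underscore), \\. matches a literal dot, and \\d matches a digit, so the pattern keeps only the rows that have a dotted suffix. In str_extract(), \\w+ matches the leading run of word characters, which stops at the dot.

library(dplyr)
library(stringr)

df1 |>
  filter(str_detect(v1, "\\w\\.\\d")) |>
  mutate(v3 = str_extract(v1, "\\w+"))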

# A tibble: 5 × 3
#  v1         v2   v3
#  <chr>   <dbl> <chr> 
#1 name1.1     2 name1 
#2 name1.2     3 name1 
#3 name1.3     4 name1 
#4 name2.1     6 name2 
#5 name2.2     7 name2 
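To see what the two patterns do on their own, they can be tried on a couple of sample strings. A small sketch, assuming only stringr; the vector `x` is made up for illustration.

```r
library(stringr)

x <- c("name1", "name1.1", "name2.2")

# TRUE only where a word character is followed by a dot and a digit
has_suffix <- str_detect(x, "\\w\\.\\d")

# first run of word characters; the dot is not a word character, so the match stops there
prefix <- str_extract(x, "\\w+")

has_suffix
prefix
```

Here `has_suffix` is FALSE for "name1" and TRUE for the other two, and `prefix` is "name1", "name1", "name2".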