I have a dataframe like this:
a <- c("a","b", "c", "d")
b <- c(7, 5, 4, 3)
c <- c("ABc","D", "EF", "BCEF")
m <- data.frame(a, b, c)
I want to split each row into several rows, one for each letter in the last column. So, I want a final dataset like this:
a1 <- c("a","a","a", "b", "c", "c", "d", "d", "d", "d")
b1 <- c(7, 7, 7,5, 4, 4, 3, 3, 3, 3)
c1 <- c("A","B", "C", "D", "E", "F", "B", "C", "E", "F")
m1 <- data.frame(a1, b1, c1)
How can I do this?
CodePudding user response:
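We can use separate_rows from tidyr, with a lookaround regex as the separator so that the split happens at the zero-width position between every pair of adjacent characters: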
library(tidyr)
separate_rows(m, c, sep = "(?<=.)(?=.)")
-output
# A tibble: 10 × 3
   a         b c
   <chr> <dbl> <chr>
 1 a         7 A
 2 a         7 B
 3 a         7 c
 4 b         5 D
 5 c         4 E
 6 c         4 F
 7 d         3 B
 8 d         3 C
 9 d         3 E
10 d         3 F
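To see why that separator works, it can help to look at where the pattern matches. The sketch below uses base R's gregexpr on one of the example strings; the variable names x, hits and pos are only illustrative. The lookbehind (?<=.) requires a character before the position and the lookahead (?=.) requires one after it, so the matches are zero-width and sit between adjacent characters, never at the start or end of the string.

```r
# One of the strings from column c in the question
x <- "BCEF"

# Find every zero-width position that has a character on both sides.
# perl = TRUE is needed because lookarounds are PCRE syntax.
hits <- gregexpr("(?<=.)(?=.)", x, perl = TRUE)[[1]]

# Start positions of the matches and their lengths (all zero-width)
pos <- as.integer(hits)
len <- attr(hits, "match.length")

print(pos)
print(len)
```

Because the matches have length zero, nothing is consumed when the string is split there, which is what lets every letter end up in its own row.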
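Or in base R, split each string into single characters, repeat every row once per character, and replace column c with the unlisted characters: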
lst1 <- strsplit(m$c, "")
m1 <- transform(m[rep(seq_len(nrow(m)), lengths(lst1)),], c = unlist(lst1))
row.names(m1) <- NULL
-output
> m1
   a b c
1  a 7 A
2  a 7 B
3  a 7 c
4  b 5 D
5  c 4 E
6  c 4 F
7  d 3 B
8  d 3 C
9  d 3 E
10 d 3 F
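Note that the input contains a lower-case "c" in "ABc", while the desired output in the question has an upper-case "C". If the upper-case form is what is wanted (an assumption, since the question does not say so explicitly), a minimal base R sketch is to upper-case the strings before splitting. The name chars is only illustrative, and stringsAsFactors = FALSE is set so that this also runs on R versions before 4.0, where data.frame would otherwise create factors.

```r
# Rebuild the example data from the question
m <- data.frame(a = c("a", "b", "c", "d"),
                b = c(7, 5, 4, 3),
                c = c("ABc", "D", "EF", "BCEF"),
                stringsAsFactors = FALSE)

# Upper-case first, then split every string into single characters
chars <- strsplit(toupper(m$c), "")

# Repeat each row once per character, then overwrite column c
m1 <- m[rep(seq_len(nrow(m)), lengths(chars)), ]
m1$c <- unlist(chars)
row.names(m1) <- NULL

print(m1)
```

If the original case should be preserved instead, drop the toupper() call and the result matches the output shown above.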