I currently have a dataframe (df1) formatted as shown below:
ID | A1 | A1.1 | A1.2 | A1.3 | B1 | B1.1 | B1.2 | B1.3 | C1 | C1.1 | C1.2 | C1.3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 10 | 1 | 1 | 0 | 15 | 0 | 1 | 0 | 30 | 1 | 0 | 0 |
2 | 25 | 1 | 0 | 0 | 30 | 0 | 1 | 1 | 25 | 1 | 0 | 1 |
3 | 40 | 0 | 1 | 0 | 15 | 0 | 1 | 0 | 10 | 0 | 0 | 1 |
4 | 25 | 1 | 1 | 0 | 10 | 0 | 1 | 1 | 30 | 1 | 0 | 0 |
I would like to reformat it so that it is arranged as shown here in df2:
ID | A1_C1_value | R1 | R2 | R3 | A_C |
---|---|---|---|---|---|
1 | 10 | 1 | 1 | 0 | A |
1 | 15 | 0 | 1 | 0 | B |
1 | 30 | 1 | 0 | 0 | C |
2 | 25 | 1 | 0 | 0 | A |
2 | 30 | 0 | 1 | 1 | B |
2 | 25 | 1 | 0 | 1 | C |
3 | 40 | 0 | 1 | 0 | A |
3 | 15 | 0 | 1 | 0 | B |
3 | 10 | 0 | 0 | 1 | C |
4 | 25 | 1 | 1 | 0 | A |
4 | 10 | 0 | 1 | 1 | B |
4 | 30 | 1 | 0 | 0 | C |
I've been able to pivot longer to get the A1_C1 values and A_C column, however am leftover with the A1.2-1.3, B1.2-1.3, C1.2-1.3. How would I include these in my transformation? This is what I've done so far:
df2 <- pivot_longer(df1, cols = c("A1","B1","C1"), names_to ="A_C",
values_to = "A1_C1_value")
CodePudding user response:
You can use pivot_longer()
but it is easier if you rename the variables first as below:
x <- data.frame(
ID = 1:4,
A1 = c(10,25,40,25),
A1.1=c(1,1,0,1),
A1.2=c(1,0,1,1),
A1.3=c(0,0,0,0),
B1 = c(15,30,15,10),
B1.1=c(0,0,0,0),
B1.2=c(1,1,1,1),
B1.3=c(0,1,0,1),
C1 = c(30,25,10,30),
C1.1=c(1,1,0,1),
C1.2=c(0,0,0,0),
C1.3=c(0,1,1,0)
)
x %>%
rename("A1.0" = "A1",
"B1.0" = "B1",
"C1.0" = "C1") %>%
pivot_longer(`A1.0`:`C1.3`,
names_pattern=c("([A-C])\\d.(\\d)"),
names_to=c("A_C", ".value"),
names_prefix = "R") %>%
rename("A1_C1_value" = "0",
"R1" = "1",
"R2" = "2",
"R3" = "3")
# # A tibble: 12 × 6
# ID A_C A1_C1_value R1 R2 R3
# <int> <chr> <dbl> <dbl> <dbl> <dbl>
# 1 1 A 10 1 1 0
# 2 1 B 15 0 1 0
# 3 1 C 30 1 0 0
# 4 2 A 25 1 0 0
# 5 2 B 30 0 1 1
# 6 2 C 25 1 0 1
# 7 3 A 40 0 1 0
# 8 3 B 15 0 1 0
# 9 3 C 10 0 0 1
# 10 4 A 25 1 1 0
# 11 4 B 10 0 1 1
# 12 4 C 30 1 0 0**
CodePudding user response:
You can do this pretty efficiently using data.table
:
library(data.table)
df1 <- data.table(df1)
df2 <- melt(df1, measure = patterns("[A-Z]1$", "1.1$", "1.2$", "1.3$"),
value.name = c("A1_C1_value", "R1", "R2", "R3"), variable.name = "A_C")
df2[, A_C := LETTERS[A_C]][order(ID)]
Producing:
ID A_C A1_C1_value R1 R2 R3
1: 1 A 10 1 1 0
2: 1 B 15 0 1 0
3: 1 C 30 1 0 0
4: 2 A 25 1 0 0
5: 2 B 30 0 1 1
6: 2 C 25 1 0 1
7: 3 A 40 0 1 0
8: 3 B 15 0 1 0
9: 3 C 10 0 0 1
10: 4 A 25 1 1 0
11: 4 B 10 0 1 1
12: 4 C 30 1 0 0