I have a data frame (matrix) that looks like this:
C0| A | B | C
A | 1 | 2 | 3
B | 4 | 5 | 6
C | 7 | 8 | 9
I want to reshape this data frame into a new one with 3 columns (row name, column name, value), which should look like this:
A | A | 1
A | B | 2
A | C | 3
B | A | 4
B | B | 5
B | C | 6
C | A | 7
C | B | 8
C | C | 9
CodePudding user response:
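For matrices you might do the following (note that cbind() coerces everything to character, so V3 ends up as a character column):
cbind(rep(rownames(m), each=ncol(m)),  # each row name once per column
      rep(colnames(m), nrow(m)),       # column names cycle once per row
      matrix(t(m), ncol=1)) |> as.data.frame()  # t() makes the flattening go row by row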
# V1 V2 V3
# 1 A A 1
# 2 A B 2
# 3 A C 3
# 4 B A 4
# 5 B B 5
# 6 B C 6
# 7 C A 7
# 8 C B 8
# 9 C C 9
and for the data frame case, convert it to a matrix beforehand,
m <- as.matrix(df[-1]) |> `rownames<-`(df$C0)
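or try stack() (ordering by values reproduces the row-wise order here only because the example values increase row by row):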
stack(df[-1]) |> {\(.) .[order(.$values), ]}() |>
cbind(ind1=rep(df$C0, each=nrow(df))) |> subset(select=3:1)
# ind1 ind values
# 1 A A 1
# 4 A B 2
# 7 A C 3
# 2 B A 4
# 5 B B 5
# 8 B C 6
# 3 C A 7
# 6 C B 8
# 9 C C 9
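If the values are not already sorted row by row, a variant of the stack() idea that orders by row position instead of by value might look like the sketch below. The example data `df2` and the column name `ind1` are assumptions made for illustration; this is not code from the original answer.

```r
# Hypothetical example data whose values are NOT in increasing row-wise order
df2 <- data.frame(C0 = c("A", "B"), X = c(5, 1), Y = c(2, 8))

s <- stack(df2[-1])                              # values column by column, plus 'ind'
s$ind1 <- rep(df2$C0, times = ncol(df2) - 1)     # row labels repeat once per stacked column
# order by original row position, then by column (factor levels keep column order)
s <- s[order(match(s$ind1, df2$C0), s$ind), c("ind1", "ind", "values")]
```

Ordering on `match(s$ind1, df2$C0)` keeps the rows in their original sequence even when the labels themselves are not alphabetical.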
R >= 4.1 is required for the native pipe |> and the \(x) lambda shorthand used above.
Data:
m <- structure(c(1L, 4L, 7L, 2L, 5L, 8L, 3L, 6L, 9L), dim = c(3L,
3L), dimnames = list(c("A", "B", "C"), c("A", "B", "C")))
df <- structure(list(C0 = c("A", "B", "C"), A = c(1L, 4L, 7L), B = c(2L,
5L, 8L), C = c(3L, 6L, 9L)), class = "data.frame", row.names = c(NA,
-3L))
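For a matrix that is not square, and to keep the values numeric rather than character, the matrix approach can be sketched as a small helper. The function name `matrix_to_long`, the output column names, and the example matrix `m2` are hypothetical and chosen only for illustration.

```r
# Hypothetical helper: turn a named matrix into a long (row, col, value) data frame
matrix_to_long <- function(m) {
  data.frame(
    row   = rep(rownames(m), each = ncol(m)),   # each row name once per column
    col   = rep(colnames(m), times = nrow(m)),  # column names cycle once per row
    value = as.vector(t(m)),                    # transpose so flattening walks m row by row
    stringsAsFactors = FALSE
  )
}

# Assumed 2 x 3 example to show that non-square input works
m2 <- matrix(1:6, nrow = 2, byrow = TRUE,
             dimnames = list(c("A", "B"), c("X", "Y", "Z")))
long <- matrix_to_long(m2)
```

Building the result with data.frame() instead of cbind() avoids the coercion to a character matrix, so `value` keeps the type of the matrix entries.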
CodePudding user response:
If it is a matrix:
data.frame(as.table(m))
Var1 Var2 Freq
1 A A 1
2 B A 4
3 C A 7
4 A B 2
5 B B 5
6 C B 8
7 A C 3
8 B C 6
9 C C 9
And if it is a data.frame:
data.frame(as.table(as.matrix(data.frame(df, row.names = 'C0'))))
Var1 Var2 Freq
1 A A 1
2 B A 4
3 C A 7
4 A B 2
5 B B 5
6 C B 8
7 A C 3
8 B C 6
9 C C 9
NB: The data are borrowed from @jay.sf's answer.
Note that if you want the rows ordered by Var1, then:
with(a<-data.frame(as.table(m)), a[order(Var1, Var2), ])
Var1 Var2 Freq
1 A A 1
4 A B 2
7 A C 3
2 B A 4
5 B B 5
8 B C 6
3 C A 7
6 C B 8
9 C C 9
or simply:
data.frame(as.table(t(m)))[c(2,1,3)]
Var2 Var1 Freq
1 A A 1
2 A B 2
3 A C 3
4 B A 4
5 B B 5
6 B C 6
7 C A 7
8 C B 8
9 C C 9
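If the default names Var1, Var2 and Freq are not wanted, the same idea can be written with as.data.frame() and its responseName argument. This is only a sketch: the output names `row`, `col` and `value` are assumptions, and `m` is rebuilt here so the snippet is self-contained.

```r
# Same 3 x 3 matrix as in the question, rebuilt so the snippet runs on its own
m <- matrix(1:9, nrow = 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# t() makes the table unfold row by row; [c(2, 1, 3)] puts the row label first
long <- as.data.frame(as.table(t(m)), responseName = "value",
                      stringsAsFactors = FALSE)[c(2, 1, 3)]
names(long)[1:2] <- c("row", "col")   # hypothetical, more descriptive names
```

Setting stringsAsFactors = FALSE keeps the two label columns as plain character vectors instead of factors.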