Reshape binary data group by column and count

Time:10-21

I want to reshape a data.frame to a matrix in the following format:

Input:

   Typ1 Typ2 Maths Science English History
    1    1     1     1       1       1       
    0    1     0     1       0       0       
    1    0     1     0       0       0       

Output:

         Maths Science English History
   Typ1    2     1       1       1
   Typ2    1     2       1       1              

And a second version:

         Maths Science English History
   Typ1    1     1       1       1              
   Typ2    1     1       1       1              
   Typ2    0     1       0       0      
   Typ1    1     0       0       0

Answer:

For the first version, you can do:

`row.names<-`(rbind(sapply(df[-c(1:2)], function(x) sum(df[[1]] * x)),
                    sapply(df[-c(1:2)], function(x) sum(df[[2]] * x))),
              names(df)[1:2])
#>      Maths Science English History
#> Typ1     2       1       1       1
#> Typ2     1       2       1       1
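
The same counts can also be read as a matrix product: each cell is the sum over rows of a type flag times a subject flag. A minimal sketch of that idea, assuming the same `df` as in the reproducible data further down; the names `types`, `subjects`, and `counts` are only illustrative:

```r
# Same data as in the reproducible example in this answer
df <- data.frame(Typ1 = c(1L, 0L, 1L), Typ2 = c(1L, 1L, 0L),
                 Maths = c(1L, 0L, 1L), Science = c(1L, 1L, 0L),
                 English = c(1L, 0L, 0L), History = c(1L, 0L, 0L))

types    <- as.matrix(df[1:2])      # 3 x 2 matrix of type flags
subjects <- as.matrix(df[-(1:2)])   # 3 x 4 matrix of subject flags

# crossprod(x, y) computes t(x) %*% y, so cell [i, j] is
# sum(types[, i] * subjects[, j])
counts <- crossprod(types, subjects)
counts
```

This should extend to more than two type columns without writing one `sapply()` call per column, as long as the type columns are selected together into `types`.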

The second version is a bit harder, but you could do something like

df2 <- df[rep(seq(nrow(df)), times = (df$Typ1 == 1 & df$Typ2 == 1) + 1), -(1:2)]
df2 <- as.matrix(df2)
row.names(df2) <- names(df)[unlist(sapply(seq(nrow(df)), 
                                          function(x) which(df[x,1:2] == 1)))]

df2
#>      Maths Science English History
#> Typ1     1       1       1       1
#> Typ2     1       1       1       1
#> Typ2     0       1       0       0
#> Typ1     1       0       0       0
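
An alternative way to get the same long layout is to list every (row, type) pair whose flag is 1 and then pull the matching subject rows. This is only a sketch under the assumption that the type columns are the first two columns of `df`; the names `types`, `subjects`, `hits`, and `long` are illustrative:

```r
# Same data as in the reproducible example in this answer
df <- data.frame(Typ1 = c(1L, 0L, 1L), Typ2 = c(1L, 1L, 0L),
                 Maths = c(1L, 0L, 1L), Science = c(1L, 1L, 0L),
                 English = c(1L, 0L, 0L), History = c(1L, 0L, 0L))

types    <- as.matrix(df[1:2])
subjects <- as.matrix(df[-(1:2)])

# Transposing first makes which() walk the data row by row:
# "row" is then the type index and "col" is the original row of df
hits <- which(t(types) == 1, arr.ind = TRUE)

# One output row per set flag, labelled with the type it belongs to
long <- subjects[unname(hits[, "col"]), , drop = FALSE]
rownames(long) <- colnames(types)[unname(hits[, "row"])]
long
```

With this approach a row in which neither flag is set simply produces no output row, which may or may not be what you want for your data.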

Data in reproducible format

df <- structure(list(Typ1 = c(1L, 0L, 1L), Typ2 = c(1L, 1L, 0L), Maths = c(1L, 
 0L, 1L), Science = c(1L, 1L, 0L), English = c(1L, 0L, 0L), History = c(1L, 
 0L, 0L)), class = "data.frame", row.names = c(NA, -3L))

Created on 2022-10-20 with reprex v2.0.2
