Build identity matrix from dataframe (sparsematrix) in R

Time:03-12

I am trying to create a square matrix of pairwise distances from a data frame. The data frame looks like this:


i <- c("South Korea", "South Korea", "France", "France", "France")
j <- c("Rwanda", "France", "Rwanda", "South Korea", "France")
distance <- c(10844.6822, 9384, 6003, 9384, 0)
dis_matrix <- data.frame(i, j, distance)

dis_matrix

              i               j      distance
1   South Korea     South Korea        0.0000
2   South Korea          Rwanda    10844.6822
3   South Korea          France     9384.1793
4        France          Rwanda     6003.3498
5        France     South Korea     9384.1793
6        France          France        0.0000

I am trying to create a matrix that will look like this:

                South Korea           France         Rwanda     
South Korea               0        9384.1793     10844.6822
France            9384.1793                0      6003.3498
Rwanda           10844.6822        6003.3498              0

I have tried using sparseMatrix from the Matrix package, as described in "Create sparse matrix from data frame". The issue is that i and j have to be integers, and mine are character strings. I have not been able to find another function that does what I am looking for. I would appreciate any help. Thank you.
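One way around the integer requirement is to translate each country name into its position in a shared list of labels before calling sparseMatrix. The following is a minimal sketch, assuming the Matrix package is installed; the names countries and sp are introduced here for illustration only and do not come from the question.

```r
library(Matrix)

i <- c("South Korea", "South Korea", "France", "France", "France")
j <- c("Rwanda", "France", "Rwanda", "South Korea", "France")
distance <- c(10844.6822, 9384, 6003, 9384, 0)
dis_matrix <- data.frame(i, j, distance)

# Shared label set, so rows and columns use the same ordering
# (hypothetical helper name: countries)
countries <- unique(c(as.character(dis_matrix$i), as.character(dis_matrix$j)))

# match() converts each label to its integer position in `countries`
sp <- sparseMatrix(
  i = match(dis_matrix$i, countries),
  j = match(dis_matrix$j, countries),
  x = dis_matrix$distance,
  dims = c(length(countries), length(countries)),
  dimnames = list(countries, countries)
)

sp
```

Only the pairs listed in the data frame are filled in, so a pair given in one direction only (for example South Korea to Rwanda) is not mirrored automatically.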

CodePudding user response:

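A possible solution with tidyr::pivot_wider. Note that only the countries that appear in column i get a row, so Rwanda has no row here: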

tidyr::pivot_wider(dis_matrix, id_cols = i, names_from = j,
         values_from = distance, values_fill = 0)

#> # A tibble: 2 × 4
#>   i           Rwanda France `South Korea`
#>   <chr>        <dbl>  <dbl>         <dbl>
#> 1 South Korea 10845.   9384             0
#> 2 France       6003       0          9384

CodePudding user response:

You can use igraph::get.adjacency to create the desired matrix. You can also create a sparse matrix with sparse = TRUE.

library(igraph)

g <- graph.data.frame(dis_matrix, directed = FALSE)
get.adjacency(g, attr="distance", sparse = FALSE)

            South Korea France   Rwanda
South Korea        0.00   9384 10844.68
France          9384.00      0  6003.00
Rwanda         10844.68   6003     0.00

CodePudding user response:

We may convert the first two columns to factors whose levels are the unique values across both columns, and then use xtabs from base R:

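# collect every country name that occurs in either column
un1 <- unique(unlist(dis_matrix[1:2]))
# give both columns the same factor levels so the table comes out square
dis_matrix[1:2] <- lapply(dis_matrix[1:2], factor, levels = un1)
xtabs(distance ~ i + j, dis_matrix)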

-output

           j
i             South Korea   France   Rwanda
  South Korea        0.00  9384.00 10844.68
  France          9384.00     0.00  6003.00
  Rwanda             0.00     0.00     0.00
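The Rwanda row above is all zeros because the data frame lists Rwanda only in column j. If the distances can be treated as symmetric (an assumption, not something the data frame itself guarantees), a base R sketch that fills both directions could look like this; the names countries and m are hypothetical and introduced here for illustration.

```r
i <- c("South Korea", "South Korea", "France", "France", "France")
j <- c("Rwanda", "France", "Rwanda", "South Korea", "France")
distance <- c(10844.6822, 9384, 6003, 9384, 0)
dis_matrix <- data.frame(i, j, distance)

from <- as.character(dis_matrix$i)
to   <- as.character(dis_matrix$j)
countries <- unique(c(from, to))

# Start from an all-zero square matrix with country names on both dimensions
m <- matrix(0, nrow = length(countries), ncol = length(countries),
            dimnames = list(countries, countries))

# A two-column character matrix selects cells by (row name, column name)
m[cbind(from, to)] <- dis_matrix$distance
# Assumption: distance(a, b) equals distance(b, a), so mirror every pair
m[cbind(to, from)] <- dis_matrix$distance

m
```

If a pair appeared in both directions with different values, the mirrored assignment would overwrite the first one, so this only makes sense when the distances really are symmetric.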