Creating a Kendall correlation matrix

Time:12-16

I have data with 38 columns in total. A code sample of the data:

    df <- structure(
      list(
        Christensenellaceae = c(
          0.010484508, 0.008641566, 0.010017172, 0.010741488,
          0.1, 0.2, 0.3, 0.4,
          0.7, 0.8, 0.9, 0.1,
          0.3, 0.45, 0.5, 0.55
        ),
        Date = c(27, 27, 27, 27, 27, 27, 27, 27, 28, 28, 28, 28, 28, 28, 28, 28),
        Treatment = c(
          "Treatment 1", "Treatment 1", "Treatment 1", "Treatment 1",
          "Treatment 2", "Treatment 2", "Treatment 2", "Treatment 2",
          "Treatment 1", "Treatment 1", "Treatment 1", "Treatment 1",
          "Treatment 2", "Treatment 2", "Treatment 2", "Treatment 2"
        )
      ),
      class = "data.frame",
      # 16 rows, matching the length of each column
      row.names = c(NA, -16L)
    )

What I wish to do is create a Kendall correlation matrix (the data does not behave linearly) between the treatment types (10 in total, but 2 in the example) for every column except Treatment and Date. That gives 36 correlation matrices in total, each of size 10×10 (2×2 here).

This is my code:

    res2 <- cor(as.matrix(data), method = "kendall")

But I get the error:

    Error in cor(data, method = "kendall") : 'x' must be numeric

Is there any way to solve this? Thank you :)

CodePudding user response:

You can do this with a tidyverse approach: first do some data wrangling, then use correlate() from the corrr package to calculate the pairwise correlation for every combination of variables.

    library(corrr)
    library(tidyverse)

    df |>
      # Transform data into wide format
      pivot_wider(id_cols = Date,
                  names_from = Treatment,
                  values_from = -starts_with(c("Treatment", "Date"))) |>
      # Unnest lists inside each column
      unnest(cols = starts_with("Treatment")) |>
      # Remove Date from the columns
      select(-Date) |>
      # Correlate all columns using kendall
      correlate(method = "kendall")

    # A tibble: 2 x 3
    #  term        `Treatment 1` `Treatment 2`
    #  <chr>               <dbl>         <dbl>
    #1 Treatment 1        NA             0.546
    #2 Treatment 2         0.546        NA
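
For comparison, here is a minimal base R sketch of the same idea that loops over every measured column, so it may be easier to extend to the full 36 columns. It also shows the likely cause of the original error: as.matrix() on a data frame that contains a character column (Treatment) returns a character matrix, which cor() rejects. The helper name kendall_by_treatment is hypothetical, and the sketch assumes that observations are paired across treatments by row order and that every treatment has the same number of rows.

```r
# Example data from the question (16 rows, one measured column).
df <- data.frame(
  Christensenellaceae = c(0.010484508, 0.008641566, 0.010017172, 0.010741488,
                          0.1, 0.2, 0.3, 0.4, 0.7, 0.8, 0.9, 0.1,
                          0.3, 0.45, 0.5, 0.55),
  Date = rep(c(27, 28), each = 8),
  Treatment = rep(rep(c("Treatment 1", "Treatment 2"), each = 4), times = 2)
)

# The character column turns the whole matrix into character,
# which is why cor() reports "'x' must be numeric".
is.character(as.matrix(df))  # TRUE

# Hypothetical helper: Kendall correlation between treatments for one column.
kendall_by_treatment <- function(values, treatment) {
  # One numeric vector per treatment, kept in original row order.
  groups <- split(values, treatment)
  # Assumption: all treatments have the same number of observations,
  # so they can be paired row by row.
  stopifnot(length(unique(lengths(groups))) == 1)
  # Treatments become the columns of a numeric matrix.
  cor(do.call(cbind, groups), method = "kendall")
}

# Every column except the identifiers is treated as a measurement.
measure_cols <- setdiff(names(df), c("Date", "Treatment"))

# One correlation matrix per measured column, returned as a named list.
kendall_list <- lapply(df[measure_cols], kendall_by_treatment,
                       treatment = df$Treatment)

kendall_list$Christensenellaceae
```

On the example data this gives a 2×2 matrix whose off-diagonal value agrees with the 0.546 shown above. With 10 treatments, each list element would be a 10×10 matrix, provided the pairing assumption holds for your data.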