Home > front end >  Heatmap of Gene intensity values in R
Heatmap of Gene intensity values in R

Time:09-27

I have data that look like this:

Gene HBEC-KT-01 HBEC-KT-02 HBEC-KT-03 HBEC-KT-04 HBEC-KT-05 Primarycells-02 Primarycells-03 Primarycells-04 Primarycells-05
BPIFB1 15726000000 15294000000 15294000000 14741000000 22427000000 87308000000 2.00E 11 1.04E 11 1.51E 11
LCN2 18040000000 26444000000 28869000000 30337000000 10966000000 62388000000 54007000000 56797000000 38414000000
C3 2.52E 11 2.26E 11 1.80E 11 1.80E 11 1.78E 11 46480000000 1.16E 11 69398000000 78766000000
MUC5AC 15647000 8353200 12617000 12221000 29908000 40893000000 79830000000 28130000000 69147000000
MUC5B 965190000 693910000 779970000 716110000 1479700000 38979000000 90175000000 41764000000 50535000000
ANXA2 14705000000 18721000000 21592000000 18904000000 22657000000 28163000000 24282000000 21708000000 16528000000

I want to make a heatmap like the following using R. I am following a paper and they quoted "Heat maps were generated with the ‘pheatmap’ package76, where correlation clustering distance row was applied". Here is their heatmap.

Heatmap using pheatmap in R

I want the same like this and I am trying to make one using R by following tutorials but I am new to R language and know nothing about R.

Here is my code.

df <- read.delim("R.txt", header=T, row.names="Gene")
df_matrix <- data.matrix(df)
pheatmap(df_matrix, 
     main = "Heatmap of Extracellular Genes",
     color = colorRampPalette(rev(brewer.pal(n = 10, name = "RdYlBu")))(10),
     cluster_cols = FALSE,
     show_rownames = F,
     fontsize_col = 10,
     cellwidth = 40,
     )

This is what I get.

Heatmap

When I try using clustering, I got the error.

pheatmap(
mat = df_matrix,
  scale = "row",
  cluster_column = F,
  show_rownames = TRUE,
  drop_levels = TRUE,
  fontsize = 5,
  clustering_method = "complete",
  main = "Hierachical Cluster Analysis"
)

Error in hclust(d, method = method) : 
NA/NaN/Inf in foreign function call (arg 10)

Can someone help me with the code?

CodePudding user response:

You can normalize the data using scale to archive a more uniform coloring. Here, the mean expression is set to 0 for each sample. Genes lower expressed than average have a negative z score:

library(tidyverse)
library(pheatmap)

data <- tribble(
  ~Gene, ~`HBEC-KT-01`, ~`HBEC-KT-02`, ~`HBEC-KT-03`, ~`HBEC-KT-04`, ~`HBEC-KT-05`, ~`Primarycells-03`, ~`Primarycells-04`, ~`Primarycells-05`,
  "BPIFB1", 1.5726e 10, 1.5294e 10, 1.5294e 10, 1.4741e 10, 2.2427e 10, 2e 11, 1.04e 11, 1.51e 11,
  "LCN2", 1.804e 10, 2.6444e 10, 2.8869e 10, 3.0337e 10, 1.0966e 10, 5.4007e 10, 5.6797e 10, 3.8414e 10,
  "C3", 2.52e 11, 2.26e 11, 1.8e 11, 1.8e 11, 1.78e 11, 1.16e 11, 6.9398e 10, 7.8766e 10,
  "MUC5AC", 15647000, 8353200, 12617000, 12221000, 29908000, 7.983e 10, 2.813e 10, 6.9147e 10,
  "MUC5B", 965190000, 693910000, 779970000, 716110000, 1479700000, 9.0175e 10, 4.1764e 10, 5.0535e 10,
  "ANXA2", 1.4705e 10, 1.8721e 10, 2.1592e 10, 1.8904e 10, 2.2657e 10, 2.4282e 10, 2.1708e 10, 1.6528e 10
)
data %>%
  mutate(across(where(is.numeric), scale)) %>%
  column_to_rownames("Gene") %>%
  pheatmap(
    scale = "row",
    cluster_column = F,
    show_rownames = FALSE,
    show_colnames = TRUE,
    treeheight_col = 0,
    drop_levels = TRUE,
    fontsize = 5,
    clustering_method = "complete",
    main = "Hierachical Cluster Analysis (z-score)",
  )

Created on 2021-09-26 by the reprex package (v2.0.1)

  • Related