How to add a new element containing an equals sign to a character vector in R

Time:06-14

I have a character vector containing genes and their associated colors:

gene_colors<-c("protein_coding"="#1F78B4", "lncRNA"="#de08a0") 

I'm trying to go through another list of genes and add the gene with a random color if it's not already in the vector:

library(tidyverse)
library(randomcoloR)

for(gene in other_genes){ 
  if(!(gene %in% names(gene_colors))){
    temp<-paste0(gene, '=', randomColor(1))
  }
}

This is what's in other_genes:

 [1] "IG_C_gene"                          "IG_C_pseudogene"                   
 [3] "IG_J_gene"                          "IG_V_gene"                         
 [5] "IG_V_pseudogene"                    "lncRNA"                            
 [7] "miRNA"                              "misc_RNA"                          
 [9] "Mt_rRNA"                            "polymorphic_pseudogene"            
[11] "processed_pseudogene"               "protein_coding"                    

I tried paste0() and, before that, str_c(), but both return a single string such as "IG_C_gene=#ffd4bf". I want to pass gene_colors to a heatmap function, so I need the equals sign to sit outside the quotes, as it does in the existing entries of gene_colors, rather than being a character inside the string. Is there any way to do this?

CodePudding user response:

Assign the new colors by name. Indexing gene_colors with names it does not contain yet appends them as new named elements:

index <-  other_genes[!other_genes %in% names(gene_colors)]
gene_colors[index] <- randomColor(length(index))
gene_colors
        protein_coding                 lncRNA              IG_C_gene        IG_C_pseudogene
             "#1F78B4"              "#de08a0"              "#adc3ea"              "#6962c1"
             IG_J_gene              IG_V_gene        IG_V_pseudogene                  miRNA
             "#f2ab96"              "#86a3e8"              "#2fe07b"              "#b6f5f9"
              misc_RNA                Mt_rRNA polymorphic_pseudogene   processed_pseudogene
             "#215b82"              "#356ca3"              "#8098ce"              "#44c942"

Data:

other_genes <- c("IG_C_gene", "IG_C_pseudogene", "IG_J_gene", "IG_V_gene", "IG_V_pseudogene", 
"lncRNA", "miRNA", "misc_RNA", "Mt_rRNA", "polymorphic_pseudogene", 
"processed_pseudogene", "protein_coding")

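For readers without randomcoloR installed, the same name-based assignment can be sketched in base R. The helper random_hex() below is hypothetical (it is not part of any package) and is only an assumed stand-in for randomColor(); other_genes is shortened here for brevity.

```r
# Hypothetical stand-in for randomcoloR::randomColor(): n random hex colors.
random_hex <- function(n) {
  grDevices::rgb(stats::runif(n), stats::runif(n), stats::runif(n))
}

gene_colors <- c("protein_coding" = "#1F78B4", "lncRNA" = "#de08a0")
other_genes <- c("IG_C_gene", "lncRNA", "miRNA", "protein_coding")

# Genes that do not have a color yet.
missing_genes <- setdiff(other_genes, names(gene_colors))

# Assigning to names that are not present yet appends new named elements.
gene_colors[missing_genes] <- random_hex(length(missing_genes))

names(gene_colors)
```

Existing entries such as "lncRNA" keep their original color, because only the missing names are assigned.
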
CodePudding user response:

We may use ifelse instead of a loop. Note that this builds quoted strings rather than a named vector:

ifelse(!(other_genes %in% names(gene_colors)),
       paste0('"', other_genes, '"', '="', randomColor(length(other_genes)), '"'),
       other_genes)

Or just by assignment after creating a logical vector

i1 <- !(other_genes %in% names(gene_colors))
other_genes[i1] <- paste0('"', other_genes[i1], '"="', randomColor(sum(i1)), '"')

Or with sprintf

other_genes[i1] <- sprintf('"%s"="%s"', other_genes[i1], randomColor(sum(i1)))
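
It may help to keep in mind that in c("a" = "b") the equals sign is R syntax that attaches a name to an element; it is not stored in the data. The sketch below contrasts a pasted string with a named element built with setNames(), using the example color from the question.

```r
# A pasted string: one unnamed element whose value contains '='.
pasted <- paste0("IG_C_gene", "=", "#ffd4bf")
is.null(names(pasted))  # the string carries no names attribute

# A named element: the name and the value are stored separately.
named <- setNames("#ffd4bf", "IG_C_gene")

# Named elements can be appended to an existing named vector with c().
gene_colors <- c("protein_coding" = "#1F78B4", "lncRNA" = "#de08a0")
gene_colors <- c(gene_colors, named)
gene_colors[["IG_C_gene"]]
```
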

CodePudding user response:

I realized that the function I'm trying to use requires a named vector. Thanks to the accepted answer here, I found a solution that works by adding each color to the gene_colors vector with the gene name as its name:

gene_colors<-c("protein_coding"="#1F78B4", "lncRNA"="#de08a0")

for(gene in other_genes){ 
  if(!(gene %in% names(gene_colors))){
    gene_colors[gene]<-randomColor(1)
  }
}