Is there a way to automatically replace character values in the data in R?

Time:09-01

I want to add a color code for each group, based on the following group information. Rather than doing this manually, I want to build a new data frame by counting the groups (with a for loop or an if statement) and assigning a color to each group.

library(readxl)  # provides read_excel()

Sample_Info <- read_excel("poppulation.xlsx", sheet = 1, skip = 0)
Sample_Info <- as.data.frame(Sample_Info)
# The data just read into Sample_Info is the same as this data frame:
Sample_ID <- c("F", "M", "F1_1", "F1_2")
Population <- c("parent", "parent", "child", "child")
data.frame(Sample_ID, Population)

If the code works properly, you can see four samples and two populations, with each population appearing twice. Here I want to assign one color code to "parent" and another to "child".

# One index per distinct population; seq_along() is also safe for empty input
Population_Count <- seq_along(unique(Sample_Info$Population))
pal <- c("#E85C90", "#0D41E1", "#5CB270", "#432371", "#FBA949", "red", "#0de189", "#db69f5", "#2AA8F2")
# Draw one distinct color per population
Population_Color <- sample(pal, length(Population_Count), replace = FALSE)
# e.g. Population_Color == c("#2AA8F2", "#FBA949")

The draw is random, so your colors may differ from mine. At this point I have one color per group. What I want now is for each sample to receive the color of its group ("parent" or "child") automatically. For example, in the case above, I would like to get the following data.

SSS <- c("F","M", "F1_1", "F1_2")
CCC <- c("#2AA8F2","#2AA8F2","#FBA949","#FBA949")
Exam <- data.frame(SSS, CCC)
names(Exam) <- c("Sample_ID", "Population")
Exam

In other words, compared with the first data frame, a color code is inserted in place of "parent" and "child". The number of groups I will receive in the future is unknown, and doing this manually every time the samples and groups change would take a lot of time, so I want it done automatically. This step is part of a larger script I am writing. If anyone knows how to do this, please let me know.

Thanks in advance for your advice.

CodePudding user response:

Here is an option that allocates a colour from your palette to each group, regardless of the number of groups, as long as you have no more groups than colours:

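library(dplyr)  # provides %>% and left_join()

set.seed(10)  # makes the random colour draw reproducible
# Lookup table: one row per distinct population with its sampled colour
Color_df <- data.frame(colour = sample(pal, length(unique(Sample_Info$Population)), replace = FALSE),
                       Population = unique(Sample_Info$Population))

# Attach each sample's group colour by matching on Population
Sample_Info %>% left_join(Color_df, by = "Population")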

  Sample_ID Population  colour
1         F     parent #2AA8F2
2         M     parent #2AA8F2
3      F1_1      child #0D41E1
4      F1_2      child #0D41E1
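
If you would rather avoid the dplyr dependency, or want the colour to replace the Population column as in the question's Exam data frame, the same lookup can be sketched in base R with a named vector. This is only a minimal sketch: the helper name assign_group_colours is hypothetical (it is not part of any package), the data frame below is a stand-in for the Excel file, and the stop() check simply makes the "no more groups than colours" condition explicit.

```r
# Stand-in for the data read from the Excel file in the question.
Sample_Info <- data.frame(
  Sample_ID  = c("F", "M", "F1_1", "F1_2"),
  Population = c("parent", "parent", "child", "child")
)
pal <- c("#E85C90", "#0D41E1", "#5CB270", "#432371", "#FBA949",
         "red", "#0de189", "#db69f5", "#2AA8F2")

# Hypothetical helper: draw one colour per distinct group and return
# a vector of colours aligned with `groups`.
assign_group_colours <- function(groups, palette) {
  group_names <- unique(as.character(groups))
  if (length(group_names) > length(palette)) {
    stop("more groups than colours in the palette")
  }
  # Named vector: names are the groups, values are the sampled colours.
  lookup <- setNames(sample(palette, length(group_names)), group_names)
  # Index by group name to expand the lookup to one colour per sample.
  unname(lookup[as.character(groups)])
}

set.seed(10)  # only so that the random draw is repeatable
Exam <- data.frame(
  Sample_ID  = Sample_Info$Sample_ID,
  Population = assign_group_colours(Sample_Info$Population, pal)
)
Exam
```

Because the lookup is built from unique(groups), the helper does not need to know the number of groups in advance. The colours drawn still depend on the seed, so they will not necessarily match the ones shown above.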