How to use a labeled data frame with the list function instead of a ggplot2 function?

Time:09-28

I have been trying to find a way to plot a data frame with only two columns: one for the value and the other for the label. The plot should then have 3 different colors (one for each label). Here is part of my data frame:

dN     Label
0.0293 S
0.0273 S
0.0041 S
...
0.3070 E
0.3070 E
...

I use this data frame to create a box plot on its own. I have 3 data frames similar to the one above, and I plot them together using a multiplot function:

multiplot(dN_plot, dS_plot, omega_plot, cols=3)

This results in a box plot (image omitted).

That plot is fine; however, I need to change the order and I need to use other functions. In a previous post, another user helped me draw this box plot using this code:

list(all_dN, all_dS, all_omega) %>% 
  set_names(c("S", "M", "E")) %>% 
  map_dfr(bind_rows, .id = "df") %>% 
  pivot_longer(-df) %>%
  mutate(df = factor(df, unique(df))) %>%
  ggplot() +
  geom_boxplot(aes(x = name, y = value, color = "label"), 
               fill = "blue",
               color = "blue",
               alpha = 0.2,
               notch = T,
               notchwidth = 0.8) +
  facet_wrap(~df, nrow = 1)

I know that the above code works because I used it successfully to plot a very similar data frame. When I use it with my new data frame, however, I get this error:

Error: Can't combine `dN` <double> and `label` <character>.
Run `rlang::last_error()` to see where the error occurred.

I suppose the problem is the label column, or the fact that each data frame has only two columns. My question is: is there a way to fix that error while still using the list function, or do I need to change set_names? Any suggestions? Here is part of each data frame if you need to reproduce the error:

all_dN:
dN label
1   0.0293     S
2   0.0273     S
3   0.0041     M
4   0.0273     M
5   0.0041     M
6   0.0000     M
7   0.0276     S
8   0.0042     S
9   0.0000     S
10  0.0000     S
11  0.0281     E
12  0.0056     E
13  0.0015     S
14  0.0015     S
15  0.0015     S
16  0.0274     S
17  0.0071     S
18  0.0064     S
...
all_dS:
dS label
1   0.0757     S
2   0.0745     M
3   0.0085     M
4   0.0745     M
5   0.0109     M
6   0.0024     M
7   0.0741     S
8   0.0086     S
9   0.0000     S
10  0.0024     S
11  0.0798     E
12  0.0109     E
13  0.0048     E
14  0.0073     E
15  0.0049     S
16  0.0810     S
17  0.0170     S
18  0.0183     S
...
all_omega:
Omega label
1    0.3872     S
2    0.3668     M
3    0.4851     E
4    0.3668     S
5    0.3767     S
6   -1.0000     E
7    0.3730     S
8    0.4847     S
9   -1.0000     S
10  -1.0000     E
11   0.3521     E
12   0.5141     E
13   0.3078     S
14   0.2049     S
15   0.3076     S
16   0.3379     S
17   0.4189     S
18   0.3482     M

CodePudding user response:

In this case you don't need to use pivot_longer, since your data is already in long format. Rename the value column of each data frame (dN, dS, Omega) to one common name so that you can bind the data frames together.

library(tidyverse)

# Give every value column the same name so the rows can be stacked
list(all_dN %>% rename(value = dN), 
     all_dS %>% rename(value = dS), 
     all_omega %>% rename(value = Omega)) %>%
  set_names(c("S", "M", "E")) %>% 
  map_dfr(bind_rows, .id = "df")  %>%
  # Turn df and label into factors that keep their order of appearance
  mutate(across(c(df, label), ~factor(.x, unique(.x)))) %>% 
  ggplot() +
  geom_boxplot(aes(x = label, y = value, color = "label"), 
               fill = "blue",
               color = "blue",
               alpha = 0.2,
               notch = FALSE,
               notchwidth = 0.8) +
  facet_wrap(~df, nrow = 1, scales = 'free')


I changed notch = FALSE and added scales = 'free' in facet_wrap. Feel free to change them back according to your preference.
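
To see why the rename is enough, here is a minimal sketch on tiny made-up data frames. The names toy_dN and toy_dS, their values, and the list names dN and dS are illustrative assumptions, not the data from the question. Once the value columns share a name, binding gives one row per observation and no pivot is needed.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-ins for all_dN and all_dS: one numeric column plus `label`
toy_dN <- data.frame(dN = c(0.0293, 0.0041), label = c("S", "M"))
toy_dS <- data.frame(dS = c(0.0757, 0.0085), label = c("S", "M"))

# After the rename both data frames have the same columns,
# so stacking them produces a single `value` column
combined <- list(dN = toy_dN %>% rename(value = dN),
                 dS = toy_dS %>% rename(value = dS)) %>%
  map_dfr(bind_rows, .id = "df")

names(combined)
```

The combined data frame should contain only the columns df, value and label, which is already the long shape that geom_boxplot and facet_wrap expect.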

CodePudding user response:

Hard to tell without example data, but I think you should be fine if you replace

pivot_longer(-df) %>%

with

pivot_longer(-c(df, label)) %>%

That way pivot_longer only has to deal with numeric variables and should be happy ;)
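
A small sketch of this suggestion, again on made-up data (toy_dN, toy_dS and their values are hypothetical). It first shows the failing call, then the version that keeps label out of the pivot. Because each data frame contributes a different value column, the bound result has NA in the other columns; values_drop_na = TRUE is my own addition here to discard those rows and is not part of the suggestion above.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical stand-ins for all_dN and all_dS
toy_dN <- data.frame(dN = c(0.0293, 0.0041, 0.0281), label = c("S", "M", "E"))
toy_dS <- data.frame(dS = c(0.0757, 0.0085, 0.0798), label = c("S", "M", "E"))

bound <- list(dN = toy_dN, dS = toy_dS) %>%
  map_dfr(bind_rows, .id = "df")

# Pivoting everything except `df` mixes numeric columns with the
# character `label` column, which is what triggers the error
failed <- tryCatch(pivot_longer(bound, -df), error = function(e) e)

# Excluding `label` as well leaves only numeric columns to combine
long <- bound %>%
  pivot_longer(-c(df, label), values_drop_na = TRUE)
```

Here failed holds the error condition rather than a data frame, while long has one row per original observation with the columns df, label, name and value.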
