How to rename the bins in ggplot in R

Time:11-06

I have created bins and computed the mean of each bin, so I have these two columns in a data frame. Now I am plotting these two columns, but I want exact numbers as the x-axis labels instead of the bin names. I am considering renaming each bin by its midpoint. Please look at the pictures: the first one is my current plot and the second is the plot I want to achieve.

My current plot: (image not shown)
What I want to have: (image not shown)
My data frame looks like this: (image not shown)

CodePudding user response:

If you made the groups with cut (I assume you did), you can pull the lower and upper bound out of each bin label and compute their midpoint before you summarise and plot. Note that I made the regex fairly permissive because I don't know offhand whether cut always makes the left or the right end inclusive or exclusive.
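On the inclusive/exclusive question, a quick base R check shows the label shapes cut can produce. This is only a sketch; the toy vector and breaks are made up for illustration:

```r
x <- 1:10

# Default (right = TRUE): intervals are open on the left, closed on the right
right_closed <- levels(cut(x, breaks = c(0, 5, 10)))
print(right_closed)     # "(0,5]"  "(5,10]"

# right = FALSE flips this: closed on the left, open on the right
left_closed <- levels(cut(x, breaks = c(0, 5, 10), right = FALSE))
print(left_closed)      # "[0,5)"  "[5,10)"

# include.lowest = TRUE also closes the outer edge of the first interval
lowest_included <- levels(cut(x, breaks = c(1, 5, 10), include.lowest = TRUE))
print(lowest_included)  # "[1,5]"  "(5,10]"
```

So both round and square brackets can appear on either side, which is why a regex that accepts either bracket at each end is the safer choice.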

library(tidyverse)

#example like yours
mtcars |>
  mutate(grp = cut(hp, 10)) |>
  group_by(grp) |>
  summarise(mpg_mean = mean(mpg)) |>
  ggplot(aes(grp, mpg_mean)) +
  geom_point()


#solution
mtcars |>
  mutate(grp = cut(hp, 10)) |>
  extract(grp, 
          into = c("min", "max"), 
          remove = FALSE,
          regex = "(?:\\(|\\[)(.*),(.*)(?:\\)|\\])",
          convert = TRUE) |>
  mutate(mean_grp = (min + max)/2) |>
  group_by(mean_grp) |>
  summarise(mpg_mean = mean(mpg)) |>
  ggplot(aes(mean_grp, mpg_mean)) +
  geom_point()
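If you still have the raw values, you could also avoid parsing labels altogether by choosing the breaks yourself and looking up each bin's midpoint. A minimal sketch, assuming 10 equal-width bins over the range of hp; the names breaks, mids, mid_hp and binned are my own. Note that cut(hp, 10) widens the range slightly, so these bins will not match the ones above exactly:

```r
library(dplyr)
library(ggplot2)

# Assumption: 10 equal-width bins spanning the observed range of hp
breaks <- seq(min(mtcars$hp), max(mtcars$hp), length.out = 11)

# Midpoint of each bin: average of its lower and upper break
mids <- (head(breaks, -1) + tail(breaks, -1)) / 2

binned <- mtcars |>
  # labels = FALSE returns the bin index, which we use to look up the midpoint
  mutate(mid_hp = mids[cut(hp, breaks, labels = FALSE, include.lowest = TRUE)]) |>
  group_by(mid_hp) |>
  summarise(mpg_mean = mean(mpg))

ggplot(binned, aes(mid_hp, mpg_mean)) +
  geom_point()
```

Because mid_hp is numeric, ggplot uses a continuous x axis with ordinary numeric tick labels instead of the bin names.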

CodePudding user response:

To reproduce the style of the plot image you included, you can do:

library(tidyverse)

df %>%
  mutate(bin_group = gsub("\\(|\\]", "", bin_group)) %>%
  separate(bin_group, sep = ",", into = c("lower", "upper")) %>%
  mutate(across(lower:upper, as.numeric)) %>%
  mutate(`Birth weight (g)` = (upper + lower) / 2) %>%
  ggplot(aes(`Birth weight (g)`, mean_28_day_mortality)) +
  geom_vline(xintercept = 1500) +
  geom_point(shape = 18, size = 4) +
  scale_x_continuous(labels = scales::comma) +
  labs(title = "One-year mortality", y = NULL) +
  theme_bw(base_family = "serif", base_size = 20) +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor = element_blank(),
        panel.grid.major.y = element_line(color = "black", size = 0.5),
        plot.title = element_text(hjust = 0.5))



Data used (obtained from image in question using OCR)

df <- structure(list(bin_group = structure(1:10, 
        levels = c("(1.35e+03,1.38e+03]", 
        "(1.38e+03,1.41e+03]", "(1.41e+03,1.44e+03]", "(1.44e+03,1.47e+03]", 
        "(1.47e+03,1.5e+03]", "(1.5e+03,1.53e+03]", "(1.53e+03,1.56e+03]", 
        "(1.56e+03,1.59e+03]", "(1.59e+03,1.62e+03]", "(1.62e+03,1.65e+03]"
        ), class = "factor"), mean_28_day_mortality = c(0.0563498, 0.04886257, 
        0.04467626, 0.04256053, 0.04248667, 0.04009187, 0.03625538, 0.03455094, 
        0.03349542, 0.02892909)), class = c("tbl_df", "tbl", "data.frame"
        ), row.names = c(NA, -10L))
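
One caveat when parsing labels: cut rounds the bounds it prints (its dig.lab argument defaults to 3 significant digits), so a midpoint computed from a label is the midpoint of the rounded bounds, not necessarily of the exact breaks. A small base R sketch of the parsing step, using two labels in the same format as the data above; the variable names are my own:

```r
labels <- c("(1.35e+03,1.38e+03]", "(1.38e+03,1.41e+03]")

# Strip the brackets, then split each label into lower and upper bound
bounds <- strsplit(gsub("\\(|\\]", "", labels), ",")

# Midpoint of each bin from its (rounded) printed bounds
label_mids <- vapply(bounds, function(b) mean(as.numeric(b)), numeric(1))
print(label_mids)  # 1365 1395
```

If the exact breaks matter, passing a larger dig.lab to cut when creating the bins, or keeping the numeric breaks alongside the labels, avoids the rounding.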