Wrapping text and increasing the space between factors on the y-axis in a ggplot bar chart in R?

Time:10-18

I have a figure where the factors on the y-axis are long category names. If I wrap them, they run together unless I make the font very small, and even then the names are a bit too close. I would like to give these category names some room by increasing the space between the y-axis factors, but I have not been able to find a way to do this. The bar size is fine; I just want the bars and ticks on the y-axis to be farther apart for better text spacing.

No wrap and font 10 is too much text, too little chart: nowrap_font10.png

Wrapping at 15 character width and font 10 overlaps text: wrap15_font10.png

Wrapping at 15 character width and font 5 makes small text that still isn't well spaced: wrap15_font5.png

Code:

library(tidyverse)
library(viridis)  # provides scale_fill_viridis()

df <- tribble(
~sjr_categories,                                               ~labels,               ~subgr_count,
"Sociology and Political Science",                             "control",             22,
"Sociology and Political Science",                             "nest",                2,
"Sociology and Political Science",                             "rank",                22,
"Sociology and Political Science",                             "mixed types",         15,
"Sociology and Political Science",                             "unclear definitions", 52,
"Political Science and International Relations",               "control",             22,
"Political Science and International Relations",               "nest",                2,
"Political Science and International Relations",               "rank",                8,
"Political Science and International Relations",               "mixed types",         2,
"Political Science and International Relations",               "unclear definitions", 34,
"Arts and Humanities (miscellaneous)",                         "control",             3,
"Arts and Humanities (miscellaneous)",                         "rank",                15,
"Arts and Humanities (miscellaneous)",                         "mixed types",         9,
"Arts and Humanities (miscellaneous)",                         "unclear definitions", 38,
"Social Sciences (miscellaneous)",                             "control",             4,
"Social Sciences (miscellaneous)",                             "nest",                5,
"Social Sciences (miscellaneous)",                             "rank",                9,
"Social Sciences (miscellaneous)",                             "mixed types",         10,
"Social Sciences (miscellaneous)",                             "unclear definitions", 34,
"Economics and Econometrics",                                  "control",             13,
"Economics and Econometrics",                                  "rank",                2,
"Economics and Econometrics",                                  "mixed types",         12,
"Economics and Econometrics",                                  "unclear definitions", 27,
"Geography, Planning and Development",                         "control",             9,
"Geography, Planning and Development",                         "rank",                2,
"Geography, Planning and Development",                         "mixed types",         2,
"Geography, Planning and Development",                         "unclear definitions", 18,
"Business and International Management",                       "control",             3,
"Business and International Management",                       "nest",                2,
"Business and International Management",                       "rank",                2,
"Business and International Management",                       "mixed types",         8,
"Business and International Management",                       "unclear definitions", 15,
"Computer Science Applications",                               "control",             4,
"Computer Science Applications",                               "mixed types",         5,
"Computer Science Applications",                               "other types",         3,
"Computer Science Applications",                               "unclear definitions", 18,
"Medicine (miscellaneous)",                                    "nest",                5,
"Medicine (miscellaneous)",                                    "rank",                8,
"Medicine (miscellaneous)",                                    "mixed types",         4,
"Medicine (miscellaneous)",                                    "unclear definitions", 12,
"Biochemistry, Genetics and Molecular Biology (miscellaneous)","nest",                5,
"Biochemistry, Genetics and Molecular Biology (miscellaneous)","rank",                11,
"Biochemistry, Genetics and Molecular Biology (miscellaneous)","mixed types",         2,
"Biochemistry, Genetics and Molecular Biology (miscellaneous)","unclear definitions", 8,
"Business, Management and Accounting (miscellaneous)",         "control",             7,
"Business, Management and Accounting (miscellaneous)",         "mixed types",         5,
"Business, Management and Accounting (miscellaneous)",         "unclear definitions", 11,
"Management of Technology and Innovation",                     "nest",                2,
"Management of Technology and Innovation",                     "unclear definitions", 19,
"Organizational Behavior and Human Resource Management",       "control",             8,
"Organizational Behavior and Human Resource Management",       "rank",                8,
"Organizational Behavior and Human Resource Management",       "unclear definitions", 5
)

df %>%
  ggplot(aes(x = subgr_count, y = fct_reorder(sjr_categories, subgr_count),
             fill = labels)) +
  geom_col() +
  theme_bw() +
  scale_fill_viridis(discrete = TRUE) +
  # wrap the long category names at 15 characters
  scale_y_discrete(labels = function(x) str_wrap(x, width = 15)) +
  theme(axis.text = element_text(size = 5)) +
  labs(x = "count", y = NULL)

ggsave(filename="wrap15_font5.png")

Answer:

Details on increasing the space between bars are in the second part of this answer.

I think this really comes down to your own preference for what looks best. Options include:

  1. Moving the legend to the bottom to give you more width for the y-axis text. You could also shrink the legend a bit by reducing the key sizes, which frees up a little more vertical space for the axis labels.
  2. Reducing the bar width so there is more white space between the axis text.
  3. Adjusting the size of the final plot. I've assumed that it needs to fit on an A4 portrait page, with allowance for margins and a figure title. This is probably the defining criterion for what is possible.
  4. Weighing the criteria and trade-offs you have clearly identified in the question.

Personally I would go for a much reduced bar width, following Tufte's principle of a high data-ink ratio: what matters is the relative length of each stacked bar, not its thickness. In the code below I've only reduced the bar width a tad, from the default 0.9 to 0.8.

library(tibble)
library(ggplot2)
library(viridis)
library(forcats)
library(stringr)


df |>
  ggplot(aes(x = subgr_count, y = fct_reorder(sjr_categories, subgr_count),
             fill = labels)) +
  geom_col(width = 0.8) +
  scale_fill_viridis(discrete = TRUE) +
  # a wider wrap is possible because the legend no longer takes up width
  scale_y_discrete(labels = function(x) str_wrap(x, width = 35)) +
  guides(fill = guide_legend(nrow = 2)) +
  theme_bw() +
  theme(legend.position = "bottom",
        legend.direction = "horizontal",
        legend.justification = "left",
        legend.text = element_text(size = 10),
        legend.key.size = unit(4, "mm"),
        axis.text = element_text(size = 10)) +
  labs(x = "count", y = NULL)

ggsave(filename = "wrap35_font10.png", width = 180, height = 200, units = "mm")

Created on 2022-10-16

Addendum: increasing the separation between bars

With regard to the core question, as pointed out in the OP's comment: how to increase the gap between the bars (to allow better separation between the y-axis labels). This can be achieved by adjusting the bar width and the output height of the plot.

To illustrate the point here is a minimal example:

On a discrete scale in ggplot, the discrete values are placed 1 unit apart. To increase the gap between horizontal bars you have a couple of options:

a) Reduce the bar width. This leaves the distance between bar centres unchanged for a given plot; it only increases the white space between the bars.
b) Increase the overall plot height. For a given number of bars, this increases the thickness of each bar, but also the distance between bars.

So to increase space between bars (and hence allow more room for wrapped text) increase the height of the plot and decrease the width of the bars.
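The trade-off between the two options can be sketched with some rough arithmetic. This is only an approximation, assuming the panel height is shared about equally between the categories (it ignores the small padding ggplot adds at each end of a discrete axis, and the panel is always somewhat shorter than the saved plot). The function names and the panel heights are made up for illustration.

```r
# Rough model (an assumption, not ggplot2's exact layout): the panel height is
# shared about equally between the categories, one "slot" per category.
slot_height_mm <- function(panel_height_mm, n_categories) {
  panel_height_mm / n_categories
}

# geom_col(width = w) fills the fraction w of each slot.
bar_thickness_mm <- function(panel_height_mm, n_categories, width = 0.9) {
  width * slot_height_mm(panel_height_mm, n_categories)
}

# White space left between neighbouring bars.
gap_mm <- function(panel_height_mm, n_categories, width = 0.9) {
  (1 - width) * slot_height_mm(panel_height_mm, n_categories)
}

# Three categories, two hypothetical panel heights, two bar widths.
scenarios <- expand.grid(panel_height_mm = c(100, 150), width = c(0.9, 0.1))
scenarios$slot_mm <- slot_height_mm(scenarios$panel_height_mm, 3)
scenarios$bar_mm  <- bar_thickness_mm(scenarios$panel_height_mm, 3, scenarios$width)
scenarios$gap_mm  <- gap_mm(scenarios$panel_height_mm, 3, scenarios$width)
print(scenarios)
```

In this model, changing `width` only moves space between the bar and the gap inside a slot, while changing the height scales the slot itself, which is what gives wrapped labels more room.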

Hopefully this is illustrated by these side-by-side plots, where everything is the same except the plot height.

df1 <- data.frame(var = c("Discrete value 1", "Discrete value 2 description", "Discrete value 3 even longer description"),
                  val = 1:3)

p1 <-
  ggplot(df1, aes(val, var)) +
  geom_col(width = 0.1) +
  scale_y_discrete(labels = function(x) str_wrap(x, width = 15)) +
  labs(x = NULL,
       y = NULL) +
  theme_bw() +
  theme(axis.text.y = element_text(size = 10))

ggsave(p1, filename="wrap15_font10_100x100.png", height = 100, width = 100, units = "mm")

ggsave(p1, filename="wrap15_font10_150x100.png", height = 150, width = 100, units = "mm")
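Rather than picking a height by trial and error, one could estimate it from the wrapped labels. The sketch below is an assumption-laden approximation: it uses base `strwrap()` as a stand-in for `str_wrap()` (the two can break lines at slightly different places), and the helper names, `line_spacing` and `overhead_mm` are invented for illustration rather than taken from ggplot2.

```r
# Convert a font size in points to millimetres (72.27 pt per inch, as in grid).
pt_to_mm <- function(pt) pt * 25.4 / 72.27

# Number of lines each label occupies after wrapping.
# base::strwrap() stands in for stringr::str_wrap() here.
count_wrapped_lines <- function(labels, width) {
  vapply(labels, function(x) length(strwrap(x, width = width)), integer(1),
         USE.NAMES = FALSE)
}

# Hypothetical helper: a plot height (mm) at which the tallest wrapped label
# should still fit in its slot. line_spacing and overhead_mm (room for the
# x-axis, margins, etc.) are guesses to be tuned by eye.
estimate_plot_height_mm <- function(labels, width, font_pt = 10,
                                    line_spacing = 1.2, overhead_mm = 20) {
  tallest_label_mm <- max(count_wrapped_lines(labels, width)) *
    pt_to_mm(font_pt) * line_spacing
  length(labels) * tallest_label_mm + overhead_mm
}

category_labels <- c("Discrete value 1",
                     "Discrete value 2 description",
                     "Discrete value 3 even longer description")
print(estimate_plot_height_mm(category_labels, width = 15, font_pt = 10))
```

The result is a lower bound to start from; it could be passed as `height` to `ggsave()` with `units = "mm"` and then adjusted.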

A hack

I don't really recommend this, but you could insert dummy discrete values between your actual values to force separation between the bars:

library(ggplot2)
library(stringr)
library(forcats)

df1 <- data.frame(var = fct_inorder(c("Discrete value 1", "gap1", "Discrete value 2 description", "gap2", "Discrete value 3 even longer description")),
                  val = c(1, 0, 2, 0, 3))


ggplot(df1, aes(val, var)) +
  geom_col(width = 0.9) +
  # blank out the labels of the dummy "gap" values, wrap the real ones
  scale_y_discrete(labels = function(x) ifelse(str_detect(x, "gap"), "", str_wrap(x, width = 15))) +
  labs(title = "A hack: include dummy discrete values between actual values",
       x = NULL,
       y = NULL) +
  theme_bw() +
  theme(axis.text.y = element_text(size = 10))
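For more than a handful of categories, typing the dummy values by hand gets tedious. A minimal sketch of generating them, assuming no real category name contains the text "gap"; `interleave_gaps()` is a made-up helper, not part of ggplot2 or forcats.

```r
# Hypothetical helper: put a dummy level ("gap1", "gap2", ...) between each
# pair of real levels. Assumes no real level contains the text "gap".
interleave_gaps <- function(levels) {
  n <- length(levels)
  if (n < 2) return(levels)
  out <- character(2 * n - 1)
  out[seq(1, 2 * n - 1, by = 2)] <- levels                      # odd slots: real levels
  out[seq(2, 2 * n - 2, by = 2)] <- paste0("gap", seq_len(n - 1))  # even slots: gaps
  out
}

real_levels <- c("Discrete value 1",
                 "Discrete value 2 description",
                 "Discrete value 3 even longer description")
print(interleave_gaps(real_levels))
```

The resulting vector could be used as the factor levels for the plotted variable, or (as far as I know) passed as `limits` to `scale_y_discrete()`, so that the dummy rows do not have to be added to the data itself.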

Created on 2022-10-16 with reprex v2.0.2