R - Modify ggplot columns by a specific variable and use such variable as a reference on the X-axis

Time:07-08

I have a dataset where each row corresponds to a country, while the other variables are:

  • quantity: average quantity per element in the country
  • elements: total number of elements in the country

The plot I would like to reproduce is the same as the one shown here, available on the Our World in Data website. Right now, I don't care too much about the arrows and the text. I would just like to have a barplot where the width of each column is proportional to the number of elements and where, at the same time, the X-axis shows not the country names but a running sum of my elements (which should reach the total at the end of the X-axis).

Here's what I've achieved so far:

library(tidyverse)

data <- data.frame("country" = c("Argentina", "Peru", "Bolivia", "French Guinea"),
                   "quantity" = c(1.7, 2.2, 4.4, 4.3),
                   "elements" = c(11030725, 1082704, 4827537, 12665))

ggplot() +
  geom_col(data = data,
           aes(x = reorder(x = country,
                           X = -quantity),
               y = quantity,
               width = elements/10000000)) +
  labs(x = "Country",
       y = "Average quantity")

I've managed to reorder the bars by the "quantity" variable, but I don't understand how to show the cumulative number of my elements (in steps of 10 million) on the X-axis. In addition, it also seems quite difficult to remove all the white space between the bars.

CodePudding user response:

One option would be to switch to geom_rect, which requires some data wrangling to compute the xmin and xmax of each bar:

library(tidyverse)

data <- data.frame("country" = c("Argentina", "Peru", "Bolivia", "French Guinea"),
                   "quantity" = c(1.7, 2.2, 4.4, 4.3),
                   "elements" = c(11030725, 1082704, 4827537, 12665))

data <- data |> 
  mutate(country = reorder(country, -quantity)) |>
  arrange(country) |> 
  mutate(xmax = cumsum(elements),
         xmin = lag(xmax, default = 0))

ggplot() +
  geom_rect(data = data,
            aes(xmin = xmin, xmax = xmax,
                ymin = 0, ymax = quantity, fill = country)) +
  labs(x = "Country", y = "Average quantity")
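
To also get the running total of elements on the X-axis, as asked in the question, the same rectangles can be combined with a continuous x scale. The following is only a sketch under a few assumptions that are not part of the answer above: the name `bars`, the tick spacing of 5 million, the "M" suffix on the labels, and the axis title are all illustrative choices.

```r
library(dplyr)
library(ggplot2)

data <- data.frame("country" = c("Argentina", "Peru", "Bolivia", "French Guinea"),
                   "quantity" = c(1.7, 2.2, 4.4, 4.3),
                   "elements" = c(11030725, 1082704, 4827537, 12665))

# Same wrangling as in the answer: order by descending quantity,
# then place each bar where the previous one ends
bars <- data |>
  mutate(country = reorder(country, -quantity)) |>
  arrange(country) |>
  mutate(xmax = cumsum(elements),
         xmin = lag(xmax, default = 0))

total <- sum(bars$elements)

p <- ggplot(bars) +
  geom_rect(aes(xmin = xmin, xmax = xmax,
                ymin = 0, ymax = quantity, fill = country)) +
  scale_x_continuous(
    breaks = seq(0, total, by = 5e6),            # assumed tick spacing
    labels = function(x) paste0(x / 1e6, "M"),   # show ticks in millions
    expand = c(0, 0)                             # no padding left or right of the bars
  ) +
  labs(x = "Cumulative number of elements", y = "Average quantity")

print(p)
```

Because each rectangle starts exactly where the previous one ends, there is no white space between the bars, and the right edge of the last bar equals the total number of elements. A country with very few elements (French Guinea here) will be drawn as an extremely thin bar.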
