How to create a bar visualization with the sum of several columns as the variables of the chart?

Time:04-06

I need to create a bar graph showing the total for each year, but the yearly values are stored in separate columns, one column per year.

The data are exports from country A to country B. I need to show the change from year to year (increase/decrease), so the idea is a chart with one bar per year on the x-axis and the USD value on the y-axis.

Product USD2016 USD2017 USD2018   
A        100     80      150   
B        150     160     10   
C        200     85      500    
D        125     30      350

I tried using summarise(), but it only lets me sum one variable, "USD2016":

library(dplyr)
library(ggplot2)

summary_sum <- exp %>%
  group_by(Product) %>%
  summarise(Suma = sum(USD2016, na.rm = TRUE))

ggplot(summary_sum, aes(Product, Suma)) +
  geom_bar(stat = "identity")

CodePudding user response:

For this type of graph, I think it would be better to work on a "long format" table. First create your data table:

library(tidyverse)

dat <- tribble(
  ~Product, ~USD2016, ~USD2017, ~USD2018,
  "A",      100,      80,       150,
  "B",      150,      160,      10,
  "C",      200,      85,       500,
  "D",      125,      30,       350)

Then "pivot" it into long format with tidyr::pivot_longer:

dat.m <- dat %>%
  pivot_longer(
    starts_with("USD"),   # Pivot only the "USD" cols
    names_to = "Year",    # Name of the "year" col
    names_prefix = "USD") # Remove prefix
# The amounts go into a column called "value" (the default for values_to)

And use the result to create a "stacked" barplot.

dat.m %>%
  ggplot(aes(Year, value, fill = Product)) +
  geom_col(color = "white") +   # shorter than using "geom_bar"
  theme_bw()

The result shows both the total exports for each year and each product's contribution to that total, so it may look a little busy if there are many products. In that case, just remove the "fill" aesthetic so that each bar shows only the yearly total.

Resulting stacked barplot
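
If only the yearly totals are wanted, one option is to sum the values per year before plotting instead of relying on stacking. The following is a minimal sketch under that assumption; it rebuilds the sample data so that it runs on its own, and the names `totals`, `Total`, and `p_totals` are illustrative and not part of the original answer.

```r
library(tidyverse)

# Sample data from the question
dat <- tribble(
  ~Product, ~USD2016, ~USD2017, ~USD2018,
  "A",      100,      80,       150,
  "B",      150,      160,      10,
  "C",      200,      85,       500,
  "D",      125,      30,       350)

# One row per year, holding the sum over all products
totals <- dat %>%
  pivot_longer(starts_with("USD"), names_to = "Year", names_prefix = "USD") %>%
  group_by(Year) %>%
  summarise(Total = sum(value, na.rm = TRUE))

# One bar per year, no "fill" aesthetic
p_totals <- ggplot(totals, aes(Year, Total)) +
  geom_col() +
  theme_bw()
print(p_totals)
```

For the sample data, `totals` contains 575, 355, and 1010 for 2016, 2017, and 2018.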

CodePudding user response:

A slightly different tidyverse approach:

# df is the wide data frame from the question
df %>%
  pivot_longer(-1, names_to = "year", names_pattern = ".*([0-9]{4})$") %>%
  ggplot(aes(x = year, y = value, fill = Product)) +
  stat_summary(fun = "sum", geom = "col", position = "stack")
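
The question also asks for the increase or decrease between years, which a chart of totals shows only indirectly. Here is a rough sketch of one way to plot the change itself, assuming the same sample data and that "change" means the difference from the previous year's total. The names `changes`, `total`, `change`, and `p_changes` are illustrative and do not come from the original posts.

```r
library(tidyverse)

# Sample data from the question
df <- tribble(
  ~Product, ~USD2016, ~USD2017, ~USD2018,
  "A",      100,      80,       150,
  "B",      150,      160,      10,
  "C",      200,      85,       500,
  "D",      125,      30,       350)

changes <- df %>%
  pivot_longer(-Product, names_to = "year", names_pattern = ".*([0-9]{4})$") %>%
  group_by(year) %>%
  summarise(total = sum(value, na.rm = TRUE)) %>%
  arrange(year) %>%
  mutate(change = total - lag(total)) %>%  # difference from the previous year
  filter(!is.na(change))                   # the first year has nothing to compare with

# Bars above zero are increases, bars below zero are decreases
p_changes <- ggplot(changes, aes(x = year, y = change, fill = change > 0)) +
  geom_col(show.legend = FALSE) +
  theme_bw()
print(p_changes)
```

With the sample data this gives a change of -220 for 2017 and 655 for 2018.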
  
