Breaking up a large ggplot by category; subset shows no errors but plots no data

Time:02-08

I have a very large dataset derived from a spreadsheet of the format below:

df = data.frame(name = c('Ger1', 'Ger2', 'Ger3', 'Ger4', 'Ger5', 'Ger6'),
                issued = c('UKS', 'USD', 'UKS', 'UKS', 'USD', 'USD'),
                mat = as.Date(c('2024-01-31', '2023-01-31', '2026-10-22', '2022-07-22', '2029-01-31', '2025-06-07')),
                volume = c(0.476, 0.922, 0.580, 1.259, 0.932, 0.417))

Currently, I plot all the data on one very long ggplot with the following code:

library(ggplot2)

chart1 <- ggplot(df) +
  geom_bar(stat="identity", aes(x=volume,y=name), fill="#1170aa") +
  theme(title=element_text(size=12), panel.background = element_rect(fill='white',color='black'), legend.position='right') +
  labs(title = "Total carriage by Volume on the day", x = "Volume", y = "Name")

That worked for a while, but the dataset has grown so large that a single chart is no longer practical. I'd therefore like to plot the data separately, based on the contents of the "issued" column.

I first thought about a conditional statement of the type:

if (df$issued == "UKS"){
  chart1 <- ggplot(df) +
    geom_bar(stat="identity", aes(x=volume,y=name), fill="#1170aa") +
    theme(title=element_text(size=12), panel.background = element_rect(fill='white',color='black'), legend.position='right') +
    labs(title = "Total carriage by Volume on the day", x = "Volume", y = "Name")
}

Unfortunately it didn't work (although on closer inspection my logic wasn't particularly well thought out).

I then tried the subset() function, hoping it would let me plot only the data meeting my requirements, like so:

chart1 <- ggplot(subset(df, 'issued' == "UKS")) +
  geom_bar(stat="identity", aes(x=volume,y=name), fill="#1170aa") +
  theme(title=element_text(size=12), panel.background = element_rect(fill='white',color='black'), legend.position='right') +
  labs(title = "Total carriage by Volume on the day", x = "Volume", y = "Name")

This code didn't raise any errors, but the chart it produced had no data on it at all. Does anyone have any ideas on how I can filter and plot this data?

CodePudding user response:

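You don't need to quote column names in subset(). With the quotes, 'issued' is a string literal rather than the column, so the condition is never true and no rows are kept.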
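A quick way to confirm this is to compare row counts for the quoted and unquoted conditions. The sketch below rebuilds a small data frame from the sample values in the question (the `mat` column is left out for brevity); `quoted` and `unquoted` are illustrative names only.

```r
# Small stand-in for the question's data (values copied from the sample)
df <- data.frame(
  name   = c("Ger1", "Ger2", "Ger3", "Ger4", "Ger5", "Ger6"),
  issued = c("UKS", "USD", "UKS", "UKS", "USD", "USD"),
  volume = c(0.476, 0.922, 0.580, 1.259, 0.932, 0.417)
)

# Quoted: compares the literal string "issued" with "UKS", giving a single FALSE
quoted <- subset(df, 'issued' == "UKS")

# Unquoted: evaluates the issued column row by row
unquoted <- subset(df, issued == "UKS")

nrow(quoted)    # 0
nrow(unquoted)  # 3
```

An empty data frame is still valid input for ggplot(), which is why the original call drew an empty chart instead of raising an error.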

ggplot(subset(df, issued == "UKS")) +
  geom_bar(stat="identity", aes(x=volume,y=name),fill="#1170aa") +
  theme(title=element_text(size=12),
        panel.background = element_rect(fill='white',color='black'),
        legend.position='right') +
  labs(title = "Total carriage by Volume on the day", x = "Volume", y = "Name")

Or filter the data the tidyverse way before plotting:

library(tidyverse)

df %>% filter(issued == "UKS") %>% 
  ggplot() +
  geom_bar(stat="identity", aes(x=volume,y=name),fill="#1170aa") +
  theme(title=element_text(size=12),
        panel.background = element_rect(fill='white',color='black'),
        legend.position='right') +
  labs(title = "Total carriage by Volume on the day", x = "Volume", y = "Name")
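If the goal is one chart for every value of "issued" rather than just "UKS", the same filtering idea can be applied in a loop. This is only a sketch under a few assumptions: it reuses the sample values from the question, and `by_issuer` and `plots` are hypothetical names, not anything from the original code.

```r
library(ggplot2)

# Sample values from the question
df <- data.frame(
  name   = c("Ger1", "Ger2", "Ger3", "Ger4", "Ger5", "Ger6"),
  issued = c("UKS", "USD", "UKS", "UKS", "USD", "USD"),
  volume = c(0.476, 0.922, 0.580, 1.259, 0.932, 0.417)
)

# One data frame per distinct value of issued
by_issuer <- split(df, df$issued)

# Build one chart per group, putting the group name in the title
plots <- lapply(names(by_issuer), function(key) {
  ggplot(by_issuer[[key]]) +
    geom_bar(stat = "identity", aes(x = volume, y = name), fill = "#1170aa") +
    labs(title = paste("Total carriage by Volume on the day:", key),
         x = "Volume", y = "Name")
})
names(plots) <- names(by_issuer)

plots[["UKS"]]  # display the chart for UKS only
```

Adding facet_wrap(~ issued, scales = "free_y") to the original single plot is another option if all the groups should appear side by side in one figure.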
