r - ggplot2 - geom_histogram from count

Time:02-19

What I have is a dataset with groups nested within other groups, like this:

    Year    Status     Group  N
    1992    Pre        Blue   7
    1993    Pre        Blue   2
    1995    Pre        Blue   11
    2002    Pre        Blue   10
    2003    Pre        Blue   8
    2006    Post       Green  7
    2007    Post       Green  14
    2008    Post       Green  13
    2009    Post       Green  9
    2010    Post       Green  7
    2011    Pre        Blue   3
    2011    Post       Green  2
    2012    Pre        Blue   2
    2012    Post       Green  4
    2013    Pre        Blue   5
    2013    Post       Green  2
    2014    Pre        Blue   4
    2014    Post       Green  10
    2015    Current    Blue   12
    2015    Post       Green  8
    2016    Current    Blue   3
    2016    Post       Green  5
    2017    Current    Blue   13
    2017    Post       Green  6
    2018    Pre        Blue   2
    2018    Post       Green  7
    2019    Pre        Blue   9
    2019    Post       Green  7
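
For anyone who wants to reproduce the plots, the table above can be read into a data frame with base R. This is only a sketch: the name `df` follows the code later in the thread, and `combos` is a hypothetical helper name used here to tabulate which Group/Status pairs actually occur in the sample.

```r
# Rebuild the question's table as a data frame called `df`
# (read.table skips the leading blank line and uses the first row as header).
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Year Status  Group N
1992 Pre     Blue  7
1993 Pre     Blue  2
1995 Pre     Blue  11
2002 Pre     Blue  10
2003 Pre     Blue  8
2006 Post    Green 7
2007 Post    Green 14
2008 Post    Green 13
2009 Post    Green 9
2010 Post    Green 7
2011 Pre     Blue  3
2011 Post    Green 2
2012 Pre     Blue  2
2012 Post    Green 4
2013 Pre     Blue  5
2013 Post    Green 2
2014 Pre     Blue  4
2014 Post    Green 10
2015 Current Blue  12
2015 Post    Green 8
2016 Current Blue  3
2016 Post    Green 5
2017 Current Blue  13
2017 Post    Green 6
2018 Pre     Blue  2
2018 Post    Green 7
2019 Pre     Blue  9
2019 Post    Green 7
")

# Cross-tabulate Group against Status to see which combinations exist
combos <- with(df, table(Group, Status))
print(combos)
```

In this sample every Green row has Status "Post", while Blue rows are either "Pre" or "Current", which matters for the color rules listed below.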

What I am trying to do is plot the count (column N) on the y-axis and Year on the x-axis, by Group.

  • Histogram from group Blue to be colored blue.
  • Histogram from group Green to be colored:
    • Grey when Status = Pre,
    • Green when Status = Current,
    • Yellow when Status = Post.
  • Add a label showing the count on top of each bar of the histogram.

Something like the plot below (please note that the colors in this example do not match the additional requirements mentioned above).

I am able to plot this using geom_bar, but I am not sure how to plot these counts by group as a histogram. Any help is much appreciated.

library(ggplot2)

# Dodged bars with heights taken directly from N, labelled with the count
ggplot(df, aes(Year, N)) +
  geom_bar(aes(fill=Group), position="dodge", stat="identity", width=.5) +
  geom_text(aes(label=N, group=Group), position=position_dodge(width=0.5), vjust=-0.5) +
  theme_bw(base_size=18) +
  ylab('Bedbugs') + xlab('Year')

enter image description here

CodePudding user response:

Using your data (let's call it df) and your color definitions, it seems only blue and yellow would appear, but presumably you have more data than this. You can approach it something like this:

library(dplyr)
library(ggplot2)

# Derive one color key per row from Group and Status
df <- df %>%
  mutate(color=case_when(
    Group=="Blue" ~ "Blue",
    Group=="Green" & Status=="Pre" ~ "Green/Pre",
    Group=="Green" & Status=="Current" ~ "Green/Current",
    Group=="Green" & Status=="Post" ~ "Yellow"))

# Names must match the keys produced by case_when() above
colors = c("Blue"="blue","Green/Pre" = "grey","Green/Current" = "green","Yellow" = "yellow")

ggplot(df, aes(Year, N, color=color, fill=color)) +
  geom_bar(position="dodge", stat="identity", width=.5) +
  geom_text(aes(label=N,group=Group), position=position_dodge(width=0.5), vjust=-0.5, color="black") +
  scale_color_manual(values=colors) +
  scale_fill_manual(values=colors) +
  theme_bw(base_size=18) +
  ylab('Count') + xlab('BedBugs')

CodePudding user response:

I think the main problem is that the histograms are overlapping. With position="identity" and bins=10 you get this:

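library(ggplot2)
library(dplyr)  # provides the %>% pipe

# Overlaid, semi-transparent histograms of N, one per Status
df %>% 
  ggplot(aes(x = N, fill = Status)) +
  geom_histogram(position = "identity", alpha = 0.3, bins = 10) +
  theme_bw()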

enter image description here
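
Note that the histogram above bins the values of N themselves. If the aim is instead bars whose heights are the pre-computed counts per Year, one possible sketch is to pass N through the `weight` aesthetic that `geom_histogram` understands, using one-year bins. The data frame `df_small` is a hypothetical name holding only the 2011 to 2014 rows from the question, to keep the example short; `binned` is likewise just an illustrative name.

```r
library(ggplot2)

# Subset of the question's rows (2011-2014), used only to keep the sketch short
df_small <- data.frame(
  Year  = c(2011, 2011, 2012, 2012, 2013, 2013, 2014, 2014),
  Group = c("Blue", "Green", "Blue", "Green", "Blue", "Green", "Blue", "Green"),
  N     = c(3, 2, 2, 4, 5, 2, 4, 10)
)

# `weight = N` makes each row contribute N to its bin instead of 1,
# so with one-year bins each bar's height is the stored count for that year.
p <- ggplot(df_small, aes(x = Year, weight = N, fill = Group)) +
  geom_histogram(binwidth = 1, position = "dodge") +
  theme_bw()

# Inspect the binned values ggplot2 computes for the histogram layer
binned <- layer_data(p, 1)

print(p)
```

With data that is already aggregated like this, the result is essentially the same picture as the `stat="identity"` bar chart from the question, so the bar-based approach is usually the simpler choice.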
